Fedora 34: NVIDIA kernel module missing falling back to nouveau - GPU ignored

My 10-year-old Linux desktop runs Fedora with an NVIDIA GeForce GTX 660 Ti graphics card. For the driver, I've been using RPMFusion's akmods tool, which rebuilds the kernel modules at boot time when needed.

After an update last week, Fedora greeted me today with "nvidia kernel module missing, falling back to nouveau".

1st attempt: Update again

I wasn't sure what the issue was, so I ran the update again to rule out outdated package versions right away.

One thing I noticed during dnf upgrade was a warning about the dracut configuration.

/etc/dracut.conf:add_drivers+="hid-logitech-hidpp"

dracut: WARNING: <key>+=" <values> ": <values> should have surrounding white spaces!
dracut: WARNING: This will lead to unwanted side effects! Please fix the configuration file.

2nd attempt: Manually build the NVIDIA driver

Following a suggestion on the Fedora forum, I tried to force a rebuild of the drivers. The akmods command itself works; it is the `dracut` error that causes the problem.

$ sudo akmods --force && sudo dracut --force
Checking kmods exist for 5.15.14-100.fc34.x86_64           [  OK  ]
/etc/dracut.conf:add_drivers+="hid-logitech-hidpp"

dracut: WARNING: <key>+=" <values> ": <values> should have surrounding white spaces!
dracut: WARNING: This will lead to unwanted side effects! Please fix the configuration file.

/etc/dracut.conf: line 54: add_drivers: command not found

dracut is the tool that generates the Linux boot image (initramfs), which the NVIDIA kernel driver needs to be part of.
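To see whether a given kernel module actually ended up in the boot image, the file listing printed by lsinitrd (shipped with dracut) can be searched. The following is only a sketch: `listing_has_module` is a hypothetical helper name, and it assumes the listing ends each line with the file path and that module files are named `<module>.ko`, optionally with a compression suffix such as `.xz`.

```shell
# Hypothetical helper, not part of dracut.
# Reads a file listing on stdin (for example from lsinitrd) and succeeds
# if one of the paths is a kernel module file with the given name.
listing_has_module() {
    grep -qE "/$1\.ko(\.[a-z0-9]+)?\$"
}

# Example, using the module name from this post:
#   sudo lsinitrd | listing_has_module hid-logitech-hidpp && echo "included"
```

Without arguments, lsinitrd inspects the image of the running kernel, so this check is best run after booting into the kernel in question.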

I modified the dracut.conf file in 2015, when I ran into the problem that my wireless Logitech keyboard did not work with dm-crypt at boot.

# OWN
add_drivers+="hid-logitech-hidpp"
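Lines like this one can be spotted with a small grep check. This is a minimal sketch, assuming the `key+="value"` form shown in the warning; `check_dracut_spacing` is a hypothetical helper name, and the pattern only covers double-quoted values written on a single line.

```shell
# Hypothetical helper, not part of dracut.
# Prints (with line numbers) every uncommented key+="..." line whose
# quoted value does not both start and end with a space.
check_dracut_spacing() {
    grep -nE '^[[:space:]]*[a-z_]+\+="([^ "][^"]*|[^"]*[^ "])"' "$@"
}

# Example:
#   check_dracut_spacing /etc/dracut.conf
```

The likely reason dracut insists on the spaces is that `+=` appends to a space-separated list, so values from several configuration files would otherwise run together into one word.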

3rd attempt: dracut add_drivers needs surrounding spaces

For an unknown reason, the surrounding white space in the value had been stripped from the file. It seems that last week's dracut update now detects this properly and throws an error.

The updated configuration fixes the error. :-)

$ sudo vim /etc/dracut.conf

-add_drivers+="hid-logitech-hidpp"
+add_drivers+=" hid-logitech-hidpp "

$ sudo dracut --force
$ echo $?
0

This fixed the dracut warning, but unfortunately the NVIDIA driver still failed to load after a reboot.

4th attempt: Syslog and NVIDIA driver ignoring GPU

$ less /var/log/messages

Jan 16 18:16:42 imagine kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 234
Jan 16 18:16:42 imagine kernel: NVRM: The NVIDIA GeForce GTX 660 Ti GPU installed in this system is#012NVRM:  supported through the NVIDIA 470.xx Legacy drivers. Please#012NVRM:  visit http://www.nvidia.com/object/unix.html for more#012NVRM:  information.  The 495.46 NVIDIA driver will ignore#012NVRM:  this GPU.  Continuing probe...
Jan 16 18:16:42 imagine kernel: NVRM: No NVIDIA GPU found.
Jan 16 18:16:42 imagine kernel: nvidia-nvlink: Unregistered the Nvlink Core, major device number 234
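The relevant lines are easy to miss in a full log, so a small filter helps. This is a sketch only: `nvrm_messages` is a hypothetical helper name, and it assumes the `#012` sequences are the syslog daemon's escaped newlines, as they appear in the excerpt above.

```shell
# Hypothetical helper, not part of any NVIDIA or Fedora tooling.
# Prints the NVRM kernel messages from the given log files (or stdin),
# replacing the "#012" newline escapes with a space for readability.
nvrm_messages() {
    grep -h 'NVRM:' "$@" | sed 's/#012/ /g'
}

# Example (run as a user who may read the log):
#   nvrm_messages /var/log/messages
```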

The log message means that the GeForce GTX 660 Ti graphics card is now officially a legacy card and needs a legacy driver. According to RPMFusion's wiki, this just requires installing a different package. The new package conflicts with the existing driver packages, though, so review the installed packages, purge all 495 versions, and then install the 470 version.

$ sudo dnf install xorg-x11-drv-nvidia-470xx akmod-nvidia-470xx

Last metadata expiration check: 0:39:42 ago on Sun 16 Jan 2022 05:52:38 PM CET.
Error: 
 Problem 1: package kmod-nvidia-5.15.14-100.fc34.x86_64-3:495.46-1.fc34.x86_64 requires nvidia-kmod-common >= 3:495.46, but none of the providers can be installed
  - package xorg-x11-drv-nvidia-470xx-3:470.94-1.fc34.x86_64 conflicts with xorg-x11-drv-nvidia provided by xorg-x11-drv-nvidia-3:495.46-1.fc34.x86_64
  - conflicting requests
  - problem with installed package kmod-nvidia-5.15.14-100.fc34.x86_64-3:495.46-1.fc34.x86_64
 Problem 2: package kmod-nvidia-5.15.12-100.fc34.x86_64-3:495.46-1.fc34.x86_64 requires nvidia-kmod-common >= 3:495.46, but none of the providers can be installed
  - package xorg-x11-drv-nvidia-470xx-3:470.94-1.fc34.x86_64 conflicts with xorg-x11-drv-nvidia provided by xorg-x11-drv-nvidia-3:495.46-1.fc34.x86_64
  - package akmod-nvidia-470xx-3:470.94-1.fc34.x86_64 requires nvidia-470xx-kmod-common >= 3:470.94, but none of the providers can be installed
  - conflicting requests
  - problem with installed package kmod-nvidia-5.15.12-100.fc34.x86_64-3:495.46-1.fc34.x86_64
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages)
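Before removing anything, it helps to list exactly which installed packages belong to the conflicting driver series. The sketch below works on "name version" pairs as rpm can print them; `packages_in_series` is a hypothetical helper name, and matching on the version prefix is an assumption that fits the package versions shown in this post.

```shell
# Hypothetical helper, not part of rpm or dnf.
# Reads "name version" pairs on stdin and prints the names of packages
# whose version starts with the given driver series followed by a dot.
packages_in_series() {
    awk -v series="$1" 'index($2, series ".") == 1 { print $1 }'
}

# Example:
#   rpm -qa --queryformat '%{NAME} %{VERSION}\n' '*nvidia*' | packages_in_series 495
```

Filtering on the version field rather than the package name avoids accidental matches on packages that merely contain the digits in their name.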

Purge the 495xx driver packages.

$ rpm -qa | grep nvidia

$ sudo dnf remove '*nvidia*495*'
Dependencies resolved.
========================================================================================================================
 Package                                   Architecture Version                  Repository                        Size
========================================================================================================================
Removing:
 akmod-nvidia                              x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates        22 k
 kmod-nvidia                               x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates         0  
 kmod-nvidia-5.14.11-200.fc34.x86_64       x86_64       3:495.46-1.fc34          @@commandline                     45 M
 kmod-nvidia-5.14.16-201.fc34.x86_64       x86_64       3:495.46-1.fc34          @@commandline                     45 M
 kmod-nvidia-5.14.9-200.fc34.x86_64        x86_64       3:495.46-1.fc34          @@commandline                     45 M
 kmod-nvidia-5.15.12-100.fc34.x86_64       x86_64       3:495.46-1.fc34          @@commandline                     45 M
 kmod-nvidia-5.15.14-100.fc34.x86_64       x86_64       3:495.46-1.fc34          @@commandline                     45 M
 nvidia-settings                           x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates       4.6 M
 nvidia-xconfig                            x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates       192 k
 xorg-x11-drv-nvidia                       x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates        56 M
 xorg-x11-drv-nvidia-cuda-libs             x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates       136 M
 xorg-x11-drv-nvidia-kmodsrc               x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates        26 M
 xorg-x11-drv-nvidia-libs                  i686         3:495.46-1.fc34          @rpmfusion-nonfree-updates        77 M
 xorg-x11-drv-nvidia-libs                  x86_64       3:495.46-1.fc34          @rpmfusion-nonfree-updates       336 M
Removing unused dependencies:
 egl-wayland                               x86_64       1.1.7-1.fc34             @updates                          58 k
 libglvnd-gles                             i686         1:1.3.3-1.fc34           @updates                          97 k
 libglvnd-opengl                           i686         1:1.3.3-1.fc34           @updates                         137 k

Transaction Summary
========================================================================================================================
Remove  17 Packages

Freed space: 863 M
Is this ok [y/N]: 

Install the 470xx driver packages.

$ sudo dnf install xorg-x11-drv-nvidia-470xx akmod-nvidia-470xx 
Last metadata expiration check: 0:48:02 ago on Sun 16 Jan 2022 05:52:38 PM CET.
Dependencies resolved.
========================================================================================================================
 Package                                  Architecture  Version                  Repository                        Size
========================================================================================================================
Installing:
 akmod-nvidia-470xx                       x86_64        3:470.94-1.fc34          rpmfusion-nonfree-updates         27 k
 xorg-x11-drv-nvidia-470xx                x86_64        3:470.94-1.fc34          rpmfusion-nonfree-updates         19 M
Installing dependencies:
 egl-wayland                              x86_64        1.1.7-1.fc34             updates                           32 k
 libglvnd-gles                            i686          1:1.3.3-1.fc34           updates                           31 k
 libglvnd-opengl                          i686          1:1.3.3-1.fc34           updates                           44 k
 nvidia-settings-470xx                    x86_64        3:470.94-1.fc34          rpmfusion-nonfree-updates        1.7 M
 xorg-x11-drv-nvidia-470xx-kmodsrc        x86_64        3:470.94-1.fc34          rpmfusion-nonfree-updates         24 M
 xorg-x11-drv-nvidia-470xx-libs           i686          3:470.94-1.fc34          rpmfusion-nonfree-updates         23 M
 xorg-x11-drv-nvidia-470xx-libs           x86_64        3:470.94-1.fc34          rpmfusion-nonfree-updates        143 M

Transaction Summary
========================================================================================================================
Install  9 Packages

Total size: 210 M
Total download size: 107 k
Installed size: 479 M
Is this ok [y/N]: 

The wiki also notes that future Fedora releases (35+) use the new 495xx driver, which means my setup has to stay on the 470xx legacy driver for the time being.

Conclusion

If your graphics card is too old, the NVIDIA driver logs an "unsupported GPU" message to syslog at boot time, but the nouveau fallback popup does not mention this detail. Make sure to check the syslog first.