Applying kernel patch 118844-28 kills Nvidia module

02-21-06, 03:27 PM
Just to let everyone know, I'm new to Solaris.

Anyway, at school we just got a shipment of ten Ultra 20s and I've been setting them up; well, I've been working on just one so far. I used Sun's Update Manager to apply all the patches, finally applied kernel patch 118844-28, rebooted, and found that X failed to start.

I'm used to Linux, so I thought, "Oh yeah, it just needs a new Nvidia module." So I downloaded the latest one and ran the installer: no errors. After another reboot, still the same. I looked through the dmesg output and saw no errors, though I noticed that with the latest module there is no message about the module loading. modinfo says it's there.

I did find a forum posting someplace about using update_drv, but that doesn't work since that entry is already in /etc/driver_aliases. I set up another Ultra 20 with the latest module, but without the new kernel, and it works fine, so I guess it's the kernel?

Any ideas?

02-21-06, 08:07 PM
I have the same problem. I tried taking the nvidia* entries out of the /etc/driver_aliases file, but since I boot with GRUB, the driver is included in the boot_archive, and I don't know how to take it out of there. As a last resort, I am now downloading the 1/06 Solaris release from Sun to "upgrade" the system. I hope that fixes it. :(

02-22-06, 04:14 AM
I'm as new to Linux as you are to Solaris, so I have no solutions. However, your mention of solving a problem like this by removing the nvidia module caught my eye. I have had the same problem after updating the kernel in SuSE in the past, so I have been putting off doing it again until I understand how to deal with it. I have never installed the Nvidia driver on any of these installations, though, because the installer asks for a newer kernel than I have. When I run rmmod, it says that no such module exists, so obviously it cannot be removed, which suggests that something else is the cause. I have also used the SuSE installation repair function to repair and reinstall GRUB, but that did not solve the problem either. Though you are still seeking a solution for Solaris, perhaps you could shed some light on Linux?

02-26-06, 07:51 AM
The NVidia driver works for me with that kernel; however, my machines are the W2100z variety.

Check that the driver is loading, e.g.:

$ dmesg | grep -i nvidia
Feb 24 16:03:45 pigpen pseudo: [ID 129642 kern.info] pseudo-device: nvidia255
Feb 24 16:03:45 pigpen genunix: [ID 936769 kern.info] nvidia255 is /pseudo/nvidia@255
Feb 24 16:05:51 pigpen pcplusmp: [ID 637496 kern.info] pcplusmp: pci10de,18a (nvidia) instance 0 vector 0x10 ioapic 0x2 intin 0x10 is bound to cpu 1
Feb 24 16:05:51 pigpen pci_pci: [ID 370704 kern.info] PCI-device: display@0, nvidia0
Feb 24 16:05:51 pigpen genunix: [ID 936769 kern.info] nvidia0 is /pci@5,0/pci1022,7455@1/display@0
Feb 24 16:06:22 pigpen pseudo: [ID 129642 kern.info] pseudo-device: nvidia255
Feb 24 16:06:22 pigpen genunix: [ID 936769 kern.info] nvidia255 is /pseudo/nvidia@255

If not, did you do a reconfiguration reboot, i.e. reboot -- -r?

Check that you're using NVidia's driver, not Xorg's nv driver. In the X config, typically /etc/X11/xorg.conf, check that your driver looks something like:

Section "Device"
    Identifier "Quadro4"
    Driver "nvidia"
    VendorName "nVidia Corporation"
    BoardName "NV18GL [Quadro4 NVS AGP 8x]"
    BusID "PCI:9:0:0"
    Option "CursorShadow" "true"
EndSection
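
A quick way to see which driver the config actually names is to pull the Driver lines out of the file. This is only a sketch: configured_drivers is a made-up helper name, it assumes a grep that understands POSIX character classes, and it will also list the Driver lines of any input device sections.

```shell
# Hypothetical helper (not part of any NVIDIA or Xorg tooling):
# print every Driver line found in the X config file given as $1.
configured_drivers() {
    grep -i '^[[:space:]]*Driver[[:space:]]' "$1"
}

# Possible usage, assuming the typical config location:
#   configured_drivers /etc/X11/xorg.conf
```

If the video device shows "nv" rather than "nvidia", X is using Xorg's own driver.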

In your Xorg log file (typically /var/log/Xorg.0.log) you should see something like:

(II) LoadModule: "nvidia"
(II) Loading /usr/X11/lib/modules/drivers/nvidia_drv.so
(II) Module nvidia: vendor="NVIDIA Corporation"
    compiled for 4.0.2, module version = 1.0.8178
    Module class: X.Org Video Driver

The Xorg log file also records errors; those messages are marked with a leading "(EE)".
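
To list just the errors, something like the following should do. It is a rough sketch: xorg_errors is a made-up name, and it assumes the "(EE)" marker sits at the very start of the line, as the "(II)" markers do in the excerpt above.

```shell
# Hypothetical helper: print only the error lines from the Xorg log given as $1.
# In a basic regular expression the parentheses are literal characters.
xorg_errors() {
    grep '^(EE)' "$1"
}

# Possible usage, assuming the typical log location:
#   xorg_errors /var/log/Xorg.0.log
```

No output means the log contains no lines starting with "(EE)".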


02-28-06, 09:42 AM
Happened to me early last month when installing 118844-27.

I too installed the latest drivers, with no success.

The resolution was in the NVidia drivers/sx86/video/readme.txt file on the supplemental CD that shipped with the Ultra 20.

The configuration reboot after the kernel patch assigned the board to the next device instance. This is not the exact error reported in the readme, but the effect was the same.

This bug is not fixed (per the readme), but the workaround was to remove the "nvidia" entries from /etc/path_to_inst and then perform a configuration reboot.

X came up fine after reboot.
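
For anyone wanting to try the same workaround, the filtering step might look roughly like this. Treat it as a sketch, not a tested procedure: without_nvidia_entries is a made-up name, and it assumes each entry in /etc/path_to_inst carries the driver name in double quotes. The readme on the supplemental CD is the authority here.

```shell
# Hypothetical helper: print the path_to_inst-style file given as $1
# with every line mentioning the "nvidia" driver left out.
# It writes to stdout only and never modifies the input file.
without_nvidia_entries() {
    grep -v '"nvidia"' "$1"
}

# Possible usage, as root, keeping a backup of the original first:
#   cp /etc/path_to_inst /etc/path_to_inst.orig
#   without_nvidia_entries /etc/path_to_inst.orig > /etc/path_to_inst
#   reboot -- -r
```

Keeping the .orig copy means the original file can be put back if the system does not come up cleanly.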