Problems with GeForce 4 on Mdk 9.0


Newton
12-05-02, 02:05 PM
Hi,
I've been trying to start X Windows with the NVidia drivers (kernel and GLX) without success.
I've tried both the RPM packages and the source files. Everything installs or compiles without problems and the NVidia module is loaded into the kernel...
However, when I execute startx I receive the following error messages:
(EE) NVIDIA(0): Failed to initialize the NVidia kernel module!
(EE) NVIDIA(0): xxx Aborting xxx
(EE) Screen(s) found, but none have a suitable configuration.

Can anyone suggest what I should try next?? :confused:

My screen is a SAMSUNG SyncMaster 753DFX.

Thank you

bwkaz
12-05-02, 10:24 PM
Does running an /sbin/modprobe NVdriver as root before you startx help? If so, then it looks like the problem is with the automatic module loading setup, probably the alias in modules.conf. But try it first.

If it does work, then what does test -c /dev/.devfsd ; echo $? tell you?

Newton
12-06-02, 05:07 PM
Hi,
thank you for the suggestions. :p
The NVdriver is loaded - apparently without problems, except for a "kernel tainted" warning - and remains loaded after the X server aborts...
[so, your first suggestion, the modprobe NVdriver, ran without problems]
Regarding your second suggestion, the file .devfs also exists [the output of test -c /dev/.devfs; echo $? is 0]

I've looked at modules.conf and also at modules.devfs and both have a line "alias /dev/nvidia* NVdriver". As modules.devfs includes modules.conf, does this mean that the system tries to load NVdriver two times? Can this cause an error?

Finally, I include some lines from XFree86.0.log that may give a clue to the experts, and I am attaching the complete file:

(**) NVIDIA(0): Depth 8, (--) framebuffer bpp 8
(==) NVIDIA(0): Default visual is PseudoColor
(==) NVIDIA(0): Using gamma correction (1.0, 1.0, 1.0)
(--) NVIDIA(0): Linear framebuffer at 0xC0000000
(--) NVIDIA(0): MMIO registers at 0xCE000000
(EE) NVIDIA(0): Failed to initialize the NVdriver kernel module!
(EE) NVIDIA(0): *** Aborting ***
(II) UnloadModule: "nvidia"
(II) UnloadModule: "vgahw"
(II) Unloading /usr/X11R6/lib/modules/libvgahw.a
(EE) Screen(s) found, but none have a usable configuration.

Can I increase the verbosity level of startx or debug the loading of X to obtain any further information that could help solve the problem?

Thanks

Newton
12-06-02, 05:18 PM
I forgot to attach the XFree86.0.log file...

antonius_r3
12-06-02, 07:49 PM
Hi, I'm a newbie in Linux.
I just installed Linux Mandrake 8.1 and dual-boot it with my current Windows.

I tried to change my resolution from the CONTROL CENTER > HARDWARE, but I couldn't find my Video Card listed.

By the way, these are the specs of my computer:

Athlon XP 1800+
ASUS A7V333
512 MB DDR-2700
Gainward GeForce 4 TI 4400
Sound Blaster Audigy

I've tried numerous times to install the NVIDIA driver for Linux, using both the RPM and the tar.gz. Every install succeeded.

I read the previous message about testing the driver, so I tried it.
I typed : test -c /dev/.decfsd;echo $?
and the result is 1.

My questions are :
1. What does this test do, and what is the result telling us?
2. If the result means that the nvidia driver is not installed, how do I fix it? I've installed it numerous times without any error.

I tried to change my video card, but I couldn't. Can anyone please kindly advise me on this one?

Thank you so much in advance.

bwkaz
12-06-02, 11:18 PM
Originally posted by Newton
The NVdriver is loaded - apparently without problems, except for a "kernel tainted" warning - and remains loaded after the X server aborts...

OK, so that seems to not be the problem at least...

I've looked at modules.conf and also at modules.devfs and both have a line "alias /dev/nvidia* NVdriver". As modules.devfs includes modules.conf, does this mean that the system tries to load NVdriver two times? Can this cause an error?

I doubt it. I don't have that alias in modules.devfs, so you might want to take it out, but I really don't think that would cause something like this...

With the system the way it currently is (with either one or both aliases there), but with the NVdriver not loaded, what does ls /dev/nvidia0 come back with? It *should* load the NVdriver module, then list the /dev/nvidia0 device, so check lsmod after you do it as well.
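If you'd rather script the lsmod part of that check, here is a minimal sketch. The function name module_loaded is made up for illustration; it only assumes the usual lsmod layout, with the module name in the first column.

```sh
#!/bin/sh
# Hypothetical helper (not part of any driver package): reads lsmod-style
# output on stdin and succeeds if the named module appears in column one.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Usage, after touching the device file so that autoloading gets a chance:
#   ls /dev/nvidia0
#   lsmod | module_loaded NVdriver && echo "NVdriver is loaded"
```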

Those XFree86.0.log errors all point to something wrong with the kernel module's load, or something like that...

Can I increase the verbosity level of startx or debug the loading of X to obtain any further information that could help solve the problem?

You can try startx -- -logverbose 99 (the -- passes the option through to the X server), but I have a feeling that won't give you anything more. It might end up being worth a shot, though.

Is there anything in either the output of dmesg, or in /var/log/messages, that might give a clue as to why the kernel module won't initialize?
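A rough sketch of how to narrow that search down. The function name nv_log_lines and the patterns are my own guesses at how the module tags its messages (nvidia, NVdriver, or an NV followed by a number and a colon), so widen the pattern if nothing turns up.

```sh
#!/bin/sh
# Hypothetical filter: keep only the log lines that look like they came
# from the nVidia kernel module. The prefixes matched here are assumptions.
nv_log_lines() {
    grep -i -E 'nvidia|nvdriver|(^|[^[:alnum:]])NV[0-9]+:'
}

# Usage:
#   dmesg | nv_log_lines
#   nv_log_lines < /var/log/messages
```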

bwkaz
12-06-02, 11:39 PM
Originally posted by antonius_r3
I tried to change my resolution from the CONTROL CENTER > HARDWARE, but I couldn't find my Video Card listed.

Yep. That's because Mandrake doesn't understand how to set it up properly. ;) If the card had open-source drivers that Mandrake could legally distribute themselves, then the distro would understand this card, but as it is, the default "nv" driver will not work with the GF4 series cards (although I'm not sure about the GF4MX ones), so they don't list it as one of your choices.

I read the previous message about testing the driver, so I tried it.
I typed : test -c /dev/.decfsd;echo $?
and the result is 1.

/dev/.decfsd or /dev/.devfsd? Was the c a typo when you posted, or when you ran the command? If it was a c when you ran the command, then that is exactly what I would expect to see, but if it was a v and still printed 1, then something is a bit strange.

1. What does this test do, and what is the result telling us?

Maybe I should have gone into this from the beginning. All it's doing is testing whether the /dev/.devfsd file (which is a hidden file under /dev) is a character device-type file. If it's there, it's almost assuredly a character-device file, so maybe the test is a little overboard, but it doesn't really matter. The thing is, if that file is a char device, then the system currently has devfs, the device filesystem, mounted on /dev, and is using devfs and the associated user-space daemon to manage its device files.

Now, this wouldn't make any difference at all, except for a couple of things about devfs -- if the module that services a device file isn't currently loaded into the kernel, then the device file won't exist under devfs (with the older way of managing the device files, they were always there, even if you never had the applicable hardware installed). The other thing is that the vast majority of devfsd installations have it set up so that when a nonexistent file is opened under /dev, the devfsd daemon sees that, and runs /sbin/modprobe with the device file as the module name to load.

What this means is that if a program (like XFree86's "nvidia" driver) tries to open /dev/nvidia0 or /dev/nvidiactl, but the NVdriver kernel module isn't loaded, those files won't exist, and won't be openable. But devfsd will see that open request, and then run an /sbin/modprobe /dev/nvidia0 (or whatever the file is). If the alias /dev/nvidia* NVdriver is in the modules.conf file, then modprobe will understand that this specific request should be satisfied by loading the NVdriver kernel module. So it does that, and the /dev/nvidia0 (and /dev/nvidiactl) devices magically appear. So now, the program that opened them can continue, just as if nothing had happened.

The other side of this all is that if you use the alias that a non-devfs system would use (alias char-major-195 NVdriver), but still run devfs, then the proper module will never be loaded. Device files have a major and a minor number, and these are unique. All nVidia device files, for example, have major number 195, and if you create one manually, it has to have major number 195 (check the documentation for mknod if you're interested in how to do that, but it should never be needed). So if you were to manually create /dev/nvidia0, not by loading NVdriver, then nothing would be registered to handle that device, and when a program tried to communicate with the hardware through that file, the kernel (not devfsd) would realize this. It would then take the major number, and modprobe it -- the end result is an /sbin/modprobe char-major-195. So if you have an alias char-major-195 NVdriver in your modules.conf, the correct backend code will be loaded, and the program can continue. However, before any of this can happen, the device file has to exist. And if you use devfs, it won't (the /dev directory gets mostly blown away on every reboot).
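To see the major number of an existing device file for yourself, something like this should do. It's only a sketch: dev_major is a made-up name, and it assumes GNU stat (its %t format prints the major number in hex).

```sh
#!/bin/sh
# Hypothetical helper: print the major number of a device file in decimal.
# Assumes GNU stat, whose %t format prints the major number in hex.
dev_major() {
    printf '%d\n' "0x$(stat -c '%t' "$1")"
}

# Usage (as root, and only if the node is really missing on a non-devfs system):
#   mknod /dev/nvidia0 c 195 0    # minor 0 for the first card is an assumption
#   dev_major /dev/nvidia0        # should then print 195
```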

So, the reason I tell people to run that test -c /dev/.devfsd command is to find out whether they're running devfs or not. Because if they aren't (if it prints 1), then they need to use alias char-major-195 NVdriver in their modules.conf. If they are (if it prints 0), then they need to use alias /dev/nvidia* NVdriver in their modules.conf file.
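Put together as a small sketch, the decision looks like this. The function name nv_alias_line is hypothetical, and the optional path argument is there only so the logic can be tried against other files; it prints the line rather than editing modules.conf for you.

```sh
#!/bin/sh
# Hypothetical helper: print the modules.conf alias line that matches the
# system. On a real system, leave the marker at its default, /dev/.devfsd.
nv_alias_line() {
    marker="${1:-/dev/.devfsd}"
    if test -c "$marker"; then
        # devfs is mounted: devfsd will modprobe the device path itself
        printf '%s\n' 'alias /dev/nvidia* NVdriver'
    else
        # static /dev: the kernel will modprobe by major number
        printf '%s\n' 'alias char-major-195 NVdriver'
    fi
}

# Usage:
#   nv_alias_line
```

Compare what it prints with what is already in your modules.conf before changing anything.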

If you aren't sure whether the drivers are installed, you can grab the nv_check.sh script that I've improved on a bit from here (http://3dguios.resnet.mtu.edu/nv_check.sh). It may be gone in a couple weeks, though (3 weeks of Christmas break, and the network connection will be down for that time), and again around May (but then it'll be permanent -- I'll graduate). So grab it soon if you want it. ;)

Run it with sh nv_check.sh from the directory you download it to.

Newton
12-08-02, 04:33 AM
Hi,
I've followed your suggestions, and checking dmesg gave the clue to the problem.
In /var/log/messages I found the following lines:
nvidia: Can't find an IRQ for your NVIDIA card!
nvidia: [Plug & Play OS] should be set to NO
nvidia: [assign IRQ to VGA] should be set to YES
NV0: isr request failed 0xfffffff0

Then, in my BIOS, I've set Plug&Play to NO (which is a little bit annoying if we have a dual-boot machine with Windows and Linux) and set my "Allocate IRQ to PCI VGA" to yes.
However, I don't understand why this last flag is necessary, as the graphics card is on the AGP and not in a PCI slot...
On my next reboot I will check if both flags are really necessary...

Anyway the problem is solved. Thank you again for your help

Newton

bwkaz
12-08-02, 07:48 AM
You can leave PnP OS set to no when Windows boots. That flag is pretty much one of the most mis-named flags in existence -- it does not control anything regarding PnP at all. All it does is, if it's set to yes, then your BIOS doesn't initialize all the hardware (and doesn't assign resources to everything) when you boot. If it's set to no, then it does initialize everything and assign all the resources itself.

The thinking is that if you have a PnP OS, then it can do the BIOS's job and assign resources itself. However, this isn't done by any OS except Windows, and then only 98 (95, maybe, but I don't think it did it well), Me, 2K, and XP. NT4 wouldn't do it.

Turning off PnP OS is usually the first thing I do after building (or buying -- but usually building, lately) a system. I figure it's better to have one entity initializing everything than to have two different databases of which device has what for resources.

The other setting, "allocate IRQ to PCI VGA", is slightly misnamed as well. On most motherboards that I've seen that have that setting, the PCI part is left out, just like in the error in dmesg. Of course, the AGP bus is just a sub-bus (logically -- not physically) of the PCI bus anyway, which is why all the AGP video cards have PCI IDs assigned to them, so maybe that's what they meant. In any case, glad you got it working.