Old 10-19-02, 05:04 PM   #1
Neoprene
Registered User
 
Join Date: Oct 2002
Posts: 13
Exclamation rh7.3 kernel 2.4.18-17.7.x.athlon.rpm and NV 3123 crashes hard

I never had any problems installing NV drivers until now (OK, one exception: the hardcoded 64 meg mem limit).
After up2date'ing my kernel in rh7.3 from 2.4.18-10 (where 3123 was working fine) to 2.4.18-17.7.x, I first tried to rpm -Uvh packages built with rpm --rebuild from the src.rpm's (which used to work fine with 2.4.18-10). Now I can neither install the .src.rpm drivers NOR uninstall them.
I then downloaded the .tar's and tried to "make" them, which "seems" to work, BUT upon reboot I get a serious lockup (with the Caps Lock and Scroll Lock LEDs blinking on the keyboard) and sometimes a spontaneous reboot when trying to boot kernel image 2.4.18-17.7.x.
If I boot with the "linux.bak" (2.4.18-10) it all works fine....
....until I try to reinstall the 3123 again. Then the "linux.bak" (2.4.18-10) will not work either, but at least the system returns to runlevel 3 and does not crash.

Here's my attempt to "compile" the NV driver with various log-files at the moment of crash:

[root@w1 NVIDIA_kernel-1.0-3123]# make >textfile
In file included from nv-linux.h:75, from nv.c:14:
/lib/modules/2.4.18-17.7.x/build/include/linux/highmem.h: In function `bh_kmap':
/lib/modules/2.4.18-17.7.x/build/include/linux/highmem.h:20: warning: pointer of type `void *' used in arithmetic
In file included from nv-linux.h:75, from os-interface.c:25:
/lib/modules/2.4.18-17.7.x/build/include/linux/highmem.h: In function `bh_kmap':
/lib/modules/2.4.18-17.7.x/build/include/linux/highmem.h:20: warning: pointer of type `void *' used in arithmetic
In file included from nv-linux.h:75, from os-registry.c:14:
/lib/modules/2.4.18-17.7.x/build/include/linux/highmem.h: In function `bh_kmap':
/lib/modules/2.4.18-17.7.x/build/include/linux/highmem.h:20: warning: pointer of type `void *' used in arithmetic
[root@w1 NVIDIA_kernel-1.0-3123]#
(ouch!)
[root@w1 NVIDIA_kernel-1.0-3123]# cat textfile
echo \#define NV_COMPILER \"`cc -v 2>&1 | tail -1`\" > nv_compiler.h
cc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts -Wparentheses -Wpointer-arith -Wcast-qual -Wno-multichar -O -MD -D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DNTRM -D_GNU_SOURCE -DRM_HEAPMGR -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE -DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=3123 -DNV_UNIX -DNV_LINUX -DNVCPU_X86 -I. -I/lib/modules/2.4.18-17.7.x/build/include -Wno-cast-qual nv.c
cc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts -Wparentheses -Wpointer-arith -Wcast-qual -Wno-multichar -O -MD -D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DNTRM -D_GNU_SOURCE -DRM_HEAPMGR -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE -DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=3123 -DNV_UNIX -DNV_LINUX -DNVCPU_X86 -I. -I/lib/modules/2.4.18-17.7.x/build/include -Wno-cast-qual os-interface.c
cc -c -Wall -Wimplicit -Wreturn-type -Wswitch -Wformat -Wchar-subscripts -Wparentheses -Wpointer-arith -Wcast-qual -Wno-multichar -O -MD -D__KERNEL__ -DMODULE -D_LOOSE_KERNEL_NAMES -DNTRM -D_GNU_SOURCE -DRM_HEAPMGR -D_LOOSE_KERNEL_NAMES -D__KERNEL__ -DMODULE -DNV_MAJOR_VERSION=1 -DNV_MINOR_VERSION=0 -DNV_PATCHLEVEL=3123 -DNV_UNIX -DNV_LINUX -DNVCPU_X86 -I. -I/lib/modules/2.4.18-17.7.x/build/include -Wno-cast-qual os-registry.c
ld -r -o Module-linux nv.o os-interface.o os-registry.o
ld -r -o NVdriver Module-linux Module-nvkernel
size NVdriver
text data bss dec hex filename
894487 55476 52396 1002359 f4b77 NVdriver
NVdriver installed successfully.

(not really)

(I then downloaded the 2960 drivers which gave me the same results.)

The /var/log/XFree86.0.log last lines are:
...
(II) Module fbdevhw: vendor="The XFree86 Project"
compiled for 4.2.0, module version = 0.0.2
ABI class: XFree86 Video Driver, version 0.5
(II) LoadModule: "glx"
<end>

The /var/log/messages ends with :
...
Oct 18 18:10:45 w1 insmod: Warning: loading /lib/modules/2.4.18-17.7.x/kernel/drivers/video/NVdriver will taint the kernel: non-GPL license - NVIDIA
Oct 18 18:10:45 w1 insmod: See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Oct 18 18:10:45 w1 insmod: Module NVdriver loaded, with warnings
Oct 18 18:10:45 w1 kernel: nvidia: loading NVIDIA NVdriver Kernel Module 1.0-2960 Tue May 14 07:41:42 PDT 2002
Oct 18 18:12:18 w1 syslogd 1.4.1: restart.
...
GLX install reports no issues.
My system is an A7V333, 1800+, GF3-Ti200-128MB, nothing OC'd, rh7.3.

Last edited by Neoprene; 10-20-02 at 07:16 PM.
Old 10-20-02, 01:46 PM   #2
Neoprene
Registered User
 
Join Date: Oct 2002
Posts: 13
Default https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=73733

https://bugzilla.redhat.com/bugzilla...g.cgi?id=73733

Another oops from Nvidia.

Fix this please.
Old 10-20-02, 03:33 PM   #3
bwkaz
Registered User
 
Join Date: Sep 2002
Posts: 2,262
Default

Every error message you have quoted above is not, in actuality, an error message. Every last one of them is merely a warning.

And that bugzilla entry isn't relevant either, it's merely a catch-all for every nVidia-related bugreport that RedHat gets. It's not an "oops from nVidia". It's a generic bug that RedHat points people to to say that they don't support the nVidia drivers, at all, because of the nature of those drivers (and the fact that RH can't change them as they see fit, like they do to everything else *cough* the kernel *cough*). Can you tell that I don't like some things they do? -- of course, I can't blame them for having this entry, as they can't see nVidia's source just like everybody else can't, but I don't like the way they patch the kernel willy-nilly, the way they made some deep patches to gcc that stopped MPlayer (and the Linux kernel, for a while) from compiling properly back around RH 6.something, and a couple others. Then they blame those (legitimate) bugs on the fact that the users are trying to use closed-source drivers...

Anyway, the problem might be that the NVdriver you're modprobe'ing has to be compiled against the kernel-source that exactly matches the kernel version you're running. You said you up2date'd your kernel. Did you also up2date kernel-source? Did you reboot into the new kernel (and make sure the kernel-source configuration matched the running kernel, see the Building NVIDIA_kernel for RedHat 8.0 thread) before attempting to recompile the NVdriver? (This is something that I'm not sure is in the documentation, but you might easily need to be running the target kernel to be able to compile a correct NVdriver. I know that's what I always do, and it works -- but then, I do a lot of things differently from a lot of people, so that might have something to do with it as well.)
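A rough sketch of that version check in shell, in case it helps. It assumes a 2.4-style source tree where include/linux/version.h defines UTS_RELEASE; the function names are made up for illustration. Red Hat's header may define more than one UTS_RELEASE variant (picked by the preprocessor), so a match only says the running release is among the candidates, not that the configuration matches.

```shell
# Print every UTS_RELEASE string defined in a kernel version header ($1).
list_uts_releases() {
    sed -n 's/^[[:space:]]*#[[:space:]]*define[[:space:]]\{1,\}UTS_RELEASE[[:space:]]\{1,\}"\([^"]*\)".*/\1/p' "$1"
}

# Succeed if release $1 is one of the UTS_RELEASE values in header $2.
release_in_header() {
    list_uts_releases "$2" | grep -Fqx -- "$1"
}

# Hypothetical usage on the machine in question:
#   hdr=/lib/modules/$(uname -r)/build/include/linux/version.h
#   release_in_header "$(uname -r)" "$hdr" && echo "source tree matches" || echo "mismatch"
```

If the check reports a mismatch, the module was most likely built against a tree for a different kernel than the one that is running.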

As a side note, this is exactly the kind of situation where booting right to a GUI is a really bad idea. You need to be able to boot your updated kernel to get your drivers to work, but if you need your drivers working to be able to boot the updated kernel, you'll have problems. That's why I never set any distro to boot to a GUI -- although you can alleviate the problem by just setting it to not boot to a GUI the first time the new kernel boots.
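On a SysV-init setup like Red Hat 7.3, the default runlevel lives in the id:N:initdefault: line of /etc/inittab (5 is the GUI login, 3 is text mode). A minimal sketch for inspecting and changing it follows; the function names are my own, and the second function only prints the edited file so nothing is touched until you decide to copy the result into place.

```shell
# Print the default runlevel from an inittab-style file ($1).
default_runlevel() {
    sed -n 's/^id:\([0-6]\):initdefault:.*/\1/p' "$1"
}

# Print the file ($1) with the default runlevel changed to $2.
# The file itself is not modified.
with_default_runlevel() {
    sed "s/^id:[0-6]:initdefault:/id:$2:initdefault:/" "$1"
}

# Hypothetical usage (as root, after keeping a backup):
#   default_runlevel /etc/inittab
#   with_default_runlevel /etc/inittab 3 > /tmp/inittab.new
```

With runlevel 3 as the default, a broken X setup leaves you at a text login instead of a crash loop, and you can still run startx by hand.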

Where is /lib/modules/2.4.18-17.7.x/build pointing? ls -l it to see. It should be pointing at the root of your 2.4.18-17.7.x source tree.

I'm curious, what did RedHat fix between 2.4.18-10 and 2.4.18-17.7.x that you need?

Quote:
NVdriver installed successfully.

(not really)
It was actually installed successfully. Because those errors you saw aren't errors, they're warnings. The compiler and the make utility can't possibly know whether or not a certain piece of code might cause lockups when in use (although if LEDs are blinking, that's not a lockup -- start X once from runlevel 3, text mode, to see what the problem really is), all they can do is tell whether the thing built without errors (which it did) and whether the cp command that put it in the right /lib/modules/<whatever> directory succeeded (which it did). So it says "installed successfully" (which it was -- problems in the code or in the configuration aren't attributable to the installation, which is just the copying process).
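To make the warning/error distinction concrete, here is a small sketch for checking a saved build log. It assumes you capture both stdout and stderr (a plain "make >textfile" only saves stdout, which is why the warnings still showed up on the terminal). The function names are hypothetical, and the failure test looks for make's own "*** ... Error N" line rather than an "error:" prefix from the compiler.

```shell
# Count compiler warning lines in a build log ($1).
count_warnings() {
    grep -c 'warning:' "$1"
}

# Succeed if make reported a failed target in the log ($1),
# i.e. a line such as "make: *** [nv.o] Error 1".
build_failed() {
    grep -Eq '^make(\[[0-9]+\])?: \*\*\* ' "$1"
}

# Hypothetical usage:
#   make > build.log 2>&1; status=$?
#   echo "make exit status: $status, warnings: $(count_warnings build.log)"
#   build_failed build.log && echo "build FAILED" || echo "build completed"
```

The exit status of make is the most direct signal: zero means every step, including the final copy, reported success.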

Well, anyway, I do hope you can find some better errors so I can help (and I hope that once you do find them, I can indeed help). And I don't want you to take any of this personally; how could you possibly know about a lot of it?
__________________
Registered Linux User #219692

Last edited by bwkaz; 10-20-02 at 03:42 PM.
Old 10-20-02, 05:10 PM   #4
Neoprene
Registered User
 
Join Date: Oct 2002
Posts: 13
Default

Thanks for your thoughtful reply, bwkaz.
I have installed every new Linux driver from NV for the last ~3 years w/o incident, so I've been used to not digging too deeply into this.
The newer kernel gets rid of some error messages, supposedly from my ATA133 drive: "hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x84 { DriveStatusError BadCRC }", which some people on Bugzilla reported to have wiped out a drive, but for me there were no problems except the annoyance. Also, a persistent message saying "kmod: failed to exec /sbin/modprobe -s -k scsi_hostadapter, errno = 2", which appeared after I removed an Adaptec 2906 card, disappeared with the kernel update. Earlier kernel updates included drivers for my ATA raid and who knows what else.

Yes, "up2date" did install the kernel-source .rpm along with the new kernel.

[root@w1 root]# ls -l /lib/modules/2.4.18-17.7.x/build
lrwxrwxrwx 1 root root 36 Oct 18 03:48 /lib/modules/2.4.18-17.7.x/build -> ../../../usr/src/linux-2.4.18-17.7.x

<LEDs are blinking, that's not a lockup> I don't know what to call a non-responsive keyboard. The only button that worked was the Reset.

"modprobe NVdriver" succeeds and "lsmod" lists the NVdriver correctly.
I use CTRL+x and enter linux 3 to get to runlevel 3 to check loading of modules etc.
After doing this and editing "load GLX" and "nvdriver" back in, I still get a reboot on "init 5" after a few colored screen flashes.

Last edited by Neoprene; 10-20-02 at 05:39 PM.
Old 10-20-02, 07:27 PM   #5
bwkaz
Registered User
 
Join Date: Sep 2002
Posts: 2,262
Default

"nvdriver"? That shouldn't appear anywhere in your config file... Would you mind posting it -- the version that's stopping keyboard response?

While the LEDs are going at it, does Ctrl-Alt-Backspace do anything (like, for example, kill off X)? The reason I was saying it wasn't a lockup is because your LEDs are controlled by the kernel, not the keyboard or whatever. So something is still working right, because something has to tell the keyboard controller to turn on and off the LEDs. But whether or not it's responding to any keypresses is another story...

What about Alt-SysRq-K? (SysRq should be your print screen key)

Were you trying to startx while in runlevel 3? If so, can you get to somewhere where you can post the log file (/var/log/XFree86.0.log) from that attempt?
__________________
Registered Linux User #219692
Old 10-20-02, 07:39 PM   #6
Neoprene
Registered User
 
Join Date: Oct 2002
Posts: 13
Default

The config is /etc/X11/XF86Config-4. It has Driver "nvidia", not "nvdriver", sorry.
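For anyone following along, the settings in question can be checked with a quick shell sketch like the one below. It assumes the usual NVIDIA setup of that era: Driver "nvidia" in the Device section, Load "glx" in the Module section, and no Load "dri" or Load "GLcore" lines. The function names are made up, and the check only looks at uncommented lines; it does not verify which section a line is in.

```shell
# Succeed if the config file ($1) has an uncommented line starting with ERE $2.
# Matching is case-insensitive.
config_has() {
    grep -Eiq "^[[:space:]]*$2" "$1"
}

# Print "ok" or the first problem found in the X config file ($1).
check_nvidia_xconfig() {
    config_has "$1" 'Driver[[:space:]]+"nvidia"' || { echo 'missing: Driver "nvidia"'; return 1; }
    config_has "$1" 'Load[[:space:]]+"glx"' || { echo 'missing: Load "glx"'; return 1; }
    if config_has "$1" 'Load[[:space:]]+"(dri|GLcore)"'; then
        echo 'conflict: Load "dri" or Load "GLcore" is still present'
        return 1
    fi
    echo ok
}

# Hypothetical usage:
#   check_nvidia_xconfig /etc/X11/XF86Config-4
```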

The keyboard thingy only happened a couple of times.

As I posted earlier the /var/log/XFree86.0.log ends at:
(II) LoadModule: "glx"

and /var/log/messages ends with:
Oct 18 18:10:45 w1 kernel: nvidia: loading NVIDIA NVdriver Kernel Module 1.0-2960 Tue May 14 07:41:42 PDT 2002


The same result happened with 3123.
Old 10-20-02, 10:11 PM   #7
bwkaz
Registered User
 
Join Date: Sep 2002
Posts: 2,262
Default

Well you probably don't want to hear this, but I'm out of ideas...

As a last-ditch effort, maybe backing down on the AGP rate might help? See if you can change it in your BIOS or something.
__________________
Registered Linux User #219692
Old 10-20-02, 10:17 PM   #8
Neoprene
Registered User
 
Join Date: Oct 2002
Posts: 13
Default

I guess NVidia will "update" their drivers before long.
Meanwhile I have a spare 100 gig W-D JB drive that I'm installing rh8.0 on.
I guess I'll avoid updating to that kernel with rh8.0.

Old 10-22-02, 08:45 PM   #9
Neoprene
Registered User
 
Join Date: Oct 2002
Posts: 13
Default

Red Hat 8.0 with the updated (athlon) kernel 2.4.18-17.8.0 and the 3123 nvidia drivers works.
The 2.4.18-17.7.x kernel in Red Hat 7.3 is a NO GO.
All times are GMT -5.