nvidia troubleshooting application
Is anyone interested in helping write an nvidia troubleshooting application? My thought is that this application would try to diagnose potential stability issues in your system and suggest ways to fix them.
For instance, it could check your CPU model and kernel version and ensure that, if you have an Athlon, mem=nopentium has been set.
Anyone interested in helping? I was thinking that it would just be perl or shell scripts, called by a main program. I would host it on sourceforge and give CVS commit access to anyone interested. As well, I would build it in RPM format for easy install.
If interested, please email me at firstname.lastname@example.org.
Take a look at a preliminary version here:
Note that the above RPM is still an alpha release. Right now it checks for:
disabled Fast Writes
Athlon & mem=nopentium
Shared IRQs for nvidia card
Umm... I assume this is just a shell script, right?
Why package it all up in an RPM then?
Could you possibly put both versions up (RPM and the shell script itself) for people that are interested (like myself) but refuse to use RPM distros because they oftentimes do not work right? :D
And one other thing -- you don't need mem=nopentium anymore if you're running any kernel 2.4.19 or later. These kernels disable the "advanced speculative caching" feature on the Athlon XP and MP chips at bootup. I believe RH8's 2.4.18-X kernels also had the patch, but I haven't looked for sure (doesn't affect me -- see above about "refusing to use RPM distros" ;)).
You may still need it if you have a normal Athlon or a Duron, I don't know about that.
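To make the rule above concrete, here is a rough sketch of the decision, written in Python rather than the Perl the scripts actually use. The function names and the way the CPU model string is matched are assumptions for illustration only; it follows the post above in treating 2.4.19 as the cutoff for Athlon XP/MP, and it keeps flagging plain Athlons and Durons because nobody here is sure about them. It does not know about patched vendor kernels such as RH8's 2.4.18-X.

```python
import re

def kernel_at_least(release, minimum=(2, 4, 19)):
    """Compare the leading x.y.z of a kernel release string against a minimum."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not match:
        return False  # unknown format: assume an old kernel to be safe
    return tuple(int(part) for part in match.groups()) >= minimum

def needs_nopentium(cpu_model, kernel_release, cmdline):
    """Return True if a missing mem=nopentium should be flagged.

    cpu_model      -- 'model name' value from /proc/cpuinfo (format assumed)
    kernel_release -- output of `uname -r`
    cmdline        -- contents of /proc/cmdline
    """
    model = cpu_model.lower()
    if "athlon" not in model and "duron" not in model:
        return False  # not an affected CPU family
    if "mem=nopentium" in cmdline.split():
        return False  # already set
    is_xp_or_mp = "athlon" in model and re.search(r"\b(xp|mp)\b", model)
    if is_xp_or_mp and kernel_at_least(kernel_release):
        return False  # per the thread, 2.4.19+ handles this at boot
    return True
```

The inputs are passed in as strings so the logic can be tried on any machine; the real script would read them from /proc/cpuinfo, `uname -r`, and /proc/cmdline.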
Sure - I've put the .tar.gz up. It's a simple set of perl scripts right now. It's not really that smart, either. The goal is to collect all of the troubleshooting knowledge as a set of scripts that can go through the motions. I can add the bit about the kernel version and how mem=nopentium is not required past 2.4.18.
Couple of other minor things, then:
In the fastwrites script, if fast writes aren't supported by the card, they'll never be enabled by the driver. So I'm not sure that entire section is needed -- unless there's a good reason you put it there? Same thing is happening with the SBA script.
In the same script, you might want to check /proc/driver/nvidia/agp/host-bridge as well, since not all AGP bridges support fast writes. Same with the SBA script -- check host-bridge.
You might want to check that the /proc/driver/nvidia directory even exists. If not, then say something about "the nvidia kernel module needs to be loaded before you can run this test" or something.
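A rough sketch of those three suggestions, in Python rather than Perl. Only /proc/driver/nvidia and its agp/host-bridge file are named in this thread; the "Key: value" layout and the exact "Supported" wording are assumptions, and may well differ between driver versions. The helper names are made up for illustration.

```python
import os

NVIDIA_PROC_DIR = "/proc/driver/nvidia"  # directory named in the thread

def nvidia_module_loaded(base=NVIDIA_PROC_DIR):
    """The proc directory only exists once the nvidia kernel module is loaded."""
    return os.path.isdir(base)

def parse_fields(text):
    """Parse 'Key: value' lines into a dict (assumed file layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def feature_supported(card_text, bridge_text, feature="Fast Writes"):
    """True only if both the card and the host bridge report support.

    If either side lacks the feature, the driver will never enable it,
    so there is nothing for the script to complain about.
    """
    card = parse_fields(card_text).get(feature, "")
    bridge = parse_fields(bridge_text).get(feature, "")
    return card.lower() == "supported" and bridge.lower() == "supported"
```

The idea would be to call nvidia_module_loaded() first and print the "module needs to be loaded" message if it fails, then only report disabled fast writes or SBA when feature_supported() says both ends can do it.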
Is the multiple-IRQs script depending on the nVidia IRQ being XT-PIC? Because some systems have IO-APIC enabled (which generally gives more IRQs, though not always), and that field might be IO-APIC-edge or IO-APIC-level instead of XT-PIC on those systems. The exit condition in that script also seems a little weird...
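One way to avoid depending on XT-PIC is to skip over the controller column without looking at its value. A small Python sketch (not the Perl from the tarball), assuming the 2.4-style /proc/interrupts layout of "IRQ: counts... controller devices" and assuming the driver shows up under the name "nvidia"; both function names are invented for the example.

```python
def devices_by_irq(interrupts_text):
    """Map IRQ number -> list of device names from /proc/interrupts-style text."""
    result = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # header line, NMI/ERR rows, blank lines
        tokens = rest.split()
        # Drop the per-CPU interrupt counters (one per CPU column).
        while tokens and tokens[0].isdigit():
            tokens.pop(0)
        if not tokens:
            continue
        # tokens[0] is the controller (XT-PIC, IO-APIC-edge, IO-APIC-level, ...);
        # its value is deliberately ignored.
        names = " ".join(tokens[1:])
        result[int(label)] = [n.strip() for n in names.split(",") if n.strip()]
    return result

def shared_with(interrupts_text, driver="nvidia"):
    """Return the other devices on the driver's IRQ, or [] if none or not found."""
    for devices in devices_by_irq(interrupts_text).values():
        if driver in devices:
            return [d for d in devices if d != driver]
    return []
```

Returning the list of other devices, rather than exiting from inside the loop, also gives the caller a single obvious exit condition: an empty list means no sharing was detected.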
If my Perl skills were anything more than basic, I think the best way to write this would be using reusable function calls. I'm hoping that I can get enough of something for someone with some actual Perl experience to refactor it to something more useful. :)
I've also realized that it's quite difficult to guess the format of the /proc entries from computer-to-computer. This will probably take a bit of work to make complete.
In addition, I don't seem to have a definitive list of what "could" cause instability and what "will" cause instability. Right now I'm using "WARNING" and "ERROR", respectively, but I need more input to know if I'm using them correctly.
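For what it's worth, the reusable-function idea and the WARNING/ERROR split could fit together roughly like this. It is a Python sketch only, not how the Perl scripts are organised; every name in it is hypothetical, and the severities assigned to the two sample checks are placeholders, since the thread has no definitive list yet.

```python
from collections import namedtuple

Finding = namedtuple("Finding", "severity message")

WARNING = "WARNING"  # "could" cause instability
ERROR = "ERROR"      # "will" cause instability

def check_shared_irq(facts):
    """Placeholder severity: sharing an IRQ is treated as a WARNING here."""
    others = facts.get("irq_shared_with", [])
    if others:
        return Finding(WARNING, "nvidia shares an IRQ with: " + ", ".join(others))
    return None

def check_nopentium(facts):
    """Placeholder severity: a missing mem=nopentium is treated as an ERROR here."""
    if facts.get("nopentium_missing", False):
        return Finding(ERROR, "mem=nopentium is not set on the kernel command line")
    return None

def run_checks(checks, facts):
    """Run each check against the collected system facts and gather the findings."""
    findings = []
    for check in checks:
        finding = check(facts)
        if finding is not None:
            findings.append(finding)
    return findings
```

Each check is a plain function that takes the collected facts and returns either nothing or one finding, so changing a severity later means editing one line in one check.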
With any luck, this might be possible to do. ;)
Thanks for the suggestions,