X crashes, no errors.


darkchiild
10-17-02, 05:18 PM
Problem:

After running startx, the screen goes black for about 30 seconds, then a red ANSI block appears in the upper left-hand corner. After that it's a hard crash. I can't do anything but reboot.

Installation:

I used the tar files to install

NVIDIA_GLX-1.0-3123.tar.gz and
NVIDIA_kernel-1.0-3123.tar.gz

Everything seemed to install successfully, the kernel module loaded properly, etc.

I edited the XF86Config file as specified in the docs, and even played with things a little to no avail (changing resolution, depths, etc).

Errors:

There don't appear to be any error or warning messages in the X log file. I tried using the -logverbose option specified in the docs, e.g.:

XFree86 -logverbose

Not sure if that's what's supposed to do it or not; I'm still fairly new to Linux. In any event, it just seems to stop processing halfway through. No errors or anything.

Hardware:

Athlon 1800XP
Asus A7V w/ Ali Chipset rev 1006
RH8.0 knl ver: 2.4.18-14
XFree v 2.2 Gnome

BTW, I tried setting the AGP 'fast write' (turbo doesn't exist) in the BIOS to off already. Didn't do the trick.

I'm at a dead end. I've tried everything I can think of. I read through the docs, and there doesn't seem to be anything else that applies to me. I scanned the newsgroups, and found nothing there... So here I am. Any help anybody can throw my way would be greatly appreciated.

Going to attach X logs.

bwkaz
10-17-02, 07:23 PM
According to your log, you aren't loading glx. I don't know if it will help with this problem, but you might try it.

I think the problem where the log seems cut off is because it isn't getting written out to the actual disk before the lockup (or the reset, if you're hitting reset pretty fast). Try waiting a bit longer (if you're already waiting like 30 seconds or so, then this isn't a good solution), or try hitting Alt-SysRq-S (SysRq is your print screen key, or at least, it should be), waiting a while (20 seconds or so), then Alt-SysRq-B. If the Alt-SysRq-B doesn't reset your box, then the first combination (which is supposed to sync your disks) didn't do its thing either, and you might as well just hit reset.
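The Alt-SysRq combinations above only do anything if the kernel's magic SysRq support is switched on. The following is a minimal sketch for checking that, assuming the kernel exposes the setting at /proc/sys/kernel/sysrq; the function name sysrq_state and its optional path argument are made up here for illustration.

```shell
# Report whether magic SysRq looks enabled.
# $1 is optional: path to the control file (defaults to the usual /proc location).
sysrq_state() {
    ctl="${1:-/proc/sys/kernel/sysrq}"
    if [ ! -r "$ctl" ]; then
        echo "unavailable"      # file missing: kernel probably built without SysRq support
    elif [ "$(cat "$ctl")" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Usage:
#   sysrq_state
# To switch it on until the next reboot (as root):
#   echo 1 > /proc/sys/kernel/sysrq
```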

Which AGP implementation are you using? agpgart, or NvAgp?

Does dropping the AGP rate down to 2x help?

chrono86
10-17-02, 07:42 PM
I GET THE SAME EXACT PROBLEM. Except that my screen doesn't go red; it flashes blue 3 times, then hangs. My log file doesn't record any errors and it makes no FREAKING sense. I tried EVERYTHING with nothing but a pointlessly reinstalled GNU/Linux system to show for it. Our hardware isn't that similar. I have:

AMD K6-2 350MHz
Geforce 2 MX 400 SDRAM 64MB
VIA 829C*** chipset
and everything else doesn't matter.

Since you are using an Athlon processor, it might be that mem=nopentium line you can add, but you might have tried that already. Maybe it's the AGP rate like bwkaz said, although I'm not sure how to find out what rate an AGP card is running at (anyone have a tip for that?). Oh yeah, also, when you ran the "nv" driver before, did XFree86 run? Nice to know someone else is sharing my problem. My post was at http://www.nvnews.net/vbulletin/showthread.php?threadid=2742
-rian

darkchiild
10-18-02, 08:42 AM
Well, thanks for the suggestions, ppl. It turns out that changing the AGP rate from 4X to 1X in the BIOS did the trick. I'm able to load Xwindows w/ the nVidia drivers....

Thing is, this doesn't strike me as a viable solution to the problem. I don't want to suffer a performance hit when booting into Windows for gaming, etc., and I'd rather like to be able to have 4X support in Linux as well.

I was thinking if I used NvAGP that may help? Also, I'm not sure how to find out if agpgart is statically compiled into the kernel, etc. It doesn't show with lsmod.

I tried flashing the BIOS to rev 1009 too, but that didn't do any good. I also didn't see anything in the BIOS after upgrading that mentioned IO recovery time (this was mentioned in the docs for Ali chipsets running the rev 1009 BIOS).

So here are my remaining questions/concerns:

If I AM running agpgart, is there any way to force AGP 1X mode so I don't have to recompile my kernel to use nvagp to set it in the xf86config?

How do I find out what's statically compiled into the kernel? Specifically agpgart? (Remember I'm a stoopid newbie)

Why don't I have this problem when using the default Xwindows drivers (this is more a curiosity than anything else) ?

Are there any possible work arounds so I don't have to use 1X mode at all? (besides buying another motherboard)

Why does Ali have to suck so bad? :mad:

Thanks again to anybody with answers.

chrono86
10-18-02, 08:56 AM
Hey, you can dynamically change your AGP rate when GNU/Linux loads, so that it can stay at 4x in your BIOS. Here is one little tidbit from the README:

AGP Rate

You may want to decrease the AGP rate setting if you are seeing
lockups with the value you are currently using. You can do so
with the NVreg_ReqAGPRate NVdriver module parameter.

If you are inserting the module manually:

insmod NVdriver NVreg_ReqAGPRate=2 # force AGP Rate to 2x
insmod NVdriver NVreg_ReqAGPRate=1 # force AGP Rate to 1x

If you are using modprobe (/etc/modules.conf):

alias char-major-195 NVdriver
options NVdriver NVreg_ReqAGPRate=2 # force AGP Rate to 2x
options NVdriver NVreg_ReqAGPRate=1 # force AGP Rate to 1x

I'm a Linux newbie too, so I don't know how to do any of that stuff, but if you say lowering your AGP rate works, then it should work for me! This is one step closer to solving my problem, except my motherboard's BIOS doesn't allow you to change the AGP rate (although I know my AGP card is running at 66 MHz; does that mean 1x? anyone?). I'm sure someone who knows about modules and such will reply with the answer. Thanks again!
-rian

bwkaz
10-18-02, 12:47 PM
I'm not sure how to check agpgart, unless you want to look at your kernel-source directory. Do a find /usr/src/linux-<version> -name .config to figure out where the right file is, and open it up in an editor. Assuming /usr/src/linux-<version> matches your running kernel, then if agpgart is compiled into your kernel, you'll find a line that says:

CONFIG_AGP=y

If it says CONFIG_AGP=m instead, then it was compiled as a module. If it says # CONFIG_AGP is not set, then agpgart wasn't compiled -- but I think this is unlikely.
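The check described above can be sketched as a small shell function instead of opening the file in an editor. The function name agp_config_state is made up for illustration, and the /boot/config-$(uname -r) path in the usage note is an assumption about where the distribution keeps a copy of the running kernel's config; substitute whatever .config file the find command turned up.

```shell
# Classify how agpgart was configured, given the path to a kernel .config file.
agp_config_state() {
    cfg="$1"
    if grep -q '^CONFIG_AGP=y' "$cfg"; then
        echo "built-in"         # compiled into the kernel; will never show up in lsmod
    elif grep -q '^CONFIG_AGP=m' "$cfg"; then
        echo "module"           # only shows up in lsmod once something loads it
    else
        echo "not set"
    fi
}

# Usage (paths are assumptions, adjust to your system):
#   agp_config_state /usr/src/linux-<version>/.config
#   agp_config_state "/boot/config-$(uname -r)"
```

The patterns match CONFIG_AGP itself only, so chipset lines such as CONFIG_AGP_AMD=y do not confuse the result.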

As for why you don't have the problem with the "nv" drivers, I don't know. It's probably something having to do with the fact that "nv" doesn't support 3D, though -- it's probably putting a lot less stress on your AGP chipset. Maybe.

chrono86 -- the way you pass those options to the module is almost always by editing the /etc/modules.conf file. Pick one of the options NVdriver xxxxxx lines, and add it to /etc/modules.conf. Then run /sbin/depmod -a (to update dependencies), and either reboot, or log out of X, remove the NVdriver module, and restart X to test. You can check X's log to see which AGP rate it ended up using.
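As a rough sketch of those steps: the helper below only reads back which rate an options line in modules.conf asks for, and the rest is shown as comments to run by hand. The function name configured_agp_rate is made up here, and /var/log/XFree86.0.log is assumed to be where X writes its log; the exact wording of the AGP line in that log is not something this sketch relies on.

```shell
# Print the NVreg_ReqAGPRate value requested in a modules.conf-style file, if any.
# $1 is optional: path to the file (defaults to /etc/modules.conf).
configured_agp_rate() {
    conf="${1:-/etc/modules.conf}"
    sed -n 's/^options[[:space:]]\{1,\}NVdriver[[:space:]].*NVreg_ReqAGPRate=\([0-9]\{1,\}\).*/\1/p' "$conf"
}

# Manual steps, as root, with X not running:
#   /sbin/depmod -a                        # update module dependencies
#   /sbin/rmmod NVdriver                   # unload the old instance, if loaded
#   startx                                 # driver reloads with the new option
#   grep -i agp /var/log/XFree86.0.log     # see what X reports about AGP
```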

darkchiild
10-18-02, 04:14 PM
I had a look at the file in question, and it appears that agpgart is being loaded as a module. However it still doesn't show up when doing an lsmod (even when doing it from Xwindows in a terminal window).

You'd think I'd be able to see it if it were loaded, right? In any event, where in the slew of startup config files that Linux has might I find the setting to disable it? Or how could I even disable it manually?

Chrono86: I saw what you suggested in the docs, and even tried it; however, it didn't work. I'm assuming that's because the kernel is already looking to agpgart for this setting?

Perhaps I should just give up and get a decent motherboard?

darkchiild
10-18-02, 04:51 PM
Well, after playing around with things a little more I discovered that agpgart wasn't being loaded, and X is using the NVdrivers.

For some reason however, X still crashes when I specify 1X mode in the modules.conf file. I checked the X logs, and it did say it was running in 1X mode. Problem is that even after specifying that (and it appears to be working), if I change the AGP mode to 4X in the bios, X still crashes on startup.

I tried loading agpgart as a module too, but that didn't seem to do any better than the nVidia drivers...

It's looking more and more like I'm just going to have to get a new mobo to get things working the way I want. Well, thanks for the help anyhow. At least now I KNOW I'm screwed.

bwkaz
10-18-02, 05:14 PM
Buggy motherboard BIOS perhaps? Maybe enabling 4x is triggering some bug?

chrono86
10-18-02, 05:33 PM
Ahh, this is my last question. One, how do I start the agpgart module (that is, only if I have it installed in my kernel)? And two, bwkaz, when you said earlier to "...remove the NVdriver...", what exactly do you mean by remove? Can't I just add those option lines to the /etc/modules.conf file (hehe, I got to know that file today after getting my sound card to work) and restart my system? Thanks again, and hopefully I finally get my video card running.
-rian

bwkaz
10-18-02, 07:07 PM
Yes, you can just restart, that will work fine.

You can load agpgart by doing an /sbin/modprobe agpgart as root. But you'll probably want to make sure X isn't running when you do this, because if it is, then NvAgp has probably been loaded already.
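To confirm that the modprobe actually took effect, the module list can be checked afterwards. This is a minimal sketch; the helper name module_listed is made up for illustration, and it simply reads lsmod-style output on standard input.

```shell
# Succeed (exit status 0) if module $1 appears in lsmod-style output read from stdin.
module_listed() {
    # NR > 1 skips the "Module  Size  Used by" header line
    awk -v m="$1" 'NR > 1 && $1 == m { found = 1 } END { exit !found }'
}

# Usage (as root, with X not running):
#   /sbin/modprobe agpgart
#   /sbin/lsmod | module_listed agpgart && echo "agpgart is loaded"
```

Note that a driver compiled directly into the kernel never appears in lsmod, so a miss here does not by itself mean AGP support is absent.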

chrono86
10-19-02, 12:43 PM
Okay, I don't get it. NOW I've tried EVERYTHING. My first approach was to load agpgart when I start my computer. (By the way, I have an MVP3 chipset, and I found out that CONFIG_AGP=m and there are a whole lot more CONFIG_AGP options, like CONFIG_AGP_AMD, and they are all set to yes.) So I edited my modules.conf and found "alias */dev/nvidia NVdriver" in there. I took that out because I didn't know what it was for. I also set Option "NvAgp" "2" in the Screen section of XF86Config-4.

So I rebooted and expected agpgart to load automatically. When I did an lsmod, agpgart wasn't listed. I had always gotten the impression that the nvidia driver for X starts the AGP module (be it NvAgp or AGPGART) automatically, so I thought that was what would happen. So I started X, and got the same problem. At the end of the X log file it said "Failed to verify AGP Usage"; my kernel's error file said "BLAH BLAH: AGPGART: FAILED TO USE write combining MTRR"; and in the X log I also noticed "(WW) System does not support changing MTRR".

So what the hell is MTRR, and what does it have to do with blinking blue lights? I figured I could just recompile the driver to use NvAgp instead of AGPGART by putting "make NVdriver BUILD_PARAMS=NOAGPART", but even then I'm still confused about a couple of things:

What was the NVdriver line in modules.conf for originally? Was that NvAgp or the NVdriver, and if it was the NVdriver, why does it need to be loaded at startup?

How come agpgart and NVdriver don't show up under lsmod when you load them in modules.conf, but do show up when you do modprobe agpgart or modprobe NVdriver?

That's it for now. If someone sees any fault in my original setup, please tell me; I really am frustrated and I want to get this thing to work. Thanks.
-rian

bwkaz
10-19-02, 02:53 PM
The CONFIG_AGP_AMD and family were just selecting which chipset families the agpgart module would support.

MTRRs are Memory Type Range Registers. They're registers on the processor, and they control caching policy and how memory is accessed (uncached, write-through, write-back, or write-combining). If your kernel doesn't have support for them (presumably because your kernel was configured without them for portability), the only thing you might see is slightly reduced performance.

That alias was probably incorrect -- it should probably have been alias /dev/nvidia* NVdriver, with the asterisk after the /dev/nvidia rather than before. It was there so that, if you use devfs/devfsd to manage your /dev files, the module would load when it was needed (when some application tried to open any file in /dev whose name started with nvidia), not on boot. The other alias line (alias char-major-195 NVdriver) also isn't there so that the driver loads on boot; it's so that the driver loads the first time it's needed (the first time a program sends a request to the nVidia device files -- if you don't use devfs, then the files are always there and this kind of handling works well).

agpgart is only loaded when either a program makes a request to /dev/agpgart (if you aren't using devfs), or when a program tries to open /dev/agpgart (if you are). If you have the NvAgp option set to 2, then the X driver should make its AGP requests to the /dev/agpgart file, and after the first one, the module will be loaded. Just out of curiosity, is there an agpgart.o file in the /lib/modules/<whatever kernel version you're currently running>/kernel/drivers/char/agp directory?

That original line was for NVdriver. Like I said above, it's not being loaded at startup, but it is being loaded whenever it's needed.

For the next question, again, you don't load anything by just putting an alias in modules.conf. That just tells the kernel (basically) when it's needed so that the kernel will load it. When you modprobe stuff, you're explicitly telling the kernel "I want this loaded now".

X is working then, right, using the "nvidia" driver? Or not? Does glxgears give decent frame rates now? (~1000-2000 fps is decent)

chrono86
10-20-02, 09:43 PM
Wow, thanks for clearing up the modules.conf file for me; I was really confused about that. Alright, I guess I've got it. I am using devfs, so that is probably why all those special settings are there. I can't check whether agpgart.o is in that folder you mentioned, but it probably is. As far as I know agpgart works. I tried loading X without AGP, but still nothing worked; I got the same blue flashes, a whole bunch of characters, and a hard freeze, haha. I'm guessing the problem is not completely AGP related; I think some of it is my kernel and system setup too (damn Mandrake 9).

I am still curious why AGPGART or NvAGP couldn't verify AGP usage, though; it doesn't make sense. I mean, my computer recognizes my card as an AGP card when I do a "cat /proc/pci". Hmm. There is one interesting thing I noticed, and this may be why X isn't working with the nvidia driver. When I do a "modprobe NVdriver" (which also "taints" my kernel, whatever that means) so that it makes status entries in the /proc filesystem, the /proc/driver/nvidia/agp/status file says that AGP is disabled or not currently functioning (those aren't the exact words, but a good approximation). What exactly does that mean? And if I do enable it, will the nvidia driver for X finally verify AGP usage?

My other question: is MTRR support determined by how my kernel was compiled, or is it part of my motherboard's capabilities? If my "system" can't support changing MTRRs, is that a limitation of my motherboard or of my kernel configuration? That is all for now, thanks.
-rian

ps. REPEAT X IS NOT WORKING WITH THE NVIDIA DRIVER YET REPEAT, hehe.

bwkaz
10-20-02, 10:41 PM
Oh, OK. X isn't working then. Right. ;)

The message that says the kernel will be "tainted" should direct you to a web page that explains what that means.

I do know that the "agp disabled" message and the "could not verify AGP" message are related, but I don't know what causes each one, or if one causes the other, or what. It's not that the chipset's unsupported; your VIA isn't all that new. Maybe a search through X's source... Hang on a minute... nope, nothing. Must be a message from the nvidia driver instead? I don't know what would cause that though -- no source. :(

Yeah, if you get one to work the other should as well. I just don't know how to do that...

The MTRR stuff is processor- and kernel-specific only. It really doesn't have anything to do with your motherboard. It's probably set up to be disabled because of Mandrake's policy of compiling for Pentium Classics. Those chips didn't have MTRRs, so Mandrake's default kernels won't have support for them either.
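One hedged way to see which side is missing: a kernel built with MTRR support normally provides a /proc/mtrr file, and the processor's capabilities are listed on the flags line of /proc/cpuinfo. The function name mtrr_support and its path arguments are made up for this sketch, and the flag match is deliberately loose (any flag containing "mtrr"), since the exact flag name can differ between processor families.

```shell
# Report kernel-side and CPU-side MTRR support.
# $1, $2 are optional: paths to the mtrr and cpuinfo files (default to /proc).
mtrr_support() {
    mtrr_file="${1:-/proc/mtrr}"
    cpuinfo="${2:-/proc/cpuinfo}"
    if [ -e "$mtrr_file" ]; then
        echo "kernel: yes"
    else
        echo "kernel: no"       # kernel most likely built without MTRR support
    fi
    if grep -q '^flags.*mtrr' "$cpuinfo" 2>/dev/null; then
        echo "cpu: yes"
    else
        echo "cpu: no"
    fi
}

# Usage:
#   mtrr_support
```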

You might be able to grab a kernel off www.kernel.org and see if it works better. Maybe... it's a sort of long shot.

chrono86
10-21-02, 04:20 PM
yeah thanks bwkaz, i might just do a LFS, or try recompiling a kernel (although this job seems a little scary since i'm terribly new to Linux). ACTUALLY i think i'll have a try with DEBIAN hehe. i'm sure i can accomplish something. Thanks for all your help, i really appreciated it.
-rian