drumz 07-24-04 04:57 PM

Xorg, Nvidia -> black screen of death.
4 Attachment(s)
Motherboard: Tyan Thunder K8W
CPU: AMD Opteron 244
OS: Gentoo, with latest updates.
Video Card: GeForce FX 5700
Kernel: gentoo-dev-sources 2.6.7-r11 (or ANY kernel, I've tried a bunch)
X server: Xorg (latest version provided by gentoo).
Nvidia driver: latest 1.0.6106 via gentoo ebuild.

Symptoms: Issuing the 'startx' command results in the black-screen-o-death, with X consuming all CPU cycles. I have to log in remotely to reboot, and the machine hard locks as soon as X is killed (whether by the shutdown or by hand). This happens when the nvidia driver attempts to use the AGP support built into the kernel. Everything works fine if the card is allowed to drop into PCI mode (but then I get slow desktop refreshes :-( ). Examining /proc/driver/nvidia/agp/* shows that in AGP mode everything looks good and should be working.
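
For anyone wanting to repeat that last check, here is a minimal sketch of reading the AGP state back from the driver. It assumes the status file lives at /proc/driver/nvidia/agp/status and contains a line of the form "Status: Enabled"; both the layout and the helper name agp_status are assumptions made for illustration, not something taken from the attached logs.

```shell
# Hypothetical helper: print the value of the "Status:" line from an
# NVIDIA AGP status file. The default path and the "Status: <value>"
# layout are assumptions about this driver generation.
agp_status() {
    file="${1:-/proc/driver/nvidia/agp/status}"
    # Strip the "Status:" label and the whitespace that follows it.
    sed -n 's/^Status:[[:space:]]*//p' "$file"
}
```

Run without arguments it reads the default path; a saved copy of the file can be passed as the first argument instead.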

What I've tried: Searched the gentoo lists and this list, and googled. Tried different kernel versions. Tried disabling things that others got working by turning off (USB2, multi-CPU support, ...). Tried different kernel boot parameters. Turned off IOMMU so I could disable AGP in the kernel and try the driver's built-in AGP support. Made various changes to my xorg.conf file. All of these show the same problem: black-screen-o-death if you try to use AGP. <sigh>

What I've attached: I've attached the output from nvidia's bug reporting script when the black-screen-o-death was in effect. I've attached the grub conf file I used. I attached my kernel config file (2.6.7-r11 of gentoo-dev-sources). I also attached my xorg.conf file.

Can someone please either tell me what I'm doing wrong, or tell me it will only work in PCI mode or that there's a bug that needs to be fixed in something first for it to work for me.

A most grateful thanks to anyone providing insight into this issue. I'm on the edge of just accepting PCI mode and living with it.


MightyPenguin 07-24-04 06:00 PM

Re: Xorg, Nvidia -> black screen of death.
First off if you're using a 2.6.x kernel make sure that /usr/src/linux is a symlink that points to it e.g. /usr/src/linux -> /usr/src/linux-2.6.7

It's possible you compiled the nvidia driver against kernel headers that belong to another kernel. If you don't want to change the symlink, you can use an option with the nvidia installer to tell it where to look for the kernel headers; run it with the --help flag to see what that option is and how to set it.
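
A rough way to spot such a mismatch is to compare the running kernel release with the directory the symlink resolves to. The sketch below assumes the source directory is named linux-<release> (as in the example above); the helper name src_matches_kernel is hypothetical.

```shell
# Hypothetical helper: succeed if the resolved kernel source directory
# is named "linux-<release>" for the given kernel release.
# $1 = kernel release (e.g. output of `uname -r`)
# $2 = resolved source path (e.g. output of `readlink -f /usr/src/linux`)
src_matches_kernel() {
    case "$(basename "$2")" in
        "linux-$1") return 0 ;;
        *)          return 1 ;;
    esac
}

# Typical use on a live system:
#   src_matches_kernel "$(uname -r)" "$(readlink -f /usr/src/linux)" \
#       && echo "sources match" || echo "sources differ"
```

If the two differ, the driver module was probably built against a different tree than the kernel that is actually running.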

Also, if you can recompile your kernel, disable io-apic and APIC in general. Also disable ACPI. Those two things made my stability much worse with nvidia drivers in the past, and the serious users I know refuse to use at least ACPI (io-acpi is still maturing).

It does appear this release adds a lot of new features, which may account for many of the problems it has. For me, 5336 worked just great.

drumz 07-24-04 06:55 PM

Re: Xorg, Nvidia -> black screen of death.
1. /usr/src/linux is linked to the proper kernel when I install it. And yes, after installing each kernel I've re-emerged nvidia-kernel nvidia-glx and ran opengl-update nvidia.

2. Turning ACPI off causes my system not to see the AGP.

3. I've tried passing the noapic kernel parameter, no difference.

Executing an 'emerge -p linux-headers' showed that for some reason there was an update/change that should have taken place but didn't. I've run that and am now re-emerging glibc as per the header instructions. I'll then recompile/install my kernel and nvidia drivers and see if this makes a difference. Hopefully it will, thanks for the idea.

4. I didn't check to see if the previous version of the drivers properly used agp, but with them I definitely couldn't turn on ACPI which results in not being able to auto power off the box on shutdown.
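
On point 3, a misspelled boot parameter is not reported as an error, so it can be worth confirming that the parameter really reached the kernel. A minimal sketch, assuming the running kernel exposes its command line in /proc/cmdline; the helper name cmdline_has is made up for illustration.

```shell
# Hypothetical helper: succeed if the given word appears as a standalone
# parameter on the kernel command line.
# $1 = parameter to look for, $2 = file to read (defaults to /proc/cmdline)
cmdline_has() {
    param="$1"
    file="${2:-/proc/cmdline}"
    # Split the command line on whitespace and compare word by word.
    for word in $(cat "$file"); do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

# Typical use:
#   cmdline_has noapic && echo "noapic is active"
```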

drumz 07-25-04 07:32 AM

Re: Xorg, Nvidia -> black screen of death.
Re-emerging the linux-headers and recompiling glibc had no effect. I still have the same black-screen-o-death.

Suggestions anyone?

drumz 07-26-04 06:08 AM

Re: Xorg, Nvidia -> black screen of death.
Ok, I've continued to play with kernel settings, etc. without any change. Any suggestions? Has anyone experienced this before and solved it (excluding the solutions I've tried that are listed above)?

MightyPenguin 07-26-04 05:17 PM

Re: Xorg, Nvidia -> black screen of death.
Only other things I can think of are:

Make sure you aren't using 4kb stacks in your 2.6.x kernel. (reports of AGP not working if you do have it enabled)
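
A quick way to check this is to grep the kernel config. The sketch assumes the option is named CONFIG_4KSTACKS and that the config sits at /usr/src/linux/.config; as far as I know the option is only offered on 32-bit x86, so on an x86-64 config the line may simply be absent. The helper name uses_4k_stacks is hypothetical.

```shell
# Hypothetical helper: succeed if the given kernel .config enables
# CONFIG_4KSTACKS. A missing or "is not set" line counts as disabled.
uses_4k_stacks() {
    grep -q '^CONFIG_4KSTACKS=y' "${1:-/usr/src/linux/.config}"
}

# Typical use:
#   uses_4k_stacks && echo "4K stacks enabled" || echo "4K stacks not enabled"
```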

You can try using the driver w/o 3d support by commenting out the "glx" line in your xorg.conf file (or XF86Config if you're on XFree86).

Not sure if this'll help, but you can try using APM instead of ACPI. It's not near as powerful, but I've not had near as many problems with it.

Overall, I still think this is just something we'll have to wait for the next driver release before it works.

Also, opterons are still bleeding edge, despite what anyone says. Look for bios updates for your motherboard, and any opteron/X86-64 specific kernel patches that haven't yet been applied to 2.6. Oh and you ARE using the x86-64 specific nvidia driver right? Not the generic ia32 one? Just checking.

drumz 07-27-04 06:14 AM

Re: Xorg, Nvidia -> black screen of death.
I'll take a look at the 4k stack issue, but the 6106 drivers are supposed to fix that (and I've seen others say it works now).

I did at one point try it without the glx turned on. Same problem, startx followed by the black screen.

Only problem with switching to APM is then ACPI is turned off and then it can't find my agp slot.

I agree, I think there's still a bug or something with the driver that's causing some issue between the driver, agp and X. Hopefully someone will be able to confirm it's a bug (and what it is) and nvidia can fix it in the next rev (or tell us what the workaround is). I'm happy with the card, and especially my mobo in 64bit mode (fast as greased lightning). Just disappointed I can't get the video card to go 'that extra step' into agp mode.

Thanks for the ideas, much appreciated!

lotheac 07-27-04 09:28 AM

Re: Xorg, Nvidia -> black screen of death.

Originally Posted by MightyPenguin
First off if you're using a 2.6.x kernel make sure that /usr/src/linux is a symlink that points to it e.g. /usr/src/linux -> /usr/src/linux-2.6.7

NO! Have you read the kernel readme?

INSTALLING the kernel:

- If you install the full sources, put the kernel tarball in a
directory where you have permissions (eg. your home directory) and
unpack it:

gzip -cd linux-2.6.XX.tar.gz | tar xvf -

Replace "XX" with the version number of the latest kernel.

Do NOT use the /usr/src/linux area! This area has a (usually
incomplete) set of kernel headers that are used by the library header
files. They should match the library, and not get messed up by
whatever the kernel-du-jour happens to be.

drumz 07-27-04 05:04 PM

Re: Xorg, Nvidia -> black screen of death.
I have tried a few more additional things without any luck. I've tried the 'mm' sources under gentoo. Same black screen. I've tried resetting the bios, same thing.

I'm out of options, so I give. Will have to live with it in PCI mode until either a new driver rev comes out or someone can troubleshoot the issue better than I.


drumz 07-31-04 01:57 PM

Re: Xorg, Nvidia -> black screen of death.
Not one for giving up (call me stubborn), I've done a fresh install of 32bit gentoo on the same hardware using a spare partition.

I have the SAME EXACT PROBLEM using a 32bit os as I do 64bit.

BTW, I've filed a bug report through the email address nvidia supplies for their linux drivers. What are the (realistic) chances of me getting any help from them?

SuLinUX 08-01-04 06:37 AM

Re: Xorg, Nvidia -> black screen of death.
I assume you tried NVAGP=3 ?

drumz 08-01-04 07:15 AM

Re: Xorg, Nvidia -> black screen of death.
Yes, and what happens is it detects the kernel's compiled-in/module agp support, tries to use it, and I get the black-screen-o-death. Everything reports it's ok, but X is consuming 100% CPU, and if you try to kill it, it locks the machine up hard.
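
For readers following along: the setting in question is the "NvAGP" option in the Device section of xorg.conf, where 0 disables AGP, 1 uses the driver's own AGP support, 2 uses the kernel's AGPGART, and 3 tries AGPGART first and then the driver's own. Below is a minimal sketch for reading back which value a config file actually sets. It assumes the option is spelled exactly "NvAGP" and that the config is at /etc/X11/xorg.conf; the helper name nvagp_value is hypothetical.

```shell
# Hypothetical helper: print the NvAGP value set in an X config file.
# Assumes the option is written as: Option "NvAGP" "<digit>"
nvagp_value() {
    file="${1:-/etc/X11/xorg.conf}"
    # Drop commented-out lines, then pull the quoted number after "NvAGP".
    grep -v '^[[:space:]]*#' "$file" |
        sed -n 's/.*[Oo]ption[[:space:]]*"NvAGP"[[:space:]]*"\([0-9]*\)".*/\1/p'
}
```

An empty result means the option is not set in that file, in which case the driver falls back to its default.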

