Tub 11-19-06 06:22 PM

system crashes when upgrading from 6600GT to 7600GT
System: amd64 3000+ 939, PCIe, asus/nforce mobo

my trusty Asus EN6600GT board recently died, so I was forced to buy a new one, and chose an EN7600GT due to its two DVI ports.
Since then I've had stability problems - the system often locks up completely, to the point where even SysRq+s / u no longer works.
None of the problems I encountered were present with the old gfx board, and I didn't change the system configuration in between.

the system often locks up when starting a new game in UT2004, but it has also happened randomly in NWN and on several other occasions.
A less severe crash happened several times when scrolling in kolourpaint: the screen freezes and the mouse can still be moved, but the GUI no longer responds. ssh login is still possible, and dmesg says:
NVRM: Xid (0001:00): 6, PE0000 0404 ffffffff 0000fdec ffefefef 00086400

another message I managed to capture:
NVRM: Xid (0001:00): 6, PE0000 0404 ffffffff 0000fdf0 ffffffff 00086400
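
To compare the two Xid reports above without eyeballing hex, a small script can split each line into its device, Xid number and payload words and list the positions that differ. This is only a sketch: the regular expression is an assumption based on the two lines quoted here, and it says nothing about what the payload words mean.

```python
import re

# Assumed line shape, taken from the two messages above:
#   NVRM: Xid (0001:00): 6, PE0000 0404 ffffffff ...
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:]+)\): (\d+), (.*)")

def parse_xid(line):
    """Return (device, code, payload_words), or None if the line is not an Xid report."""
    m = XID_RE.search(line)
    if m is None:
        return None
    device, code, payload = m.groups()
    return device, int(code), payload.split()

def differing_words(line_a, line_b):
    """List (index, word_a, word_b) for payload positions where two Xid lines differ."""
    words_a = parse_xid(line_a)[2]
    words_b = parse_xid(line_b)[2]
    return [(i, x, y) for i, (x, y) in enumerate(zip(words_a, words_b)) if x != y]

first = "NVRM: Xid (0001:00): 6, PE0000 0404 ffffffff 0000fdec ffefefef 00086400"
second = "NVRM: Xid (0001:00): 6, PE0000 0404 ffffffff 0000fdf0 ffffffff 00086400"

print(parse_xid(first)[:2])
print(differing_words(first, second))
```

Both messages are Xid 6 on the same device; only the fourth and fifth payload words differ.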

I've tried reinstalling the drivers and using different driver versions:

nvidia-drivers-1.0.8776 -> locks up
nvidia-drivers-1.0.9629 -> locks up
nvidia-drivers-1.0.9742 -> no deadlocks during my first round of UT, but several other applications stopped working properly, so it's not an option and I didn't test further.

I've updated my kernel from 2.6.17 to 2.6.18, which didn't help.

I've tried disabling TwinView, didn't help.

I've tried creating a fresh xorg.conf without custom changes, didn't help.

Although I use a single core system, I've tried the pci=nommconf thing, which didn't help.

I've removed the ~/.nvidia-settings-rc (or whatever it was called while it still existed), which didn't help.

I set agp=off in my grub.conf (I'm on PCIe, but it seemed worth trying), but it didn't help.

I set Render = "false" in my xorg.conf, which didn't help.

It has to be a gfx driver issue, as an overnight compile session finished without problems. I'm out of ideas.

attachment as requested. As the computer completely locks up most of the time, or at least locks up the X server, this log was generated after a clean boot.

If you need me to provide more information, please let me know how.

netllama 11-19-06 06:30 PM

Re: system crashes when upgrading from 6600GT to 7600GT
I have a few questions:
0) What problems were you having with 1.0-9742?
1) Have you verified that you're using the latest BIOS for the motherboard?
2) Are you able to set up a serial console or netconsole to capture any kernel messages at the time of the crash?


whig 11-19-06 06:34 PM

Re: system crashes when upgrading from 6600GT to 7600GT
Just a guess: monitor temperatures - card getting too hot?

Tub 11-19-06 07:50 PM

Re: system crashes when upgrading from 6600GT to 7600GT

thanks for the hint with netconsole. Although netcat -u -l -p 6666 didn't capture the messages, wireshark did. After re-assembling, here it is:

Unable to handle kernel paging request at ffff81043b952454 RIP:
[<ffffffff8821c887>] :nvidia:_nv006647rm+0x35f/0x45a
PGD 8063 PUD 0
Oops: 0000 [1]
Modules linked in: netconsole nvidia
Pid: 14672, comm: ut2004-bin Tainted: P 2.6.18-gentoo-r2 #4
RIP: 0010:[<ffffffff8821c887>] [<ffffffff8821c887>] :nvidia:_nv006647rm+0x35f/0x45a
RSP: 0000:ffffffff80616be0 EFLAGS: 00010296
RAX: 00000000ffffffff RBX: ffff81003b952000 RCX: ffff81003b952450
RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000030
R10: 0000000000000020 R11: 0000000000000000 R12: 0000000000000006
R13: 0000000000000030 R14: ffff8100309f0000 R15: ffff8100309f0000
FS: 00002b2f6bc9dfd0(0000) GS:ffffffff80654000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff81043b952454 CR3: 0000000039b64000 CR4: 00000000000006e0
Process ut2004-bin (pid: 14672, threadinfo ffff810029e7a000, task ffff81002f26cb20)
Stack: ffffffff80616c50 ffff81003b952000 ffff8100309f0000 0000000000000001
0000000000000030 ffffffff8821bf69 ffff81003ae34bc0 ffffffff80616cd0
ffff81003ae34ac0 0000000000000002 ffff81003ae34000 ffffffff8823f06b
Call Trace:
<IRQ> [<ffffffff8821bf69>] :nvidia:_nv006662rm+0x9d/0xf2
[<ffffffff8823f06b>] :nvidia:_nv001020rm+0xf7/0x184
[<ffffffff8823eec5>] :nvidia:_nv001019rm+0x139/0x1e8
[<ffffffff88238e05>] :nvidia:_nv001149rm+0x4f/0xbc
[<ffffffff8823b27e>] :nvidia:_nv001104rm+0x108/0x2bc
[<ffffffff8823b606>] :nvidia:_nv001215rm+0x3e/0x7e
[<ffffffff8823980a>] :nvidia:_nv001225rm+0x3e/0x52
[<ffffffff88247ea6>] :nvidia:_nv001005rm+0x8e/0xf6
[<ffffffff88248031>] :nvidia:_nv000966rm+0x2d/0x7c
[<ffffffff881eaa62>] :nvidia:_nv003692rm+0x112/0x4fa
[<ffffffff881e132f>] :nvidia:_nv003702rm+0x8b/0xd2
[<ffffffff8802de3c>] :nvidia:_nv001806rm+0x92/0xae
[<ffffffff8803233f>] :nvidia:rm_isr_bh+0x53/0x56
[<ffffffff88279e3f>] :nvidia:nv_kern_isr_bh+0x16/0x18
[<ffffffff8022dd36>] tasklet_action+0x46/0x80
[<ffffffff8022dfff>] __do_softirq+0x4f/0xb0
[<ffffffff8020a7fc>] call_softirq+0x1c/0x30
[<ffffffff8020c22c>] do_softirq+0x2c/0x90
[<ffffffff8020c1e2>] do_IRQ+0x72/0x90
[<ffffffff80209f59>] ret_from_intr+0x0/0xa

Code: 83 7c 81 08 00 75 04 ff ca eb f3 83 fa ff b8 00 00 00 00 0f
RIP [<ffffffff8821c887>] :nvidia:_nv006647rm+0x35f/0x45a
RSP <ffffffff80616be0>
CR2: ffff81043b952454
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
can you narrow down the problem from that?
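
For anyone else who has to piece netconsole output together by hand: netconsole sends the kernel messages as plain UDP datagrams, so a few lines of Python can collect and join them on the receiving machine. This is a minimal sketch, not an explanation of why netcat missed the packets here; the function names are made up for illustration, and port 6666 is simply the port used above.

```python
import socket

def collect_datagrams(sock, max_packets=None, timeout=None):
    """Read UDP payloads from an already bound socket and return them joined as text."""
    chunks = []
    sock.settimeout(timeout)
    try:
        while max_packets is None or len(chunks) < max_packets:
            data, _sender = sock.recvfrom(65535)
            # Kernel messages are mostly ASCII; replace anything that is not.
            chunks.append(data.decode("ascii", errors="replace"))
    except socket.timeout:
        pass  # nothing more arrived within the timeout window
    return "".join(chunks)

def listen(port=6666, host="0.0.0.0", **kwargs):
    """Bind a UDP socket on host:port and collect whatever arrives there."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        return collect_datagrams(sock, **kwargs)
```

Calling listen(6666, timeout=60.0) on the receiving box before triggering the crash would return whatever arrived once a minute passes without a packet.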

I will now try a bios upgrade.

edit: performed the bios upgrade from 1001 to 1014, it didn't help. ut2k4 crashed again with a very similar message. the pid and some addresses were different, but the general message and the call trace were identical.

@whig: no, I don't think it's heat related. It doesn't just crash after x minutes of 3d-action. There are visible patterns when it crashes and when it doesn't, and they don't depend on the time I played. For example, I can usually play ut2k4 for 20 minutes without problems (usually, not always), but there's a high chance it'll freeze during the next map change.

Tub 11-20-06 09:40 AM

Re: system crashes when upgrading from 6600GT to 7600GT

as for your question 0, I reinstalled the beta drivers today to give you a more useful answer.

Non-working applications are:
wine + snes9x
wine + VBA
wine + diablo2 with the opengl renderer

native opengl-applications seem to work, 64 bit (ut2k4) or 32 bit (nwn). Other wine-applications that don't use openGL seem to work as well. The error message is always something like this:

X Error of failed request: GLXBadDrawable
Major opcode of failed request: 128 (GLX)
Minor opcode of failed request: 29 ()
Serial number of failed request: <varies>
Current serial number in output stream: <varies>

This issue has already been discussed in another thread here and in a wine bugzilla entry, but no solution has been found yet, so using the beta drivers doesn't solve my problems.

on a positive note, the beta drivers haven't locked the system yet, so I'll keep them as the lesser of two evils for now. Any help to get the non-beta drivers working is still appreciated.

Tub 11-20-06 01:18 PM

Re: system crashes when upgrading from 6600GT to 7600GT
the beta drivers still haven't crashed, but they locked up the system for a couple of seconds during ut2k4. dmesg shows the following. As dmesg lacks timestamps, I cannot tell which of these lines were printed during/after the temporary lockup and which are older.


PCI: Setting latency timer of device 0000:01:00.0 to 64
NVRM: loading NVIDIA UNIX x86_64 Kernel Module 1.0-9742 Tue Nov 7 09:45:02 PST 2006
NVRM: Xid (0001:00): 1, Channel 00000002 Method 00000000 Data 80068006
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 13, 0002 beef3097 00004097 00001efc 007f0002 00000002
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 6, PE0002 1680 023b779c 0011b714 c4f80000 00100000
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 13, 0002 beef3097 00004097 00001710 00000000 00040000
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 6, PE0002 180c 0f0c0f13 001cb0d8 026fe5d8 026fe5d8
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 13, 0002 beef3097 00004097 00001a28 37464003 00000002
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 13, 0002 beef3097 00004097 00001efc 25006000 00000002
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 1, Channel 00000002 Method 00000000 Data 31415930
NVRM: Xid (0001:00): 30, L1 -> L0
..this sounds like I'm doing nothing but playing ut2k4 all day.. oh dear.
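
Since the log has no timestamps, about the only thing that can be read off it mechanically is how often each Xid number shows up. A rough sketch of such a tally follows; the regular expression is an assumption based on the lines above, and the sample is just the first five Xid lines of that log.

```python
import re
from collections import Counter

# Assumed line shape: "NVRM: Xid (<device>): <number>, <details>"
XID_CODE_RE = re.compile(r"NVRM: Xid \([0-9a-fA-F:]+\): (\d+),")

def tally_xid_codes(dmesg_text):
    """Count how many times each Xid number appears in a chunk of dmesg output."""
    return Counter(int(m.group(1)) for m in XID_CODE_RE.finditer(dmesg_text))

sample = """\
NVRM: Xid (0001:00): 1, Channel 00000002 Method 00000000 Data 80068006
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 13, 0002 beef3097 00004097 00001efc 007f0002 00000002
NVRM: Xid (0001:00): 30, L1 -> L0
NVRM: Xid (0001:00): 6, PE0002 1680 023b779c 0011b714 c4f80000 00100000
"""

for code, count in tally_xid_codes(sample).most_common():
    print(f"Xid {code}: {count}")
```

Feeding it the whole output of dmesg instead of the sample gives the totals for everything logged since boot.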

Tub 11-20-06 04:37 PM

Re: system crashes when upgrading from 6600GT to 7600GT
becoming desperate..

found an old kororaa livecd which worked fine on my old gfx board. It's using driver version 8178. No idea if that version is just too old for the 7600GT, but this time it failed even before starting the X server.

although /dev/nvidia0 exists and the nvidia module is loaded, dmesg says:

NVRM: rm_init_adapter(0) failed
NVRM: RmInitAdapter failed! (0x40:0x0:1369)

xorg concludes:
NVIDIA:could not open the device file /dev/nvidia0 (Input/output error).

do you have any clue what might be wrong with my system? I'd prefer an honest "sorry, no idea, you're screwed" to the silence.

netllama 11-20-06 04:50 PM

Re: system crashes when upgrading from 6600GT to 7600GT
1.0-8178 didn't support your GPU. The problem with wine apps not starting in 1.0-9742 is now a known bug.

