I've been having serious stability issues with my 8800 GTS whenever I run games (e.g. World of Warcraft via Wine) under Linux. Yes, I've already tried everything suggested at http://www.nvnews.net/vbulletin/showthread.php?t=58498 and got no relief from any of it.
I suspect a hardware issue, even though the card I have installed was just returned to me via RMA from the manufacturer. When I swap in a spare video card (8600 GTS), the issues no longer occur.
I've tried the nvidia drivers included in my distribution (Ubuntu 8.04), the most recent release drivers from nvidia.com, and the most recent beta drivers, and all show exactly the same symptoms. After I've been playing for a while (anywhere from 30 minutes to several hours), X locks up solid for about 10 seconds, then recovers as if nothing had happened. This repeats every couple of minutes until X either crashes completely or hangs permanently and does not come back. I've been running system monitoring tools to verify that this is not caused by heat, and the GPU sensor never registers above 155°F/68°C, which seems to be OK according to the sources I can find.
When these hangs and crashes occur, various things get printed to my syslog. The most recent occurrence printed this:
[71225.692615] NVRM: Xid (0001:00): 6, PE0004
[71225.700282] NVRM: Xid (0001:00): 12, COCOD 00000004 beef5097 00005097 0000145c 00043458
[72079.798392] NVRM: Xid (0001:00): 8, Channel 00000004
[72153.505988] NVRM: Xid (0001:00): 12, COCOD 00000004 beef5097 00005097 0000145c 00043458
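In case anyone wants to compare against their own logs, here is a rough sketch of how the Xid lines above could be pulled out of a syslog and tallied by Xid number. The regex is only my guess at the line layout based on the four lines I pasted, and the function names (parse_xid_lines, count_by_code) are made up for this post; it makes no attempt to interpret what each Xid number means.

```python
import re
from collections import Counter

# Pattern inferred from the four log lines in this post (an assumption,
# not an official format): "[timestamp] NVRM: Xid (device): code, detail"
XID_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+NVRM: Xid \((?P<dev>[^)]+)\): "
    r"(?P<code>\d+), (?P<detail>.*)"
)

def parse_xid_lines(lines):
    """Return (timestamp, device, xid_code, detail) for each matching line."""
    events = []
    for line in lines:
        m = XID_RE.search(line)  # search() tolerates a syslog prefix before "["
        if m:
            events.append(
                (float(m["ts"]), m["dev"], int(m["code"]), m["detail"].strip())
            )
    return events

def count_by_code(events):
    """Count how many times each Xid number appears."""
    return Counter(code for _, _, code, _ in events)

# The lines from my syslog, copied verbatim from above.
SAMPLE = """\
[71225.692615] NVRM: Xid (0001:00): 6, PE0004
[71225.700282] NVRM: Xid (0001:00): 12, COCOD 00000004 beef5097 00005097 0000145c 00043458
[72079.798392] NVRM: Xid (0001:00): 8, Channel 00000004
[72153.505988] NVRM: Xid (0001:00): 12, COCOD 00000004 beef5097 00005097 0000145c 00043458
"""

if __name__ == "__main__":
    events = parse_xid_lines(SAMPLE.splitlines())
    for code, n in sorted(count_by_code(events).items()):
        print(f"Xid {code}: {n} occurrence(s)")
```

To run it against a real log instead of the pasted sample, the lines could be read from a file (for example by passing an open file handle to parse_xid_lines), assuming the kernel messages end up there in the same layout.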
I've looked around and haven't found anyone experiencing exactly the same symptoms. That, together with the 8600 working perfectly fine on the exact same drivers, leads me to believe that it's a hardware issue.
Just in case, I've attached the output from nvidia-bug-report.sh. I realize that this is kind of a long shot, but if anyone has any idea off the top of their head what this might be other than a hardware failure, I'd appreciate any suggestions. I've really come to dislike this manufacturer's RMA procedure.