Originally Posted by netllama
I have a few questions:
0) Does this problem persist with the latest RHEL-4.5 kernel?
1) Can you set up a serial console to capture any kernel messages at the time of the crash?
The serial console captured the same occasional "NVRM: Xid" messages from the nvidia kernel module that I was seeing in the system logs.
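For anyone who wants to compare the serial console capture against the system logs, here is a rough Python sketch that pulls out the lines containing "NVRM: Xid". The function name `find_xid_lines` is made up for illustration, and it only does a plain substring match, so it assumes nothing about the rest of the message format.

```python
def find_xid_lines(lines):
    """Return every line that contains an "NVRM: Xid" report.

    `lines` can be any iterable of strings, e.g. an open log file
    or a saved serial console capture.
    """
    marker = "NVRM: Xid"  # the string the nvidia kernel module logs
    return [line.rstrip("\n") for line in lines if marker in line]
```

To use it, open the log file and pass the file object in, for example `find_xid_lines(open(path, errors="replace"))`, where `path` is wherever your kernel messages end up (on RHEL that is usually /var/log/messages, but check your syslog configuration).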
Some of my crashes were caused by the card not being seated securely in the PCIe slot. The little plastic clips that some cases use to hold the cards in place are no match for the weight of the 8800GTX cards.
Moving from RHEL 4.4 to RHEL 4.5 seems to have fixed the remaining crashes.