BOINC + PrimeGrid + CUDA = Lockups
I just got CUDA working for PrimeGrid under BOINC distributed computing, and roughly every hour the machine freezes. I tried to log it with startx -- -logverbose 6, but when I run Ubuntu with that option BOINC does not show any tasks or attached projects. I SSH'ed in while it was frozen to generate a bug report.
Also, when I first got CUDA working for PrimeGrid, I had my three GTX 480s overclocked (through the BIOS) to 900 MHz. With them overclocked, the computer locked up much more often and always showed this message in kern.log:
[Hardware Error]: Machine check events logged
Now that I have reflashed the factory BIOS with stock clocks, I no longer see this message.
Nvidia-bug-report is the report I generated over SSH while the computer was frozen with the stock BIOS.
Nvidia-bug-report1 is the report generated while overclocked to 900 MHz.
I've also noticed that at stock speeds I rarely see an EQ overflow, but overclocked I get constant waits, Xids, and EQ overflows.
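To put rough numbers on that comparison, a small Python sketch like the one below could tally the messages in each log. It is only an illustration: the match patterns are assumptions based on the wording quoted in this post, and the names PATTERNS, count_events, and count_events_in_file are made up for the example, so adjust them to whatever your logs actually contain.

```python
import re
import sys

# Assumed patterns, taken from the messages mentioned in this post.
# Check them against the exact wording in your own logs before trusting the counts.
PATTERNS = {
    "machine_check": re.compile(r"Machine check events logged"),
    "xid": re.compile(r"\bXid\b"),
    "eq_overflow": re.compile(r"EQ overflow", re.IGNORECASE),
}


def count_events(lines):
    """Return how many lines match each pattern in PATTERNS."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def count_events_in_file(path):
    """Count matching lines in one log file, tolerating odd bytes."""
    with open(path, errors="replace") as handle:
        return count_events(handle)


if __name__ == "__main__":
    # Pass one or more log files as arguments, e.g. a copy of kern.log.
    for log_path in sys.argv[1:]:
        print(log_path, count_events_in_file(log_path))
```

Running it once on the logs from the stock-clock run and once on the logs from the overclocked run would give a side-by-side count instead of an impression.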
There is also a segfault for projectM in the log files. I was running it to test stability, although projectM does crash by itself from time to time.
It also seems that the 271 driver series does not like my overclock. The 270 series worked fairly reliably at these clocks (900 MHz, 2 GHz RAM).