Diagnosing lockups


jesup
07-29-06, 10:50 AM
I'm seeing freezes and lockups when using 3D modes and in video playback - in 3D games (such as Rome Total War) it tends to freeze for a few minutes then continues; in other cases (videos from YouTube/etc) it locks and never comes back.

Any good ways to diagnose the cause of the freeze/lockup? I assume (since it never locks when used in 2d mode unless playing videos) that it's videocard related, though it could be (I suppose) audio related.

AMD 3500+, Gigabyte NF4 SLI MB, 1GB, single BFG 7600GT OC (at default speeds), Creative Audigy 2 ZS. Tried NVTray and used the Nvidia CP to check how fast it would OC - it came back considerably higher than the defaults (I forget the number). I tried to underclock it to 525/700 as a test, but when I hit Test the frequencies get bumped way up.

squall_leonhart
07-29-06, 12:02 PM
What are the amp ratings on the 12V rail of the PSU?

jesup
07-29-06, 02:33 PM
> What are the amp ratings on the 12V rail of the PSU?

Antec TruePower2.0 380Watt ATX12V
+3.3V 28A
+5V 35A
+12V1 16A
+12V2 16A
+5V SB 2A
-12V 1A

GlowStick
07-29-06, 02:55 PM
I would start by running memtest86 on your ram.

The lockups during flash video (youtube) are the 'oddball' that, in my mind, points away from a video-card-related lockup.

jesup
07-29-06, 11:42 PM
Ran memtest86 for 6+ hours (20+ iterations). No errors.

Note that the Nvidia drivers get involved in video decode...

GlowStick
07-30-06, 12:45 AM
> Ran memtest86 for 6+ hours (20+ iterations). No errors.
>
> Note that the Nvidia drivers get involved in video decode...
Yeah, but flash video does not use overlays etc.; it just draws it like any other window (honestly I like it better that way), so the nvidia drivers would play as much of a part as they do when looking at anything else.

However, it definitely could be a card problem; you may want to call BFG and talk to them about an RMA.

jesup
07-30-06, 01:35 AM
No overlays in flash, but the video decoding acceleration should be active (I assume).

Unfortunately, the only easy way to verify if it's the card is to try a different one - unless someone knows any tricks.

squall_leonhart
07-30-06, 03:12 AM
Actually, both flash and shockwave are 3D accelerated now.

jesup
07-31-06, 10:41 AM
So, back to the original question - any good way to diagnose what's going on? Or even to verify that it's the video card at all? I downloaded all 500+MB of 3DMark06; I've run it once with no obvious lockups (though I wasn't watching for the entire test). Final score was 29xx or so, with no work done to minimize other things running.

jesup
08-07-06, 10:37 PM
Well, it's not looking good - I downgraded the driver to 84.48, and I still have random lockups, at least in Rome-TW and Darwinia (demo).

If I'm lucky, it recovers. If not, it may let me switch out. I did that with Rome-TW and was OK until I switched back into it; when I then tried to switch (Alt-Tab) out again, the monitor started resetting every 2-10 seconds, displaying black in between resets (sometimes on reset you could see video for a fraction of a second).

Darwinia locked every few seconds, and finally blue-screened - all before getting past the intro.

I'll keep testing it out, but it's not looking good. Any good 3D tests I can run continuously?

jesup
08-10-06, 11:20 PM
I wonder if I've found the issue with my BFG 7600GT OC:

I checked the BFG specs page; the power supply requirement is:
A 350W PCI Express compliant system power supply (with 12V current rating of 20A or more)

My case is an Antec Sonata, which has a TruePower 2.0 380W power supply:
Dual 12V rails, each 16A max

So if I read it correctly, the PS may be under-specced for the video card (even though it has 32A of 12V in total). I never would have imagined that. Or do you add them? Lots of other reviews I've seen imply that you do add them (i.e. both are used to supply different pins of the PCI-e slot).
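
The two readings of the spec can be put side by side with a few lines of arithmetic. This is only a sketch: the rail figures are the ones from the PSU label posted earlier and the 20A figure is from the BFG spec page, but the function name `check_12v` is made up for illustration, and simply summing the rails assumes the supply can deliver both rail maximums at the same time, which the label alone does not confirm.

```python
# Rail ratings as posted from the PSU label earlier in the thread.
RAILS_12V_AMPS = {"+12V1": 16.0, "+12V2": 16.0}
# 12V current requirement quoted from the BFG spec page.
REQUIRED_12V_AMPS = 20.0


def check_12v(rails, required):
    """Return (single_rail_ok, combined_ok) for a 12V current requirement.

    single_rail_ok: the strongest individual rail meets the requirement.
    combined_ok:    the plain sum of all rails meets the requirement
                    (assumes both rails can run at their maximum at once).
    """
    strongest = max(rails.values())
    combined = sum(rails.values())
    return strongest >= required, combined >= required


single_ok, combined_ok = check_12v(RAILS_12V_AMPS, REQUIRED_12V_AMPS)
print(f"strongest single rail meets {REQUIRED_12V_AMPS:.0f}A: {single_ok}")
print(f"sum of both rails meets {REQUIRED_12V_AMPS:.0f}A: {combined_ok}")
```

Which of the two results matters depends on whether the 20A requirement is meant per rail or for the 12V supply as a whole, which is the question being asked here.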

Renzo
08-11-06, 12:01 AM
> I wonder if I've found the issue with my BFG 7600GT OC:
>
> I checked the BFG specs page; the power supply requirement is:
> A 350W PCI Express compliant system power supply (with 12V current rating of 20A or more)
>
> My case is an Antec Sonata, which has a TruePower 2.0 380W power supply:
> Dual 12V rails, each 16A max
>
> So if I read it correctly, the PS may be under-specced for the video card (even though it has 32A of 12V in total). I never would have imagined that. Or do you add them? Lots of other reviews I've seen imply that you do add them (i.e. both are used to supply different pins of the PCI-e slot).
You read it wrong. Those amperage ratings are OK (totalling 32A as opposed to the requirement of 20A) even though the current is divided between two rails.

What CPU do you have? (Edit: ah, I see now. Which codename: Newcastle, Winchester, Venice, San Diego?) If memtest86+ (this one (http://www.memtest.org/)) didn't fail, I'd start checking it with prime95 (http://www.mersenne.org/freesoft.htm) using max heat/power consumption mode for over an hour. If neither memtest nor prime95 generates errors/freezes, then your system is "probably" stable and other things might be causing the problem.

There are some CPU bugs in certain A64 steppings that might cause freezing if the BIOS workaround isn't used (errata 94 and errata 123).

jesup
08-11-06, 07:17 AM
From SiSoft Sandra:

Model : AMD Athlon(tm) 64 Processor 3500+
Speed : 2.21GHz
Model Number : 3500 (estimated)
Performance Rating : PR3317 (estimated)
Generation : G8
Name : M2F-DH Athlon 64 (K8 Venice/San Diego) 90nm 1.8-2.8GHz 1.45-1.55V
Revision/Stepping : 2F / 2 (10D)
Stepping Mask : DH-E6
Core Voltage Rating : 1.400V
L2 On-board Cache : 512kB ECC Synchronous, Write-Back, 16-way set, 64 byte line size

Renzo
08-11-06, 08:31 AM
The E6 stepping doesn't have either of those bugs, so the problem must be somewhere else, perhaps temps or even the BIOS.

jesup
08-11-06, 08:41 AM
Temps are fine - CPU typically around 30-35C when these lockups occur. GPU temp is fine, around 50C normally, probably no higher than 55C. GPU fan stays in silent mode unless I start a 3D app.

BIOS is (I think) F10, which is the latest for that MB (Gigabyte NF4 SLI). Bought new about 3 months ago at Micro Center. I switched back to driver 91.31 last night (haven't tried 3D apps yet).

Is there a good sample stress-test to run to try to show for sure where the problem lies? Games aren't always the best since they might have bugs or interact poorly with a new driver. BFG has a lifetime warranty if it's their problem (and not an NVidia driver bug).

The lockup symptoms (the screen freezes but audio continues for a while or a long while, then everything un-freezes), plus the fact that last time I could Alt-Tab out (and the CPU monitor showed that CPU use had dropped to ~0 at the time the freeze started), indicate to me that it's not the processor - it's the game(s), the driver, or the GPU.
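
One way to put that reasoning on firmer ground is a background "heartbeat" that records timestamps while the game runs. If it keeps ticking through a freeze, the CPU and OS were alive and the display path is the better suspect; if it shows gaps, the whole machine stalled. The sketch below is an assumption about how one might do it, not something anyone in this thread ran; `record_heartbeat` and `find_stalls` are made-up names and the sample timestamps are invented.

```python
import time


def record_heartbeat(duration, interval=1.0):
    """Sample a monotonic clock every `interval` seconds for `duration` seconds."""
    stamps = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        stamps.append(time.monotonic())
        time.sleep(interval)
    return stamps


def find_stalls(timestamps, interval=1.0, tolerance=2.0):
    """Return (start, gap_seconds) for each gap longer than interval + tolerance."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            stalls.append((prev, gap))
    return stalls


# Invented sample: one tick per second, with an 18-second hole after t=2.
sample = [0.0, 1.0, 2.0, 20.0, 21.0]
for start, gap in find_stalls(sample):
    print(f"heartbeat stalled for {gap:.1f}s starting at t={start:.1f}")
```

In practice `record_heartbeat` would run in a separate console for the length of a play session, and the freeze times noted by eye would be compared against whatever `find_stalls` reports.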

Renzo
08-11-06, 12:26 PM
> Is there a good sample stress-test to run to try to show for sure where the problem lies? Games aren't always the best since they might have bugs or interact poorly with a new driver. BFG has a lifetime warranty if it's their problem (and not an NVidia driver bug).
I have already given links to memtest86+ and prime95 for testing the memory/motherboard and the CPU itself.

I guess you can use RTHDRIBL (http://www.daionet.gr.jp/~masa/archives/rthdribl_1_2.zip) for stressing the GPU to the maximum heat/power consumption.

> The lockup symptoms (the screen freezes but audio continues for a while or a long while, then everything un-freezes), plus the fact that last time I could Alt-Tab out (and the CPU monitor showed that CPU use had dropped to ~0 at the time the freeze started), indicate to me that it's not the processor - it's the game(s), the driver, or the GPU.
I have actually bumped into problems like this. With a 6800GT/VIA KT800 Pro I had similar problems very rarely (once in 3 months or less): the screen would freeze for 15-20 seconds as if the system had hung totally, but it would continue to run after that short period.

-> Reason unknown.

The same thing happened to me quite often with a 7800GTX/DFI NF4 Ultra-D, and it was clearly due to the GTX dropping its core clocks, since performance degraded by a certain level. I installed a VF900-Cu on the GTX and I've seen zero hangs like that since.

-> Problem solved, that's why I thought about temps in the first place.

jesup
08-20-06, 10:03 AM
> I have already given links to memtest86+ and prime95 for testing the memory/motherboard and the CPU itself.
>
> I guess you can use RTHDRIBL (http://www.daionet.gr.jp/~masa/archives/rthdribl_1_2.zip) for stressing the GPU to the maximum heat/power consumption.

I tried that. Heated up the GPU (not that much though; circa 60C). No problems at all. Pulled the Audigy2ZS - it tended to not lock up as fast, but it still locked up. (Note: case was now open, but idle GPU heat only dropped about 2, maybe 3C.) Lockups were random/rare in things like Half-life, playing videos, etc, but would happen FAST in the Darwinia demo.

However, after FINALLY getting Nvidia control panel to let me modify the GPU clock settings (had to install NTune before it worked), I found that down-clocking the 7600GT to "standard" clock rates (it's a BFG 7600GT OC) solved the problem. Darwinia now runs with 0 lockups for long periods. My guess is that it stresses a different pathway on the chip that's not hit much by other 3d engines - probably also why the chip passed BFG's QA.
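
For anyone wanting to narrow down where a card stops being stable rather than just confirming that stock clocks work, the search can be organised as a bisection. This is a rough sketch under two assumptions: the low clock is already known to be stable, and anything that fails at one clock also fails at every higher clock. `highest_stable_clock` and `is_stable` are hypothetical names; in real use `is_stable` would be a manual run of whatever reproduces the lockup fastest (the Darwinia demo in this case), and the MHz values below are placeholders, not this card's actual clocks.

```python
def highest_stable_clock(low, high, is_stable, step=5):
    """Bisect for the highest clock in [low, high] that passes is_stable.

    Assumes `low` itself is stable and that instability is monotonic
    (a clock that fails implies every higher clock fails too).
    Candidates are kept on a grid of `step` MHz above `low`.
    """
    best = low
    lo, hi = low + step, high
    while lo <= hi:
        # Midpoint rounded down onto the step grid.
        mid = lo + ((hi - lo) // (2 * step)) * step
        if is_stable(mid):
            best = mid
            lo = mid + step
        else:
            hi = mid - step
    return best


# Placeholder stand-in for a real test run: pretend anything up to 560 MHz passes.
def fake_is_stable(clock_mhz):
    return clock_mhz <= 560


print(highest_stable_clock(500, 600, fake_is_stable))
```

Each call to `is_stable` stands for a full test session, so the point of bisecting is to keep the number of sessions small compared with stepping up 5 MHz at a time.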

Thanks for the help everyone. Time to get a replacement for the BFG... luckily they have lifetime warranties.

weevil
08-22-06, 06:17 AM
Flatscreen? Force the refresh rate to 60Hz. I had these intermittent pauses and found 75Hz set, which is not good. Yes, it sounds weird, but in desperindis desperado... :)

jesup
08-22-06, 11:07 PM
> Flatscreen? Force the refresh rate to 60Hz. I had these intermittent pauses and found 75Hz set, which is not good. Yes, it sounds weird, but in desperindis desperado... :)

Nope, 20" NEC FE1250 CRT. I normally use 1280x1024 at 85Hz; sometimes 1024x768 in 3D.

SlowJoeBlow
08-25-06, 06:40 PM
Might be an errant codec... Ya got any of those codec packs installed?