6800GT artifacting; advice needed


myshkinbob
04-24-05, 02:21 PM
I've just had a bit of a scare with my graphics card, so i could do with some advice.

Firstly, does anyone know of a good program for testing out the integrity of graphics ram? At the moment i'm using doom3's timedemo demo1 at ultra quality, 1600x1200 4xAA, on the assumption that it uses every byte of graphics ram available.
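
For a rough sense of scale, the size of a single multisampled colour buffer at that setting can be estimated as below. This is only a sketch: it assumes 32-bit colour (4 bytes per pixel) and that 4xAA stores four samples per pixel, and the function name is made up for illustration. Textures, depth buffers and driver overhead are not counted, so on its own it says nothing about whether every byte of the card's memory actually gets touched.

```python
def color_buffer_bytes(width, height, bytes_per_pixel=4, samples=1):
    """Estimate the size of one colour buffer in bytes (illustrative only).

    Assumes every pixel stores `samples` copies of `bytes_per_pixel` bytes,
    which is a simplification of how real drivers lay out AA buffers.
    """
    return width * height * bytes_per_pixel * samples


# 1600x1200 with 4 samples per pixel, 32-bit colour (assumed).
size = color_buffer_bytes(1600, 1200, samples=4)
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

With those assumptions one colour buffer comes to 30,720,000 bytes, or about 29.3 MiB.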

Secondly, i need to decide if i should RMA my card, my warranty is up in 3 months time.

Here's the situation. My pc wakes itself using the power management alarm in the bios every morning, so windows can load and run a scheduled task to play some tunes, that's my alarm. Well today no music came on, but the pc was on, so i switched on my monitor and wiggled the mouse to wake the VGA output and see what was going on. Except the monitor still said no signal present.

So i hit reboot, and my leadtek bios came up and counted through the graphics ram, then the system bios appeared, then windows started loading. No artifacting at all. At the point you should see the welcome screen, vga signal drops out.

Another reboot, this time i went to boot linux instead, latest kernel, latest forceware. The console appears clean without artifacts, but when xfree86 loads and the forceware drivers initialise, i get an incredibly garbled screen. Artifacting would be putting it too mildly, it was like nothing i've seen before. Imagine 1600x1200 of utterly random pixels.

So i'm worried, i'm thinking one of the GDDR3 ram chips on my card has gone completely. RMA time.

Another reboot and back to windows, this time with VGA mode enabled via the F8 boot menu. I actually get the welcome screen this time, get to my desktop and it asks me to resize to 800x600, and i do. All still fine, not a hint of an artifact. I notice the forceware drivers have loaded anyway, and i'm not really in VGA mode. My usual desktop resolution is 1600x1200, so i started increasing the res slowly, up to 1280x960 is good. At 1280x1024 i get some slight mis-rendering of text on the screen, for icons etc. Attempting 1600x1200, bang, massive corruption briefly, then vga signal drops out.

I'm starting to wonder if one of the ramdacs has crapped out. I've seen video ram go bad in the past, and usually you get ASCII artifacting at the bios screen as well as pixel artifacts within windows. This seems entirely resolution dependent.

another reboot into vga mode for windows. I raise the resolution to 1280x960, and give rthdribl a run, and it runs, but i get some strange artifacting on my screen, areas of pink and green like an overlay. So i don't think it's the ramdac after all, it's looking more like the ram. Closing rthdribl immediately removes all display corruption.

For the last 9 months this card has run at 400/1100 clocks without any problems. In the forceware drivers, i enabled overclocking and dropped the ram speed to the stock 1000mhz DDR, gave rthdribl a run and no problems. Bumped the desktop res to 1600x1200, no problems. Another run of rthdribl, no problems either. Thought about how i might test out all of the ram, and did the doom3 thing i mentioned above. Raised the mem speed to 1100 again, and ran rthdribl, and got bigtime corruption, but closing rthdribl immediately cleared any corruption again.

I ran the autodetect frequencies tool in the forceware. Not something i trust a great deal, as it usually peaks out at something silly like 460/1250. Now it only reaches 405/1102. Dropping the ram speed to 1050 i get no problems with rthdribl or doom3 on those ram intensive settings.

So the problem appears to be fixed. But what does everyone think? does this happen, where ram suddenly decides it doesn't like to overclock as much as it used to? Or is this a sign that one of the chips is failing, and i should take up my warranty while it's still available to me?

The card is quite modified, at least superficially, and i'd have to replace the original HSF assembly on it, meaning HSF removal. Also i believe leadtek's RMA turnaround time is about 3 weeks at the shortest, so i'm not keen to have to do that. Especially if they send it back just because it doesn't have the original rubbish thermal paste on it.

I've never really pushed my ram to its limits of overclockability, i know it used to be able to do 1200 without errors, but i never saw the need to run it so high. 1100 seemed quite modest, and about average for a 6800GT.

Anyway i know this is a bit of a read, but anyone's thoughts on what i should do about this is appreciated. Thanks in advance.

Mudcrutch
04-24-05, 02:26 PM
The card is quite modified

sounds like the overclocking really messed things up..

did you expect to have no problems? :D

myshkinbob
04-24-05, 02:32 PM
Well i say quite modified, it has a different set of fans on it. There are a few other things that i can't really go into right now. But nothing regarding the ram. This is definitely a ram problem, and i haven't done anything "extreme" with the ram on the card. Like i said, it's been at the same ram settings since i bought it last summer, i've not even attempted to push the ram higher in the last 9 months.

Auswak
04-24-05, 02:44 PM
This is one of those situations: when you go outside of NVIDIA specs and overclock, you shorten the lifespan of your card.

:o

myshkinbob
04-24-05, 03:06 PM
Sigh :)

I don't want to be rude, really, but i think i'm going to have to add this. I'm well aware i ran the card out of spec. I'm well aware that has technically voided the warranty already. But i didn't get this card and overclock the crap out of it until it failed, i've never even pushed the ram hard enough to see it artifact at its physical limits, ever.

I can see this thread turning into some moral debate over the rights of leadtek not to receive cards on warranty after you overclock. If you're too tempted to post about that, make another thread and have your say there.

The fact is the majority (though not all) card owners overclock a little, at one time or another. That's just a fact, most people on here overclock a little, it's about being sensible about it. If you run the cards at silly speeds, of course you're asking for trouble. And if it breaks, they'll probably warranty return it anyway. If it doesn't break, people often still sell on a card after they've had their play with it and found its limitations. Maybe that can go in another thread, the morality of selling a once overclocked card to someone else.

I haven't pushed the ram well beyond spec, it was a 10% overclock, that appeared stable for thousands of hours of operation. I didn't alter my ram timings. I certainly never made any physical mods to the PCB, any kind of crazy VDDR mod you see on certain forums.
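
As a quick check on the numbers in this thread, the memory overclock can be worked out as a percentage over the stock 1000mhz DDR mentioned earlier. This is just arithmetic; the function name is made up for illustration, and only the memory speeds quoted in the thread are used.

```python
def overclock_percent(stock_mhz, actual_mhz):
    """Return how far actual_mhz is above stock_mhz, as a percentage."""
    return (actual_mhz - stock_mhz) * 100.0 / stock_mhz


STOCK_MEM_MHZ = 1000  # stock memory speed quoted in the thread

# 1050 = the speed that now runs clean, 1100 = the long-term setting,
# 1200 = the highest speed reported as error-free in the past.
for mem_mhz in (1050, 1100, 1200):
    pct = overclock_percent(STOCK_MEM_MHZ, mem_mhz)
    print(f"{mem_mhz} MHz: +{pct:.1f}% over stock")
```

So 1100 works out to +10%, 1050 to +5%, and the old 1200 ceiling to +20%.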

The point is i didn't make this thread to discuss the right and wrong of the warranty return. I'm looking for some technical insight, as to why the ram still operates perfectly at a very small difference in speed, after working so well for a long time at a certain speed. And what significance that has toward its future operation. For a while i thought this forum was for enthusiasts to help each other with different problems and discuss the cards themselves. Share a bit of knowledge and a general interest in the hardware. Not somewhere where people get jumped on for doing anything that isn't in the instruction manual.

[/rant]

Excuse the bad mood. :)

If this turns into a thread about the long term effects of modest overclocking, that's cool. I assume i'm not the only person with experience of that, and sharing that kind of information on here, well it can only be useful for other people in future. :)

ATiMan
04-24-05, 04:19 PM
Perhaps the cooling isn't efficient: the card's fan may have lost its maximum rpm and need replacing, cleaning or oiling.
Even if the ram chips have deteriorated due to heat, how are you gonna rma the card if the problem appears only when overclocking..

myshkinbob
04-24-05, 04:30 PM
that's a good point, i hadn't thought of that, that at stock speed the problem isn't apparent, so the RMA wouldn't work. :o

One of the first things i checked was the fans, they seem to be working fine. And core/ambient temps seem the same as usual.

thanks for the input though. You might actually be right, i neglected to check if the fan was spinning when i first found the system without vga signal. It's possible the fan failed to start at 5V on this morning's initial boot, and temperatures crept up causing slight damage to the ram. The core and ram share the same heatsink, and i know the core puts out an awful lot more heat than the ram itself.

sandeep
04-25-05, 07:46 AM
So it turns out all of a sudden your card decides not to run at 400/1100. Very interesting. It seems to have a brain of its own. It doesn't even like higher resolutions when overclocked. This must be so annoying.

Wonder if some transistors are dead and failed to operate at the given speed?

I think you might want to reflash its BIOS again.

myshkinbob
04-26-05, 04:30 PM
I thought i'd update this thread in case anyone gets this same problem. It's fixed now. The problem wasn't dying video ram. I started getting screen corruption even at default video ram speeds today, which was worrying. I'd also noticed the driver's internal tests cleared the video ram at the same speeds as before i started getting this corruption problem. Then i remembered something.

It's one of two things. Firstly i'd had to remove the card recently, from the agp slot, and it's possible that when i put the card back, a bit of dust got caught between one of the contacts in the slot.

But more likely it's the agp slot voltage. For the last 6 months i'd run the agp slot at +0.1v, because my mainboard undervolts everything a little at default. The other day i had to clear my mainboard cmos, and i'd forgotten to put the agp voltage back to 1.6v.

Reseating the card and putting the Vagp back at +0.1v in my mainboard bios seems to have fixed the massive corruption issues.

I found an interesting way to test video ram integrity. The leadtek utility, winfox, has a small app that informs you of graphics memory usage. So i ran that, and two instances of rthdribl, resizing the two windows until all video ram was in use. Then i started bumping up my video ram speed until the internal driver test failed. Here's a screenshot.

http://www.vapulus.com/vramfull.jpg

Then i left that running for an hour, letting it load in several of the different scenes it goes through. All clear.
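
The step-up procedure described above can be sketched roughly as follows. The `passes_test` callback is a hypothetical stand-in for whatever check you use (the driver's internal test, or watching for corruption in rthdribl); no real driver or winfox API is being called here, and the clock values in the example are just ones mentioned in this thread.

```python
def find_max_stable_clock(start_mhz, step_mhz, limit_mhz, passes_test):
    """Raise the clock in steps and return the last speed that passed.

    passes_test is a caller-supplied function (hypothetical here) that
    takes a clock in MHz and returns True if the memory test was clean.
    Returns None if even the starting clock fails.
    """
    last_good = None
    clock = start_mhz
    while clock <= limit_mhz:
        if not passes_test(clock):
            break  # first failure: stop and keep the previous good speed
        last_good = clock
        clock += step_mhz
    return last_good


# Simulated card for illustration only: pretend anything up to 1102 MHz
# passes, echoing the figure the autodetect tool reported earlier.
def fake_test(mhz):
    return mhz <= 1102


best = find_max_stable_clock(1000, 10, 1250, fake_test)
print(f"highest clean memory clock in this simulation: {best} MHz")
```

In practice each step would need a long soak at full video ram usage, like the hour-long run above, since a quick pass at a given speed doesn't prove it is stable.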

So if anyone else's card suddenly develops major screen corruption, this might be something to check before you do an RMA. Check the slot is clean, reseat the card, and double check your mainboard bios AGP settings to make sure nothing is different from before the problem occurred.

Thanks everyone for the replies and trying to help me figure this problem out. :)

smthmlk.
04-26-05, 05:55 PM
The fact is the majority (though not all) card owners overclock a little, at one time or another.

I think you meant to say "a majority of card owners on nvnews.net / rage3d / [SOMEOTHERENTHUSIASTWEBSITEHERE] overclock a little at one time or another". I think it's safe to say a good majority of card owners in general do not touch their gpu/memory speeds. I, for one, do not overclock video cards; they run way too hot at stock speeds with their poor stock cooling solutions, and are generally too delicate for my tastes. But that's just me :)

Other than that, your situation is unfortunate and I hope you find a fix for your card or get a working one. Good luck!

*EDIT* woops, i see you fixed it :) Nice