Inquirer: Nvidia G92s and G94 reportedly failing


josiahsuarez
08-12-08, 04:48 PM
this is the first I've heard of desktop boards failing. FWIW my 9600GT G94 is still working perfectly. file this under FUD for now...

http://www.theinquirer.net/gb/inquirer/news/2008/08/12/nvidia-g92s-g94-reportedly
NVIDIA IS IN DEEP trouble over the defective parts problem, and from what we're being told, this is only the tip of the iceberg. NV still insists on stonewalling and spinning because the cost of owning up to the problem could very well sink the company.

If you haven't been following the story, the short version, up till now, is that all G84 and G86 chips are bad. Nvidia is blaming everyone under the sun, but denying they have any hand in the failures. While this may sound plausible, technical analyses by people intimately involved in the requisite semiconductor technologies tell The INQ that it is a bunch of bull: NV simply screwed up. Badly. If it was a problem with the suppliers, NV would not be paying out more than the chip cost, much less gagging OEMs: it would simply be passed along.

In any case, the official story is that there was a small batch of parts given only to HP that went bad. That was comprehensively proved wrong when Dell, Apple, Asus, Lenovo and everyone else under the sun also had problems. NV AR recalled the parts and recanted the story about it only being an EOL test run. Bad fibbers, no cookie. They still stuck to the story about it being only laptop parts, and that it was under control.

If you think it is under control now, the following is part of an email sent Monday by a very tech-savvy reader. "We just got our first casualty from the Nvidia mobile graphics [expletive deleted]. Laptop used by one of our senior engineers started acting up this past weekend. Won't boot except in SAFE mode. Called Dell, they tried a few things, gave up, stated it was the graphics module, and said that because they were SO swamped dealing with that issue, they were just going to send a completely new laptop!"

There are two messages here which have echoes in earlier emails received over the past few weeks. First is that Dell is replacing full laptops over this, contrary to what they claim (read the comments here and here for more). The second is that the small 'under control' problem is far from that. If they had a handle on it, they would not be so far behind and drowning in backorders. Anyone want to bet Dell isn't going to get stuck with the bill here?

To make matters more laughable, the fix that NV is forcing on Dell, HP and everyone else does not fix the problem, it simply makes it less likely to occur during the warranty period. With HP now offering an extended warranty period, and Dell looking likely to do the same, this will only multiply the cost. Add in the fact that Nvidia is sending out defective parts as replacements (there are no good ones), and you have a recipe for a long and expensive tale.

That is where we stand now - NV is simply stonewalling everyone and the costs are adding up. How adult of them. The question of why still remains though, and with another little tidbit of information, it becomes quite clear. There was a digitimes article on July 25, here if you are a subscriber, that said: "Due to Nvidia not clearly explaining the details of the faults reported in its notebook GPUs, some channel vendors have demanded graphics card makers issue a recall for desktop-based discrete graphics cards using the same GPU core, according to sources at graphics card makers."

Reading that, it sounds a mite odd: why would Nvidia keep the partners in the dark like that? They have to be told what the real story is for business reasons, right? When you see stories like these, it is very likely that they are not what they seem, and that the story is simply a nice face-saving Asian 'hello' applied with a backhand.

A little digging revealed what this, and more, is all about, and it's far uglier than just the 'notebook' version. It seems that four board partners are seeing G92 and G94 chips going bad in the field at high rates. If you know what failures look like statistically, they follow a Poisson distribution, aka a bell curve. The failures start out small, and ramp up quickly - very quickly. If you know what you are looking for, you can catch the signs early on. From the sound of the backchannel grumblings, the failures have been flagged already, and NV isn't playing nice with their partners.
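
A side note on the statistics above: a Poisson distribution is not the same thing as a bell curve. It is skewed when the mean count is small and only approaches the symmetric normal shape as the mean grows. The Python sketch below illustrates this. It assumes nothing about Nvidia's actual failure data; the mean values 2 and 100 and the helper names are made up for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson count with mean lam, computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of the normal (bell curve) distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def poisson_skewness(lam: float) -> float:
    """Skewness of a Poisson distribution; a true bell curve has skewness 0."""
    return 1.0 / math.sqrt(lam)


def relative_gap(lam: float) -> float:
    """Largest |Poisson - normal| difference, as a fraction of the Poisson peak.

    The normal curve is given the same mean and variance (both equal to lam).
    """
    upper = int(lam + 10 * math.sqrt(lam)) + 1
    pmf = [poisson_pmf(k, lam) for k in range(upper + 1)]
    gap = max(abs(p - normal_pdf(k, lam, lam)) for k, p in enumerate(pmf))
    return gap / max(pmf)


if __name__ == "__main__":
    for lam in (2.0, 100.0):  # hypothetical mean failure counts
        print(f"mean={lam:g} skewness={poisson_skewness(lam):.3f} "
              f"relative gap to bell curve={relative_gap(lam):.3f}")
```

The skewness and the gap both shrink as the mean grows, which is why the two shapes get conflated. Whether field failures of a chip follow either shape is a separate question that this sketch does not address.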

Why wouldn't they? Well, the G92 chip is used in the 8800GT, 8800GTS, 8800GS, several mobile flavours of 8800, most of the 9800 suffixes, and a few 9600 variants just to confuse buyers. The G94 is basically only the 9600GT. Basically we are told all G92 and G94 variants are susceptible to the same problem - basically they are all defective. Any guesses as to how much this is going to cost?

From the look of it, all G8x variants other than the G80, and all G9x variants are defective, but we have only been able to get people to comment directly on the G84, G86, G92 and G94, and all variants thereof. Since Nvidia is not acknowledging the obvious G84 and G86 problems, don't look for much word on this new set either - if they can bury it, it will drop their costs.

In the end, what it comes down to is that the problem is far bigger than they are admitting, and crosses generational lines, process lines, and OEM lines. Nvidia is quick to point the finger at everyone but themselves, but after a while, the facts strain those cover stories well past breaking point. There is a common engineering failure here - this problem is far too widespread for it to be anything else. The stonewalling, denials and partner gagging is simply a last-ditch attempt at wallet covering.

With OEMs extending warranties, Nvidia is going to have to cover a lot of laptops for a long time. Desktop boards are going bad as well now, contrary to the statements of Nvidia PR and AR, and the hole keeps getting deeper and deeper. I wonder if they can ever come clean and survive. µ

lightman
08-12-08, 05:08 PM
TheInq.

/thread

CaptNKILL
08-12-08, 05:31 PM
I haven't heard of anyone having problems with 9 series cards.

Runningman
08-12-08, 05:36 PM
hmmm, i'm willing to bet someone has a bunch of "put" options and "shorts" placed on Nvidia.....

npras42
08-12-08, 05:54 PM
I haven't heard of anyone having problems with 9 series cards.

Soon as I saw this thread title, the question I was going to ask was how many people here have had a G92/G94 or know of someone who's had G92/G94 fail?

slaWter
08-12-08, 06:34 PM
I haven't heard of anyone having problems with 9 series cards.

Same here. This is BS.

bacon12
08-12-08, 07:24 PM
I never believe anything the Enq says about Nvidia.

jcrox
08-12-08, 10:29 PM
The Inq :thumbdwn:

nekrosoft13
08-12-08, 11:30 PM
wow inq again. ****ing loser writes another bs article.

josiahsuarez
08-12-08, 11:36 PM
Inq seems to be the only site running this story atm, and I'm not willing to believe it based on their word alone. apparently this Charlie guy from Inq has a serious bee in his bonnet about nvidia.

killahsin
08-13-08, 04:48 PM
i own multiples of both and neither has failed. i own 2 9600 gt's and 2 9800 gtx's.

Q
08-13-08, 05:25 PM
I don't think I've read one thread at this forum with a hardware defect concerning these cards. And this would be the FIRST place I would expect to hear it.

buicks suck
08-13-08, 05:50 PM
This story is bs. My 8800 gt is fine; and you can tell this report is bs from reading the first line.

ViN86
08-13-08, 06:09 PM
let's make a thread to test this. ill start it.

ViN86
08-13-08, 06:13 PM
vote here:
http://www.nvnews.net/vbulletin/showthread.php?t=117797

Butter Bandit
08-13-08, 06:14 PM
I don't think I've read one thread at this forum with a hardware defect concerning these cards. And this would be the FIRST place I would expect to hear it.

+1

If there were failure rates this high going on, we'd already know...and this is the first I've heard of it.

ViN86
08-13-08, 06:17 PM
that's why i started a thread to vote:

http://www.nvnews.net/vbulletin/showthread.php?t=117797

it will dispel this POS article.

namuk
08-13-08, 06:22 PM
thought this was only based on the mobile gpu, not the card. i have the 8800gt and that is ok

mtl
08-13-08, 07:00 PM
the guy's a jealous loser who needs to get a life.

MikeC
08-14-08, 11:03 AM
NVIDIA's response:

Myth 1 - NVIDIA has denied responsibility for the failures and is blaming suppliers and partners.

In our announcements we accept responsibility for the failures. We DO call out the material failure, but we also acknowledge that our suppliers and notebook designs are contributing factors, because this is true and we need to disclose this in our official statements to the SEC. We would not go on record with the SEC making such bold claims if they weren’t true. See our Form 8-K statement below.

Myth 2 – There is an “official story” that the problems were limited to a batch of a few bad parts for HP.

We have never issued a statement like this. See our public statements below.

Where is the source for that?

Myth 3 – NVIDIA is forcing a fix on notebook makers

The idea that a supplier like NVIDIA can dictate a fix to the world’s largest PC makers is preposterous.
The truth is the notebook makers are determining their own course of action and we are supporting them.

Where is the source for that?

Myth 4 – NVIDIA is trying to cut its financial liability.

We put aside $200M to help partners solve this problem for consumers. As far as we know NVIDIA is the first and only chip maker to help fund the cost for repairs.


Myth 5 – This affects desktop chips, G92, G94, etc.

We have only seen this problem on notebooks. We just reiterated this during an official financial call. Once again we would not say this if it wasn’t true. Note we have not disclosed the specific GPUs, but we have stated that this impacts previous generation GPUs and that current gen GPUs in production are not affected.

Fact

Charlie has an obvious bias against NVIDIA and he has no sources to back up his claims. Out of all of the hundreds upon hundreds of notebook models designed with NVIDIA chips in the last few years, only a small number have experienced the problem. Within this small number of models, only a small percentage actually experience the chip failure. It is highly unlikely that a notebook user will experience the problem. And we have never seen this problem on desktop.

Official communication about the notebook chip material failures.

Quarterly Business Update Press Release – July 2
http://www.nvidia.com/object/io_1215037160521.html

“Separately, NVIDIA plans to take a one-time charge from $150 million to $200 million against cost of revenue for the second quarter to cover anticipated warranty, repair, return, replacement and other costs and expenses, arising from a weak die/packaging material set in certain versions of its previous generation GPU and MCP products used in notebook systems. Certain notebook configurations with GPUs and MCPs manufactured with a certain die/packaging material set are failing in the field at higher than normal rates. To date, abnormal failure rates with systems other than certain notebook systems have not been seen. NVIDIA has initiated discussions with its supply chain regarding this material set issue and the Company will also seek to access insurance coverage for this matter.”

Form 8-K – July 2
http://www.secinfo.com/d14D5a.t4ehp.htm

“On July 2, 2008, NVIDIA Corporation stated that it would take a $150 million to $200 million charge against cost of revenue to cover anticipated customer warranty, repair, return, replacement and other consequential costs and expenses arising from a weak die/packaging material set in certain versions of our previous generation MCP and GPU products used in notebook systems. All newly manufactured products and all products currently shipping in volume have a different and more robust material set.

The previous generation MCP and GPU products that are impacted were included in a number of notebook products that were shipped and sold in significant quantities. Certain notebook configurations of these MCP and GPU products are failing in the field at higher than normal rates. While we have not been able to determine a root cause for these failures, testing suggests a weak material set of die/package combination, system thermal management designs, and customer use patterns are contributing factors. We have developed and have made available for download a software driver to cause the system fan to begin operation at the powering up of the system and reduce the thermal stress on these chips. We have also recommended to our customers that they consider changing the thermal management of the MCP and GPU products in their notebook system designs. We intend to fully support our customers in their repair and replacement of these impacted MCP and GPU products that fail.

We have begun discussions with our supply chain regarding reimbursement to us for some or all of the costs we have incurred and may incur in the future relating to the weak material set. We will also seek to access our insurance coverage. We continue to not see any abnormal failure rates in any systems using NVIDIA products other than certain notebook configurations. However, we are continuing to test and otherwise investigate other products. There can be no assurance that we will not discover defects in other MCP or GPU products.”

Press statement – emailed to press July 15


“NVIDIA’s highest priority is to ensure complete satisfaction and delight for all of our customers. We fully stand behind our products and are cooperating with our partners to resolve the recently announced notebook field failure issue.

Please remember the following:

1) The issue is limited to a few notebook chips only; we have not seen and don't expect to see this issue on any NVIDIA-based desktop systems.

2) Only a very small percentage of the notebook chips that have shipped are potentially affected, and the problem depends on a combination of environmental conditions, configuration and usage model.

3) We continue to work closely with our partners and have taken the necessary steps to ensure that all NVIDIA chips currently in production do not exhibit the problem.

As a result, it is very unlikely that your NVIDIA-based notebook product is affected.”

Financial call transcript – August 13

http://seekingalpha.com/article/90644-nvidia-f2q09-qtr-end-7-27-08-earnings-call-transcript?source=yahoo&page=-1


Jen-Hsun Huang

“We also noted that we would be taking a non-recurring charge against cost of revenue to cover anticipated customer warranty, repair, return, replacement and other associated costs resulting from a weak die/packaging material set in certain previous generation GPU and MCP products used in notebooks. Although the failures are only seen in a small percentage of all the chips we shipped with this material set, the repair cost of a notebook can be expensive. In total, we took a charge of $196 million. We will continue to support our OEM partners in responding to and resolving end customer issues.”


Jen-Hsun Huang – in response to whether we expect to incur additional charges beyond what we set aside to assist notebook makers with repairs

“We’re not expecting more write-downs in the future. When we scoped out the problem, we had -- we felt we had enough data to project out the anticipated failures from the various platforms that are out there. This doesn’t happen to all of our chips and it doesn’t happen to most of the notebooks that are out there. There are only a few examples of them and of all the notebooks that have shipped. So we think we have a pretty good handle on the situation but -- and we thought that we were relatively conservative but we’ll see how it goes.”


Jen-Hsun Huang – in response to a question about how this impacts future orders from notebook makers

“Frankly, on the work that we are doing supporting our OEMs to help them repair and to support their end users, frankly all of our engagement with all of our OEMs, they have been just delighted by the work that we are doing. Obviously this isn’t something we absolutely need to do but we stepped up to do it because we think it’s the right thing to do. And so each case is a little different, so we have to look at each case carefully but our open-minded approach and our good partnership approach is welcomed by all the OEMs. And so if this is going to be anything at all, it should be a positive.”

nekrosoft13
08-14-08, 11:06 AM
thanks mike

whowhere
08-14-08, 01:20 PM
Doesn't it seem a bit strange and disturbing that the quote below doesn't mention any problems with overheating of the desktop gtx200 series of graphics cards. If you go to the forum on 200 series cards you will see how extensive this problem is.


"Please remember the following:

1) The issue is limited to a few notebook chips only; we have not seen and don't expect to see this issue on any NVIDIA-based desktop systems"

ViN86
08-14-08, 09:04 PM
Doesn't it seem a bit strange and disturbing that the quote below doesn't mention any problems with overheating of the desktop gtx200 series of graphics cards. If you go to the forum on 200 series cards you will see how extensive this problem is.


"Please remember the following:

1) The issue is limited to a few notebook chips only; we have not seen and don't expect to see this issue on any NVIDIA-based desktop systems"

and ATI's cards never had cooling problems? :confused:

overheating issues are not caused by a single problem. they are caused by multiple issues. some ppl fail to realize that if you stick a GTX 280 in a case with no intake fan and a 90mm exhaust, you will probably have overheating issues. most overheating issues are a direct result of poor air circulation and layout by the user.

besides, it says G92's and G94's. GTX 280's are the GT200 core.

noko
08-17-08, 03:12 AM
Seems like Nvidia is doing the right thing in supporting the OEMs on any failure dealing with this part. This kind of stuff happens to a number of manufacturers. Look at the car industry. The good thing is that if one has a faulty part, a chance we all take, someone is very concerned with getting it fixed as soon as possible.

josiahsuarez
08-30-08, 12:46 AM
looks like Charlie has done it again. his argument this time is a bit more elaborate and seems to boil down to "Nvidia is releasing new revisions of their chips, therefore this proves all the old chips were faulty or they wouldn't have to make a new revision"


http://www.theinquirer.net/gb/inquirer/news/2008/08/28/nvidia-55nm-parts-bad
Nvidia 55nm parts are bad too

NVGate Changed for 'no reason' once again

By Charlie Demerjian: Thursday, 28 August 2008, 6:47 PM

HOT ON THE heels of its denials that anything is wrong with the G92 and G94s comes another PCN that shows the G92s and G92b are being changed for no reason. Yup, the problems that are plaguing G84 and G86 are the same that affect seemingly all 65nm and now 55nm Nvidia parts.

This PCN is very similar to the one linked above, and the formatting is almost exactly the same, so we won't cover all the details, just the pertinent points. This one is much more important, it confirms that the problems are not confined to the 65nm products. Since Nvidia told us the last one was unimportant and refused to give it to us, we didn't bother asking this time, we just took notes when they were shown to us at a recent conference.

It is titled "G92 GPU Desktop Products" with a subtitle of "Change Bump Material from High Pb to Eutectic Solder", with a date of June 2008 and a number PCN0346A on it. Page 2 has the "PCN Submit Date" of June 13, 2008, "Planned Implementation Date" of July 28, 2008, and a "Proposed First Ship Date for change" of August 17, 2008. Short story here, if you have a G92 or G92b purchased before next week, you likely have a lemon. Remember, these are chip ship dates, not boards in stores.

The next few chunks, "Change Category" and others are the same, "Class 1", given to everyone under the sun, and OMGWTFBBQ. That is kind of a 'well duh' thing, and is exactly the same as the G86 part PCN.

The big one is the affected parts list. It clearly states that not only are 65nm parts bad, but 55nm ones are as well. The entire list of affected parts is as follows.


[Image: affected parts list from the PCN, captioned "Small batch, my arse"]

Let's see, what do we have here? It looks like they changed the bumping material on the 55nm parts a month and a day after introduction. Yup, no reason for that at all, nothing to see here either.

The next part is a description of what we already knew and told you about on the last PCN story. To use their words, "Nvidia will transition from using high-lead solder (95%Pb/5%Sn) to eutectic solder (63%Sn/37%Pb) flip-chip bump material for the G92 product family. During the transition period Nvidia will be supplying both high-lead and eutectic bump until inventory is depleted. No other materials are being changed."

This makes complete sense, and it is followed by a picture of a modern chip with the bumps and underfill pointed out.

The reasons are the same, supply and robustness, as is the impact statement. Same very curious wording. Nothing new, just bad news.

The "Implementation and Qualification Plan" however does have some new news. It says, "Nvidia has previously qualified numerous products using eutectic solder bumps using the same bump suppliers, substrate vendors, underfill and assembly sites as this device. Qualification data is available upon request." This information backs up our previous assertions that this is quite widespread among all their 65nm and 55nm products. Qual data is available "Now," it says, and samples on July 1, 2008.

Page 4 has the same diagram, and indicates that the eutectic bumps are marked the same way as the G86 ones, with a trailing R on the lot #. Because it is etched on the die, you have no way of knowing which one you have until you take it apart, pull the heatsink, clean off the thermal paste, and read the laser-wielding chicken scratchings. Most stores won't let you do this, and NV is going to be mixing the dies up until they burn off inventory. This means you won't be safe until long after the card is irrelevant, say later in Q4.

The "Recommended Action" and contact info is the same as the G86 PCN, and the Revision history has an Initial Release date of 06/13/08. There is no blank page 5 on this one, it is just the disclaimer that was on page 6 of the last one.

While Nvidia is playing these PCNs off as nothing to worry about, they are. The fact that the defective chip problem extends to the G92 line like we said earlier is bad enough. It pretty much confirms that the problem is the same as the "Small batch of EOL laptops parts only given to HP," that they warned about in July. The bigger problem is that it affects the newer 55nm parts as well. Those were supplanted in a number of days you could almost count to on your fingers and toes if you grew up in a small town in Appalachia, never a good sign. In fact, qual samples were available before the 9800GTX+ actually launched.

It is hard to overstate how bad this is. Basically every 65nm and 55nm Nvidia part appears to be defective. It is not a question of yes or no, but how defective each line is, and what the failure rate for each one is. We are hearing of early failure rates in the teens per cent for 8800GTs and far higher for 9600GTs, so this is not a quibble over split hairs.

To make matters worse, Nvidia has a mound of unsold defective parts that they are going to bleed out into the channel alongside the (hopefully) fixed parts. As a buyer, you have no way of knowing which one you are getting, and it looks like Nvidia isn't keen on helping you figure it out either, as that would cost too much.

Until Nvidia comes fully clean on this fiasco, lists all the defective parts, and orders boxes clearly marked, you can't say anything other than just avoid them. Then again, since doing the right thing would likely bankrupt them, we wouldn't hold your breath for it to happen. µ