Old 12-12-06, 08:38 PM   #25
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

Clearly, something differs between your system and mine, as this problem is not reproducing here.
Old 12-13-06, 10:39 AM   #26
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

krader,
Are you able to reproduce the corruption that domasj reported using the Problem.zip file contents?
Old 12-13-06, 02:14 PM   #27
domasj
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

netllama, I don't believe krader can reproduce the problem with the photos, as I have only noticed that behaviour when copying large amounts of data, and it happened randomly (I'll check that later).
I have formatted a partition and installed 64-bit Ubuntu Edgy on it. I'm running some tests now and have found that the read mismatches I was seeing before are still there. However, now I can reproduce them reliably. I have tried it a few times and it happens just as it did before, so I believe this is good evidence of the problem.

Code:
-rw-r--r-- 1 domas domas 634583040 2006-12-13 21:31 Desktop.tar

645c09f962356ab6901607b396275e8e  Desktop.tar
645c09f962356ab6901607b396275e8e  Desktop.tar
645c09f962356ab6901607b396275e8e  Desktop.tar

-rw-r--r-- 1 domas domas 655257600 2006-12-13 21:35 Desktop.tar

0af2d72393ded72651d84a257b3412ea  Desktop.tar
0af2d72393ded72651d84a257b3412ea  Desktop.tar
0af2d72393ded72651d84a257b3412ea  Desktop.tar

-rw-r--r-- 1 domas domas 665600000 2006-12-13 21:38 Desktop.tar

f7d439b09a95fe4b9867ea5d8c955aab  Desktop.tar
738730c3bf2e57edd3eef3b76b9a6d6e  Desktop.tar
382d0b7875136031b03fef4cda5fb734  Desktop.tar

-rw-r--r-- 1 domas domas 686663680 2006-12-13 21:14 Desktop.tar


bdcb1d32645e2f605760c45d6f2832d9  Desktop.tar
d99e275f7cf2514abe4e84a94c8ba618  Desktop.tar
435af7e1b915ceca1cf13931248ed1c6  Desktop.tar
Code:
uname -r
2.6.17-10-generic
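
For anyone who wants to repeat this kind of check, here is a minimal sketch of the repeated-read test above: hash the same file several times and see whether the digests agree. The function names (`md5_of`, `repeated_digests`, `is_stable`) and the default of three passes are assumptions made for illustration, not the exact commands used for the listing above. Note that repeated reads of a file may be served from the page cache rather than from the disk, so a stable result on a small file proves little.

```python
import hashlib
import sys


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, passes=3):
    """Hash the same file `passes` times and return the list of digests."""
    return [md5_of(path) for _ in range(passes)]


def is_stable(digests):
    """True when every pass produced the same digest."""
    return len(set(digests)) == 1


if __name__ == "__main__":
    # Usage (hypothetical script name): python reread_check.py Desktop.tar
    for path in sys.argv[1:]:
        digests = repeated_digests(path)
        for d in digests:
            print(f"{d}  {path}")
        print("stable" if is_stable(digests) else "MISMATCH between passes")
```

A file whose digest changes between passes without being modified, like the last two cases in the listing, points at the read path rather than at the data on disk.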
Old 12-13-06, 03:10 PM   #28
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

I'll still need someone to provide me with a reliable & consistent means of reproducing this problem. I'm not seeing this on any of the assorted nForce systems that I have here (and I have a large number).

Thanks,
Lonni
Old 12-16-06, 04:11 PM   #29
domasj
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

I recently got tired of testing my problem on Linux and decided to try it in Windows. However, I soon ran into a more obvious problem: the installation doesn't even complete. It crashes either after the restart or somewhere in the middle, with a different error every time. This happens on both of my disks. Now it really does look like a hardware fault, not a software one. So I contacted the seller, and they agreed to replace the motherboard and the RAM stick to see if that helps. More information will come by the end of this week.

P. S. memtest86+ ran 3 passes on my system without any error whatsoever.
Old 12-16-06, 08:25 PM   #30
chunkey
#!/?*
 
Join Date: Oct 2004
Posts: 662
Default Re: nForce 4 corrupting data written to HDD

Anyway... maybe a little too late, but have you checked your temperatures (especially HDDs & chipset) & PSU voltages? I haven't seen any "strange" corruption since I reordered the wires and put some decent fans in the case.
Old 12-18-06, 05:50 PM   #31
krader
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

In my case I have confirmed that power-supply voltages and all temperatures are well within operational limits.

I put the mainboard in a spare case and booted Knoppix (32-bit 2.6 kernel). I started my copy-and-compare test using the VMware guest image files that have exhibited corruption in the past. I let it run for a little over 12 hours (43 passes) and only a single file was corrupted. The corruption consisted of three bytes in a single 32-bit word being incorrect.
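
A copy-and-compare test of this kind can be sketched roughly as follows. This is not the script actually used here; the function names and the 4-byte word grouping are assumptions chosen to mirror the "three bytes in a single 32-bit word" description.

```python
import shutil


def differing_offsets(path_a, path_b, chunk_size=1 << 20):
    """Yield every byte offset at which the two files differ."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                return
            if a != b:
                # Slicing (not indexing) keeps this safe when one file is shorter.
                for i in range(max(len(a), len(b))):
                    if a[i:i + 1] != b[i:i + 1]:
                        yield offset + i
            offset += max(len(a), len(b))


def corrupted_words(offsets, word_size=4):
    """Group differing byte offsets by the aligned word that contains them."""
    words = {}
    for off in offsets:
        words.setdefault(off - off % word_size, []).append(off)
    return words


def copy_and_compare(src, dst):
    """Copy src to dst, then return {word_start: [differing byte offsets]}."""
    shutil.copyfile(src, dst)
    return corrupted_words(differing_offsets(src, dst))
```

An empty result means the copy matched the source on that pass. A result such as `{8: [8, 9, 11]}` would mean three bad bytes inside the 32-bit word starting at offset 8.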

I then installed Fedora Core 6 (what was being used in the installation where the problem was first noticed). The stock FC6 kernel has problems driving the SATA controller. It reports errors then resets the controller and programs the drives to the slowest speed. It continues doing that after transferring what appears to be a couple of megabytes of data. I'm going to try and find time to build a kernel.org kernel (which is what I was using in the original installation).

Note that the disks are different than in the original installation which might explain why it is more difficult to trigger the failure.
Old 12-18-06, 05:56 PM   #32
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

What kind of errors was the FC6 kernel reporting? Have you considered the possibility that perhaps you have a faulty motherboard?

Old 01-02-07, 06:04 PM   #33
krader
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

The FC6 SATA errors were due to a defect in that kernel's SATA subsystem and have nothing to do with this problem.

I've run numerous tests over the past week using kernel.org 2.6.19. I've tried an SMP kernel booted with "maxcpus=1" as well as a uniprocessor kernel built from the same .config file, with the only change being that the SMP option was not selected. I've also tried "mem=1g", "mem=2g", and "iommu=soft". All result in corruption as previously described.

In the past week I've logged 112 corruption events out of 892 test cycles (12.5% failure rate). Of the nine files being copied, it is always one or more of files 1, 3, 4, or 5 that are corrupted. The other five files have never been corrupted. File #4 shows corruption at twice the rate of the other three:

27 file1
25 file3
41 file4
18 file5
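
The arithmetic behind these figures can be rechecked with a few lines. This is only a sketch using the numbers quoted above; the variable names are made up for illustration.

```python
# Figures as quoted in the post above.
events = 112   # corruption events logged
cycles = 892   # test cycles run
per_file = {"file1": 27, "file3": 25, "file4": 41, "file5": 18}

# Overall failure rate per test cycle.
failure_rate = events / cycles

# Sum of the per-file tallies, for comparison with the event count.
tally_total = sum(per_file.values())

# How file4's count compares with each of the other three files.
ratios = {
    name: per_file["file4"] / count
    for name, count in per_file.items()
    if name != "file4"
}

print(f"failure rate: {failure_rate:.2%}")
print(f"sum of per-file tallies: {tally_total}")
for name, ratio in ratios.items():
    print(f"file4 vs {name}: {ratio:.2f}x")
```

The rate comes out at about 12.56%, and file4's count is between roughly 1.5 and 2.3 times that of each of the other three files.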

So, yes, I think the mainboard is defective, but probably not in the manner you are suggesting. As an educated guess, I think there may be a design defect in either the nForce 4 chipset or the ASUS A8N mainboard that is causing bus crosstalk. But given that others have reported similar problems with nForce 4 based mainboards from other vendors, it seems likely the fault lies with the nForce chipset, or perhaps with the nForce specifications that mainboard designers are using.

What would you like to do? I've got a system on which I can reproduce this problem 12.5% of the time. That system won't be used for anything else until root cause is determined or we agree to give up (at which point I'll throw the mainboard in the garbage since it can't be trusted). What tests would you propose I run? What data would you like me to provide?

Using this web site is awkward. If you would prefer to email me directly to discuss an action plan please do so: krader@skepticism.us. If you email me I'll reply with phone numbers where you can reach me.
Old 01-02-07, 06:33 PM   #34
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

Actually, if you're not using this system for anything else, it would be best if you could ship it to me. Once I have a system which reliably reproduces the problem, we would be equipped to investigate further. If shipping the system is an option, please PM me, and I'll provide you with my shipping address.

Thanks,
Lonni
Old 01-03-07, 09:19 AM   #35
calestyo
Christoph Anton Mitterer
 
Join Date: Dec 2006
Location: München, Germany
Posts: 48
Default Re: nForce 4 corrupting data written to HDD

Quote:
Originally Posted by netllama
How reliably does this corruption reproduce?
Do you have a file that will become corrupted every time it is copied?
It should be very reproducible... and no, it is not always the same file that gets corrupted.
Please read the whole thread on lkml (see my other report in this forum for the link to my initial post), as it probably contains answers to most of your questions. It also contains detailed information on the cases in which the error can happen (on both reads and writes).

Christoph Anton Mitterer.
Old 01-03-07, 09:22 AM   #36
calestyo
Christoph Anton Mitterer
 
Join Date: Dec 2006
Location: München, Germany
Posts: 48
Default Re: nForce 4 corrupting data written to HDD

Quote:
Originally Posted by domasj
I had a computer with nForce 2 it was working very well.
It most likely happens with several nForce chipsets. I definitely know it happens with nForce Professional 2200 and 2050 and with nForce 4.

Quote:
Originally Posted by domasj
However, when I changed the computer to a newer one, I transferred a PATA disk to the new computer and it didn't last a month before I got the first ext3 inconsistency. After fsck passed, everything looked all right, until it started to happen more and more frequently. Some system files seemed to be corrupted and were still being loaded.
Thus, I decided to do a clean Debian amd64 install on my second SATA disk. I was living happily until recently, when it started to show the same signs. Now it boots with some scary messages appearing, and it stops when it comes to starting xorg (the nvidia logo flashes a few times) and a message appears that xorg startup didn't succeed. Even mplayer segfaults.
I don't have a spare HDD to corrupt right now, but I will probably get one after the weekend. Then I will be able to do something more.
I would be glad if you looked at the mailing list I wrote about. There are some more sophisticated problem reports there than mine.
If you actually suffer from "our" problem at lkml, then the errors you describe here (boot messages, filesystem errors) are most likely "random" errors that result from corruption of the files or filesystem parts involved.