Old 12-07-06, 10:40 PM   #13
krader
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

Sorry, I interpreted "PM" to mean you would send me an email. I now see the "private messages" link at the top of this page and have uploaded the files: file1.bz2 thru file4.bz2
Old 12-08-06, 12:18 PM   #14
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

All 4 of these files are reported to have been corrupted when I attempt to bunzip2 them:
bunzip2: Data integrity error when decompressing.
Input file = file1.bz2, output file = file1

The md5sum of the files that you uploaded to the FTP server is the same as what I downloaded from the FTP server.

Am I supposed to be using the files in this state (without bunzip2'ing them first)?
Old 12-08-06, 08:03 PM   #15
krader
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

I can't believe I made such a bone-headed mistake. I neglected to put my FTP client in binary mode before uploading the files. I've confirmed my local copies are good with "bunzip2 -t" and am resending the files in binary mode.

Here are the sizes and md5sums you should get after downloading them:

-rw-r--r-- 1 krader krader 492496034 2006-12-08 17:22:31 file1.bz2
-rw-r--r-- 1 krader krader 741700564 2006-12-08 17:23:30 file3.bz2
-rw-r--r-- 1 krader krader 1108086333 2006-12-08 17:25:29 file4.bz2
-rw-r--r-- 1 krader krader 330596543 2006-12-08 17:21:16 file5.bz2

43b730dc2d6a98d3b926c71145e344e0 file1.bz2
9e9c33dcb6e9a85d913d0748175acfc2 file3.bz2
cb3290ae77d1e409144de5abd58cee5e file4.bz2
73fcb62d9a1508408a3f25e96b1ffa9c file5.bz2
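
For anyone checking the downloads against the sums above, here is a minimal Python sketch of the comparison (roughly what `md5sum -c` does). The helper names `md5_of_file` and `check_files` are made up for illustration; the file names and digests are the ones listed above. Files are opened in binary mode so no byte is translated on the way in.

```python
import hashlib
import os

# Digests quoted in the post above.
EXPECTED = {
    "file1.bz2": "43b730dc2d6a98d3b926c71145e344e0",
    "file3.bz2": "9e9c33dcb6e9a85d913d0748175acfc2",
    "file4.bz2": "cb3290ae77d1e409144de5abd58cee5e",
    "file5.bz2": "73fcb62d9a1508408a3f25e96b1ffa9c",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:  # binary mode: bytes are read unmodified
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_files(expected, directory="."):
    """Return {name: "OK" | "MISMATCH" | "MISSING"} for each expected file."""
    results = {}
    for name, want in expected.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            results[name] = "MISSING"
        elif md5_of_file(path) == want:
            results[name] = "OK"
        else:
            results[name] = "MISMATCH"
    return results


if __name__ == "__main__":
    for name, status in check_files(EXPECTED).items():
        print(f"{name}: {status}")
```

A matching digest only shows that the transfer was clean; it says nothing about whether the test itself reproduces the corruption.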
Old 12-11-06, 06:16 PM   #16
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

I've run with your files & script for 3 iterations, and have not been able to reproduce the problem.

I have a few questions:
0) Are you able to consistently reproduce the corruption within 3 iterations with the files & script you provided?
1) Does this still reproduce if you are using a non-SMP kernel?
2) Does this still reproduce if you reduce the RAM to 1GB or 512MB (booting with the mem= kernel parameter)?

Thanks,
Lonni
Old 12-12-06, 11:07 AM   #17
domasj
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

I have searched around a bit and found that this (or something very similar) is quite a common problem. People report data corruption on nForce 4 based systems regardless of the OS. http://www.nforcershq.com/forum/1-vt5108.htm - here is a huge thread about that. It seems that this behaviour is caused by hardware, not by software. Although people write about the problem disappearing after a BIOS update, it hasn't helped me. I have now noticed that both the NVIDIA GeForce 6150 GPU (with a radiator) and the NVIDIA nForce 430 MCP (a bare chip) get so hot during operation that it isn't possible to keep a finger on them. Is there a possibility that the heat causes my problems?
Old 12-12-06, 11:29 AM   #18
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

domasj,
As I stated earlier, the problem that you reported in this thread has different symptoms than the LKML posts you referenced. The nforcershq thread that you referenced appears to have at least a half dozen different, seemingly related problems, with an assortment of potential solutions & workarounds.

You referenced filesystem corruption, whereas the LKML posts are not filesystem corruption, but rather file corruption. You've not provided any additional information to suggest that you're hitting the same issue as in the LKML posts. Are you seeing file corruption or filesystem corruption or both? Please provide detailed instructions on how I can reproduce the problem(s) that you initially reported.

Thanks,
Lonni
Old 12-12-06, 12:03 PM   #19
domasj
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

The fact is that both of my drives are more or less corrupted. So now I will try to reformat a partition and see what happens with huge files. Also, I should mention that before that serious fs corruption I had several files wrongly copied from one disk to another. I copied about 5 GB of photos and some of them appeared corrupted. I'll look and see if I still can find some missed ones as I had them replaced one by one with the originals and they were successfully corrected. http://jozita.lt/Problem.zip here it is. Three pairs of photos corrupted during PATA->SATA copy operation.

Here are the md5sums from copying PATA -> a fresh ext3 partition on SATA:
17050436099fb9a05ad03ae2cce1a607 debian-update-3.1r4-i386-1.iso - the source

... and the copies (filenames changed on purpose):
a1ce1f3e703aae62ec940eaf5b8019f4 debian-update-3.1r4-i386-0.iso
7762914497922b5fc09b936ec15c094b debian-update-3.1r4-i386-1.iso
1c6e5d218835539f97128e1de3e100a3 debian-update-3.1r4-i386-2.iso
2ac382a59210a3059aa7b2ee3902b0de debian-update-3.1r4-i386-3.iso

Lastly, because of such differences I decided to check them once more. Every file had a different md5sum from the first run! Even on the fresh partition, the same file gives a different md5sum each time.
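
A check like this can be scripted. The following is a minimal Python sketch, not the commands actually used here: it hashes one file several times and counts how many distinct digests come back. The names `digest_once` and `repeated_digests` are illustrative only.

```python
import hashlib
from collections import Counter


def digest_once(path, algo="md5", chunk_size=1 << 20):
    """Read the whole file once and return its hex digest."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def repeated_digests(path, runs=5, algo="md5"):
    """Hash the same file `runs` times and count how often each digest appears."""
    return Counter(digest_once(path, algo) for _ in range(runs))


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        counts = repeated_digests(sys.argv[1])
        for digest, n in counts.most_common():
            print(f"{n:3d}x {digest}")
        print("stable reads" if len(counts) == 1 else "UNSTABLE reads")
```

On a healthy system an untouched file should produce exactly one distinct digest. Repeated reads may be served from the page cache rather than the disk, so a stable result for a small file does not by itself rule out a problem on the disk path.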

Last edited by domasj; 12-12-06 at 12:33 PM.
Old 12-12-06, 12:47 PM   #20
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

Are you stating that the original debian-update-3.1r4-i386-1.iso that you copied has had its md5sum change over time, even though it's not been touched in any way?

With regard to the jpg's in your Problem.zip, I've just copied them from one SATA disk to another, and their md5sums have remained the same. Do these images consistently get corrupted every time you copy them?

Have you verified that you're using the latest BIOS for the motherboard?
Additionally, please provide output from the following commands:
cat /proc/cpuinfo
free -m
uname -a

Thanks,
Lonni

Old 12-12-06, 01:03 PM   #21
domasj
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

Quote:
Originally Posted by netllama
Are you stating that the original debian-update-3.1r4-i386-1.iso that you copied has had its md5sum change over time, even though it's not been touched in any way?
Yes, it reads differently every time. I also used sha512sum with the same result: different reads. The ISO file is 620 MB, but when I try the same with smaller files I get consistent reads.

Quote:
Originally Posted by netllama
With regard to the jpg's in your Problem.zip, I've just copied them from one SATA disk to another, and their md5sums have remained the same. Do these images consistently get corrupted every time you copy them?
As I said, I have corrected other photos by just copying them once more. I believe it doesn't depend on the file but some part of data stream gets corrupted somewhere.

Quote:
Originally Posted by netllama
Have you verified that you're using the latest BIOS for the motherboard?
Additionally, please provide output from the following commands:
cat /proc/cpuinfo
free -m
uname -a
Yesterday I updated BIOS to 0603 which is the most recent one.
Here you are:
Code:
cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 79
model name      : AMD Athlon(tm) 64 Processor 3000+
stepping        : 2
cpu MHz         : 1808.435
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up pni cx16 lahf_lm svm cr8legacy ts fid vid ttp tm stc
bogomips        : 3619.75
Code:
free -m
             total       used       free     shared    buffers     cached
Mem:           979        954         25          0          3        605
-/+ buffers/cache:        345        633
Swap:          956         66        890
Code:
uname -a
Linux debian 2.6.18-1-686 #1 SMP Sat Oct 21 17:21:28 UTC 2006 i686 GNU/Linux
I think that the photo corruption happened while using a 64-bit kernel, if that helps.
Old 12-12-06, 01:19 PM   #22
netllama
NVIDIA Corporation
 
Join Date: Dec 2004
Posts: 8,763
Default Re: nForce 4 corrupting data written to HDD

No one in the LKML thread you referenced reported files getting corrupted on-disk. This still sounds like a separate and unrelated problem, potentially with faulty hardware. Earlier you reported filesystem corruption at bootup. Do you have a log of the corruption errors? Have you run memtest86 to verify that your memory isn't faulty?

With respect to the jpg's getting corrupted, that sounds like the same issue as in the LKML posts; however, in light of the filesystem corruption, it's hard to say for certain whether you're just hitting a variant. If you copy the same jpg file 100 times, what percentage of the time does it end up corrupted?
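
One way to measure that percentage is sketched below in Python. This is an illustrative sketch, not a script supplied in this thread: `file_digest` and `corruption_rate` are hypothetical helper names, and the copies go into a temporary directory that is deleted afterwards.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def corruption_rate(source, copies=100, dest_dir=None):
    """Copy `source` `copies` times and return the percentage of bad copies.

    A copy counts as bad when its digest differs from the digest of the
    source taken once at the start. `dest_dir` picks the parent directory
    (and therefore the disk) that receives the copies.
    """
    reference = file_digest(source)
    bad = 0
    with tempfile.TemporaryDirectory(dir=dest_dir) as workdir:
        for i in range(copies):
            target = os.path.join(workdir, f"copy_{i:03d}")
            shutil.copyfile(source, target)
            if file_digest(target) != reference:
                bad += 1
    return 100.0 * bad / copies


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(f"{corruption_rate(sys.argv[1]):.1f}% of copies differ")
```

Two caveats apply: the reference digest assumes the source itself reads back reliably, and a freshly written copy may be verified from the page cache rather than from the disk, so the figure is a lower bound at best.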

How many memory modules are you using, and which brand?

Thanks,
Lonni
Old 12-12-06, 07:03 PM   #23
krader
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

> 0) Are you able to consistently reproduce the corruption within 3
> iterations with the files & script you provided?

Yes, as stated before at least one file exhibits corruption on every
iteration. I spent four days doing nothing but running tests while changing
variables (e.g., the source of the data, whether regular async or direct I/O
was used).

> 1) Does this still reproduce if you are using a non-SMP kernel?

That is one thing I did not try.

> 2) Does this still reproduce if you reduce the RAM to 1GB or 512MB
> (booting with the mem= kernel parameter)?

Yes, the problem still occurred after booting with "mem=1g". Problem also
occurs if I remove half the memory (2 x 1 GiB DIMMs). Problem still occurs
if I swap the pairs of DIMMs I removed with the ones still in the system.
Memtest86 was run 24 hours on the full 4 GiB without error.
Old 12-12-06, 07:13 PM   #24
krader
Registered User
 
Join Date: Dec 2006
Posts: 11
Default Re: nForce 4 corrupting data written to HDD

> No one in the LKML thread you referenced reported files getting corrupted
> on-disk

Please read that discussion thread again. I reported that the on-disk copy was corrupted. Also, I deliberately tested the case where the copy of the file would remain wholly in memory. I copied a file just over 1 GiB in size with no other activity on the system. An immediate cmp(1) of the source and copy showed no errors. I then read two different 2 GiB files in order to clear the buffer cache of those original files; thus forcing the copy to be synced to disk. Running cmp(1) again showed multiple bytes were corrupted.
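
The comparison step can be approximated in Python for anyone who wants the offsets of the damaged bytes in a form that is easy to post. This is only a sketch in the spirit of cmp(1), not the tool used in the test above; `differing_offsets` is an illustrative name, and the cache-flushing step (reading other large files) is not reproduced here.

```python
def differing_offsets(path_a, path_b, chunk_size=1 << 20, limit=20):
    """Return up to `limit` (offset, byte_a, byte_b) tuples where two files differ.

    If one file is shorter, a final (offset, None, None) entry marks the
    position where the shorter file ends.
    """
    diffs = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while len(diffs) < limit:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                common = min(len(a), len(b))
                for i in range(common):
                    if a[i] != b[i]:
                        diffs.append((offset + i, a[i], b[i]))
                        if len(diffs) >= limit:
                            break
                if len(a) != len(b):
                    if len(diffs) < limit:
                        diffs.append((offset + common, None, None))
                    break  # lengths diverge, nothing left to align
            offset += len(a)
    return diffs


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 2:
        for off, x, y in differing_offsets(sys.argv[1], sys.argv[2]):
            print(off, x, y)
```

Run once right after the copy and again after the cache has been flushed, the two lists of offsets show whether the damage appears only once the data is read back from disk.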

Test results like the above are why I originally suspected the fault lay with the nForce 4 SATA controller. However, subsequent tests showed the corruption occurs even when the disks are attached to a Promise TX2 controller in a PCI slot and when using the onboard Silicon Image 3114 RAID controller. Multiple disks of different models were also tested and the problem occurred with each disk. The SATA cables were also replaced without affecting the symptoms.

So we know the problem is definitely not the SATA controller, cables, or disk. There is good reason to believe the problem is not due to defective memory. This leaves the nForce 4 chipset, the AMD Athlon64 X2 CPU, a mainboard design error, or an error in how the ASUS BIOS is configuring the hardware as the remaining possibilities. Since different people are reporting the same symptoms on similar systems with mainboards from different vendors, the mainboard design can probably be ruled out. Note that as part of testing this I flashed the BIOS to the most recent version. That had no effect on the symptoms.
Copyright 1998 - 2014, nV News.