mdadm software RAID problems on Debian
I have a server running Debian that had a software RAID1 array of 2 x 250GB SATA disks on an onboard Silicon Image 3112 controller. For reference, I am running the latest 2.6 kernel. About 3 months ago one disk failed (it was marked faulty by the mdadm monitor) with ATA I/O errors, and about 30 minutes later the other disk started showing the same errors. I shut the machine down and was able to recover the data from the drive that hadn't been marked faulty. I figured it wasn't beyond the realms of possibility for both to go wrong, since they were the same drive from the same manufacturer and had almost identical serial numbers.
So, not trusting them, I purchased 2 RAID edition drives (300GB ones this time), one Seagate and one Maxtor, and recreated my array using these new drives. They proved faster and quieter, and I also put them in IcyDocks to keep them cool.
Last night the EXACT same thing happened to the new drives. To me this is far too much of a coincidence.
My plan is to fire up the server with the disk marked faulty removed and copy the contents to another machine on the network. Then I can experiment with the server without risking the data.
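Before copying anything off, it may help to confirm which member the kernel currently considers faulty. The following is only a rough Python sketch that parses text in the usual /proc/mdstat layout; the function name parse_mdstat, the SAMPLE text and the device names in it are made up for illustration and are not taken from the actual server.

```python
import re

# Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0](F)"
ARRAY_RE = re.compile(r"^(md\d+) : (\S+) (.+)$")
# Member counters and map, e.g. "[2/1] [_U]" (2 expected, 1 working)
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")
# One member device with optional flags, e.g. "sda1[0](F)"
MEMBER_RE = re.compile(r"(\w+)\[\d+\]((?:\(\w\))*)")

# Hypothetical sample: a two-disk RAID1 with one member marked faulty.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0](F)
      244195904 blocks [2/1] [_U]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return one dict per md array found in mdstat-style text."""
    arrays = []
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            name, state, rest = header.groups()
            # Members flagged (F) are the ones md has marked faulty.
            faulty = [dev for dev, flags in MEMBER_RE.findall(rest)
                      if "(F)" in flags]
            current = {"name": name, "state": state,
                       "faulty": faulty, "degraded": False}
            arrays.append(current)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected, working, _ = status.groups()
            # Fewer working members than expected means a degraded array.
            current["degraded"] = int(working) < int(expected)
    return arrays


for array in parse_mdstat(SAMPLE):
    print(array["name"], "degraded:", array["degraded"],
          "faulty:", array["faulty"])
```

On the real machine the same function could be fed the contents of /proc/mdstat instead of SAMPLE. It assumes the common single-line header format and would need adjusting for arrays whose status lines look different.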
I obviously need to find the problem. When the original failure occurred I was using Debian Sarge. I switched to Debian Etch in the hope that newer programs and drivers would solve the problem, but alas they haven't. I can only guess that either the motherboard or Debian itself is at fault.
Does anyone have any suggestions as to the cause and a solution?
Thanks guys. (and girls)