01-19-2010 10:38 AM - edited 01-19-2010 11:10 AM
I bought two 7200.12 drives (750GB) to use in a RAID 5 setup I'm building, but there seems to be a problem with one of them. While copying files to the new array, the drive sometimes shuts down suddenly, causing problems. When this happens I can hear the drive spinning down, and sometimes it tries to spin up again a few times. I'm using ZFS, and the drive gets marked as FAULTED. I can reboot and it will usually be online again for a while, but eventually it always runs into problems.
I'm having trouble getting it to work on other computers at all, partly because I think the drive doesn't always work in the first place. I tried the Windows SeaTools on XP but it kept saying 'Test Unavailable', so I tried both of the boot ISOs, but both died on me with an 'Invalid Opcode' error, so that was no use either. Lastly, I tried smartctl, the only application I can use to test the drives' SMART status. It reports a PASS on both drives, but when I run a long self-test on both drives (for comparison), the problem drive consistently hits a read error at around 80%, while the other one passes with no errors. Even so, the SMART status is still PASS. smartctl reports around 580 power-on hours for the drives (good as new).
Is it possible for a drive to sometimes fail without reporting anything to SMART, or is this drive just not failing and the problem somehow lies elsewhere?
01-19-2010 04:32 PM
Is it a generic PSU with a rating under 400W?
That probably isn't it, but cheap PSUs do deliver fluctuating power, which can cause a drive to spin down.
01-19-2010 10:50 PM
Thanks for your reply.
I did think of that... the machine has 7 drives connected, but it's a 400W Cooler Master PSU, which I believe is a fairly respectable brand for PSUs. I have tried different arrangements of the disks on the PSU, making sure the number of drives per rail is balanced. The last thing I'm thinking of trying is removing 3 drives from the machine and running an endurance test of some sort to see if the problems persist.
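As a rough sanity check on the power question, here is a minimal Python sketch of the 12V budget when all drives spin up at once. Every number in it is an assumption for illustration, not a measurement from this machine: the 2.0 A per-drive spin-up current, the 216 W of 12V capacity, and the 60 W drawn by everything else are placeholders, and the function names are made up. The real figures would come from the drive's product manual and the PSU's label.

```python
def drive_load_watts(n_drives, amps_per_drive, volts=12.0):
    """Power drawn on the 12V side by n_drives, each pulling amps_per_drive."""
    return n_drives * amps_per_drive * volts


def headroom_watts(rail_capacity_watts, n_drives, amps_per_drive,
                   other_load_watts=0.0):
    """Capacity left on the 12V side; a negative result means overloaded."""
    drives = drive_load_watts(n_drives, amps_per_drive)
    return rail_capacity_watts - other_load_watts - drives


if __name__ == "__main__":
    # All values below are hypothetical placeholders, not measured data.
    capacity = 216.0      # assumed usable 12V capacity in watts
    other = 60.0          # assumed 12V draw of CPU, fans, etc.
    spinup_amps = 2.0     # assumed per-drive spin-up current

    for drives in (7, 4):
        left = headroom_watts(capacity, drives, spinup_amps, other)
        print(f"{drives} drives: {left:+.1f} W headroom at spin-up")
```

With these placeholder numbers, 7 drives leave a negative margin at spin-up while 4 drives leave a comfortable one, which is the kind of difference the 'remove 3 drives' test would expose if power is the culprit.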
The only thing I've noticed from smartctl is that one 'unknown attribute' is halfway from its normal value to the threshold, so it could be that the disk is having problems but doesn't consider them severe enough to warrant a SMART fail. However, I've read the doc about Seagate, SMART attributes and third-party apps... so I take it with a grain of salt. I guess time will tell if this is the case. Meanwhile I'll have to hope one of the other two drives in the array doesn't fail.
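To put a number on "halfway to the threshold", here is a minimal Python sketch that reads one row of the `smartctl -A` attribute table (columns ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, ...) and computes how much of the distance from the starting value to the threshold has been used up. The sample row is made up for illustration, and the starting value of 100 is an assumption; vendors use different starting values for different attributes, so treat the result as a rough indication only.

```python
def parse_attribute_row(row):
    """Split one `smartctl -A` table row into the normalized fields."""
    fields = row.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),   # current normalized value
        "worst": int(fields[4]),   # lowest normalized value seen so far
        "thresh": int(fields[5]),  # failure threshold
    }


def margin_used(value, thresh, nominal=100):
    """Fraction (0.0 to 1.0) of the nominal-to-threshold distance consumed.

    `nominal` is the assumed starting value of the attribute.
    """
    if nominal <= thresh:
        raise ValueError("nominal must be greater than the threshold")
    used = (nominal - value) / (nominal - thresh)
    return max(0.0, min(1.0, used))


if __name__ == "__main__":
    # Made-up row, not taken from the drive discussed in this thread.
    sample = "184 Unknown_Attribute 0x0032 068 068 036 Old_age Always - 32"
    attr = parse_attribute_row(sample)
    used = margin_used(attr["value"], attr["thresh"])
    print(f"{attr['name']}: {used:.0%} of the way to the threshold")
```

The overall PASS only flips once an attribute's value reaches its threshold, so an attribute sitting at 50% still reports PASS.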
03-02-2010 05:41 PM