Yottabyte
fzabkar
Posts: 4,661
Registered: ‎01-27-2009
0

Re: Further update to the 7200.11 issues

axelkloth, with respect, you've just shown me another case where the author of SMART software got it wrong (unless there are very serious firmware bugs).

Could you indulge me just a little longer by giving me the same report in hexadecimal (using HD Sentinel)?

Notice that the Read Error Rate is 1037904688, but Hardware ECC Recovered is 7904688, i.e. the last 7 digits are identical. Normally these two raw values are completely identical, and that is because they are sector counts, not error counts.

Furthermore, the raw Seek Error Rate value shows only the lowest 32 bits, not the full 48 bits. Once again the displayed value is a sector count, not an error count. I expect that the upper 16 bits will show an error count of approximately 100.
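
To make the 48-bit point concrete, here is a minimal Python sketch of how such a raw value could be split, assuming the layout described above (upper 16 bits hold an error count, lower 32 bits hold the running count). The function name and the example value are hypothetical, not taken from the drive in question.

```python
def split_raw48(raw):
    """Split a 48-bit SMART raw value into (upper 16 bits, lower 32 bits).

    Assumes the layout described in the post: the upper 16 bits are an
    error count and the lower 32 bits are a running count.
    """
    if not 0 <= raw < 1 << 48:
        raise ValueError("raw value does not fit in 48 bits")
    upper16 = raw >> 32          # assumed error count
    lower32 = raw & 0xFFFFFFFF   # assumed running count
    return upper16, lower32


# Hypothetical raw value: 100 errors in the upper word, 7904688 in the lower.
example = (100 << 32) | 7904688
errors, count = split_raw48(example)
print(f"raw=0x{example:012X} errors={errors} count={count}")
```

A tool that shows only the lower 32 bits in decimal would display 7904688 here and hide the 100 entirely, which is why the hexadecimal report is worth asking for.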
Kilobyte
axelkloth
Posts: 43
Registered: ‎10-02-2009
0

Re: Further update to the 7200.11 issues

fzabkar: I agree that the results from SMART don't jibe, not even across the record. However, there is clearly something wrong with the disks. I got at most 200 byte/s to and from the disks using whatever metrics I chose. So while SMART data may be wrong or misinterpreted, the disks are beyond disastrous. The disks and the computer are back with their owner, and knowing his temper, he took a sledge hammer to the entire thing. So in short, it's highly unlikely that anyone will get anything off any part of the assembly - and to some degree I can't fault him for that. I will - however - get HD Sentinel in case a Seagate disk comes across my desk again. I have spent 4 hours for no pay whatsoever. I can buy several dozen terabytes worth of disks from other manufacturers for that.

Kilobyte
axelkloth
Posts: 43
Registered: ‎10-02-2009
0

Re: Further update to the 7200.11 issues

Actually, I have to agree with you. It is not really believable that the seek error rate is lower than the read error rate - unless per seek many bits go bad without being corrected. I think that this is another issue of firmware doing something wrong. Clearly, Seagate can't even collect statistics. And wrt ECC errors corrected versus read error rate, unless things get counted in BCD it's impossible that the lower seven digits match and the higher ones don't. The difference is not a power of two. And then again, even if it's true, it would result in close to 160 read errors per second. I know that my friend did video cutting and stuff like that, but that number is awfully high. These numbers don't go together, and I don't think it's the software that reads and interprets the SMART data. Water under the bridge.
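
The digit comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic on the two raw values quoted in this thread; the variable names are made up for illustration.

```python
# Raw values as quoted earlier in the thread.
read_error_rate = 1037904688
hardware_ecc_recovered = 7904688

# Do the lowest seven decimal digits match?
same_low_digits = read_error_rate % 10**7 == hardware_ecc_recovered % 10**7

# Is the difference a power of two (i.e. a single extra binary bit)?
difference = read_error_rate - hardware_ecc_recovered
is_power_of_two = difference > 0 and difference & (difference - 1) == 0

print("low seven digits match:", same_low_digits)
print("difference:", difference)
print("difference is a power of two:", is_power_of_two)
print("hex:", hex(read_error_rate), hex(hardware_ecc_recovered))
```

The difference is a round decimal number rather than a single binary bit, which is what makes the match look odd when both values are read as plain binary counters.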

Petabyte
HughR
Posts: 421
Registered: ‎01-01-2009
0

Re: Further update to the 7200.11 issues

 


axelkloth wrote:

 Water under the bridge.


 

Right.

 

When doing recovery, you've got to decide how much effort to put in and when to stop.

 

I take a cautious approach: don't jump to conclusions, don't burn any bridges.

 

As fzabkar says, the raw SMART numbers don't mean what they appear to mean.

 

The best way of recovering, in many cases, is as fzabkar said in a still earlier message: clone the disks and try to recover from the clone.

  1. The wear and tear is then on the replica, and so won't further damage the precious original.
  2. sequential copying (for cloning) is the fastest and least stressful way of accessing a drive.
  3. any logical problems with the filesystem or striping can be dealt with on the replica (possibly accident prone!).
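
As a rough illustration of the sequential cloning idea, here is a minimal Python sketch that copies a source to a destination block by block and zero-fills any block that raises a read error, so the copy keeps going. It is a simplified stand-in for what dedicated tools such as GNU ddrescue do far more carefully; the function name, block size, and error handling are assumptions for the example, and it presumes the source size can be found by seeking to the end.

```python
import os


def clone_sequential(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst in order; zero-fill blocks that fail to read.

    Returns the list of byte offsets whose blocks could not be read.
    """
    bad_offsets = []
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)   # total bytes to copy
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want)
            except OSError:
                # Unreadable block: note it and substitute zeros.
                bad_offsets.append(offset)
                data = b"\x00" * want
            dst.seek(offset)              # keep the clone aligned
            dst.write(data)
            offset += want
    return bad_offsets
```

The returned offsets tell you which regions of the clone are placeholders, so a later pass can retry just those blocks instead of rereading the whole original.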

When things like this occur, one's first reaction may be dangerous.  Pause and think first.  Preferably with the machine in a rest state of some kind.

 

Perhaps too late for this system: you (plural) seem to have given up.

Kilobyte
axelkloth
Posts: 43
Registered: ‎10-02-2009
0

Re: Further update to the 7200.11 issues

Hugh: I agree, that is what I should have done. However, believe me, when you want to do a sector by sector copy, and all you get is ~200 byte/s, it's easy to get frustrated on a 1 TB disk that is nearly full. The system is out of my hands, and if my friend had not taken a sledge hammer to it, I would have within the next few hours. Long story short, whether the SMART data was interpreted correctly or not (and to me, it seems as if it was not), the disks were sufficiently uncooperative that I really could not get anything off of them. Read errors so frequent that the net data rate is at or around 200 byte/s make it really no fun. I wonder if the real cause was ever found for these disks to be so iffy.
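
For a sense of scale, a back-of-the-envelope estimate in Python, assuming 1 TB means 10^12 bytes, the whole disk has to be read, and the rate stays at a constant 200 byte/s (all simplifications for the sake of the arithmetic):

```python
CAPACITY_BYTES = 10**12          # assumed: 1 TB, decimal
RATE_BYTES_PER_S = 200           # rate reported in the post
SECONDS_PER_YEAR = 365.25 * 24 * 3600

seconds = CAPACITY_BYTES / RATE_BYTES_PER_S
years = seconds / SECONDS_PER_YEAR

print(f"{seconds:.3g} seconds, or about {years:.0f} years")
```

At that rate a full sector-by-sector copy would take on the order of a century and a half, so a straight clone was never going to finish.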

Visitor
Makinbacon
Posts: 1
Registered: ‎06-09-2011
0

Re: RAID problems with 7200.11 drives: unified thread

I have been researching a RAID-5 setup for a media system for a while, as my current motherboard does not support RAID-5. After my research I have discovered that this thread may get deleted after the following. Starting with Western Digital, I believe, there has been an implementation of a so-called fix named TLER (Time-Limited Error Recovery); Seagate calls theirs ERC (Error Recovery Control), and Samsung calls theirs CCTL (Command Completion Time Limit).  This so-called fix (as said by the Wiki) was to make drives play nice in a RAID system. I have always bought Seagate; it has been one of the most reliable drives I have used, though I may just be lucky. In my opinion this is a marketing scam to mark up drives in price, separating home and business. If you want a drive to play nice in any redundant RAID such as RAID-5 or RAID-1, you have to buy enterprise drives, which come with this error recovery limit. Non-enterprise drives will work for a short time, but then a drive will fall out of the RAID and some controllers will claim the drive is faulty when it's not. So if you have this issue, get enterprise drives or do a RAID-0. I feel stuck, as I wanted to do a redundant array for a media server but I cannot. I miss the days when new bug fixes didn't cause separated sales and markups for intentional gain... For more info on TLER: http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery
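
The drop-out mechanism described above can be sketched as a simple comparison in Python. The function name and all the timing figures below are illustrative assumptions, not measurements from any particular drive or controller.

```python
def drive_dropped(recovery_seconds, controller_timeout_seconds):
    """Return True if the controller would give up on the drive.

    Simplified model: the controller drops a drive that stays silent
    longer than its own timeout while the drive retries a bad sector.
    """
    return recovery_seconds > controller_timeout_seconds


# Hypothetical figures, chosen only for illustration.
CONTROLLER_TIMEOUT = 8.0     # seconds the controller waits
DESKTOP_RECOVERY = 30.0      # drive keeps retrying with no limit set
LIMITED_RECOVERY = 7.0       # drive with a recovery time limit

print("no limit:", drive_dropped(DESKTOP_RECOVERY, CONTROLLER_TIMEOUT))
print("with limit:", drive_dropped(LIMITED_RECOVERY, CONTROLLER_TIMEOUT))
```

In this model the drive itself is not faulty in either case; the only difference is whether it reports the error back before the controller loses patience.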

I know this thread is old, but I posted for future reference. Seagate knows of this, as do other manufacturers. Don't blame support for problems: they read their info from a screen, and all they know is what a company lets them know. Unless they are uber support and research issues like I do.

Kilobyte
axelkloth
Posts: 43
Registered: ‎10-02-2009
0

Re: RAID problems with 7200.11 drives: unified thread

Makinbacon: I think that the timeout had something to do with the fact that the drives did not play nice in RAIDs. However, the same applied to the "enterprise grade" ES.2 of the same generation. Therefore, while Seagate may have tried to distinguish home use from enterprise use, they failed in doing so. On top of that, these were not the only issues the drives had. While mine failed to initialize and therefore I never lost data, a friend of mine was not so lucky. His set of three failed catastrophically with the "click of death", with one disk for the OS and a striped set for user data, all of them full to the brim. Consequently, I sort of doubt that the increase of a timeout value alone would fix their behavior. But even if it did, I am not sure I'd go back to Seagate. We now have close to 200 non-Seagate disks running, with no issues whatsoever, 24x7, some of them for over 18 months. One shows some weakness in its SMART record, but no operational failure.

Visitor
daniel_k
Posts: 1
Registered: ‎08-29-2012
0

Re: RAID problems with 7200.11 drives: unified thread

After years, the issue was finally solved (at least for me), with the new 11.5.x.x Intel Rapid Storage Technology drivers.

 

The following symptoms no longer occur:

 

- System freezes momentarily

- Command Timeout SMART attribute increasing

- iaStor Event ID 9 errors in Event Viewer

- RAID array suddenly disappears