Kilobyte
cowtub
Posts: 14
Registered: 01-23-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

The thing is, the raw value is vendor-specific (and quite possibly model-specific!). As the Wikipedia page on SMART says: "Each attribute has a raw value, whose meaning is entirely up to the drive manufacturer (but often {my emphasis} corresponds to counts or a physical unit, such as degrees Celsius or seconds)". Comparing raw SMART attribute values between different vendors/models isn't a valid comparison, unless one KNOWS what the raw value actually represents. One manufacturer may choose a raw value for 'Seek Error Rate' to reflect the total number of seek errors, whilst another might use 0 to signify that it's within expected bounds for the number of I/O operations so far performed by the drive.

 

Without specific design/firmware information, pretty much the only valid way to interpret SMART data is to compare the value with the threshold. If the value of a given SMART attribute is lower than its threshold, there's a problem. If not, the drive believes itself to be healthy.
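
As a rough illustration of that value-versus-threshold rule, here is a minimal Python sketch. It assumes the ten-column attribute layout that smartctl prints (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value). The names SmartAttribute, parse_attribute_line and below_threshold are made up for this example, and the sample line is the Reallocated_Sector_Ct line posted later in this thread.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int       # normalized value reported by the drive
    worst: int
    threshold: int
    raw: str         # vendor-specific, so it is kept as text on purpose


def parse_attribute_line(line: str) -> SmartAttribute:
    """Parse one attribute row, assuming a ten-column smartctl-style layout."""
    fields = line.split(None, 9)
    if len(fields) < 10:
        raise ValueError(f"unexpected attribute line: {line!r}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        threshold=int(fields[5]),
        raw=fields[9].strip(),
    )


def below_threshold(attr: SmartAttribute) -> bool:
    """Rule from the post: a value lower than its threshold means a problem.

    Some tools also treat a value equal to the threshold as failed;
    switch to <= if you prefer that reading.
    """
    return attr.value < attr.threshold


sample = "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0"
attr = parse_attribute_line(sample)
print(attr.name, "problem" if below_threshold(attr) else "looks healthy")
```

Note that the raw column is deliberately never interpreted here, since its meaning is up to the manufacturer.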

Visitor
mykey2k
Posts: 2
Registered: 01-26-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

Haven't been here for a while, but I've been seeing the emails showing that people have made this topic popular, so here I am to bring closure to this...

 

Got a replacement drive from Seagate... it's a recertified product but I checked the warranty online for the SN and it is good until the end of 2013.

 

When I got the drive I started repopulating all the data I had combined into one location and ironically filled the drive -- so I guess that was a good test, since the first drive started spitting out errors within 2 hours.

 

Since then (it's been about a month, I think) I have only seen 2 reallocated blocks according to SMART.

 

When I returned my old drive, its reallocated count had made it up to the 5000-6000 range.

 

Thanks, though, to those who have pointed out that WDC has a 1.5 TB drive out now. I'll have to consider that since, though Seagate has resolved this problem, I don't think I can trust them with my data again, especially since the amount at stake is so huge now.... If I lost 100 GB, no biggie, but 1500-2000 GB... now we're gonna have words.

 

-m

 

 

Petabyte
HughR
Posts: 421
Registered: 01-01-2009

Re: Reallocated Sector Count increasing (ST31500341AS )


HughR wrote:

I've got a ST31500341AS drive with firmware SD1A.  The serial number checker says it doesn't need SD1B.

 

...  the raw number of Hardware_ECC_Recovered is 170,255,562.

 

The same command on a Hitachi Deskstar T7K500 shows no line with ID 195.

 ...

I strongly suspect that  170255562 is not the number of ECC failures.  Google shows some hints that others doubt these numbers (certainly not conclusive).

...

It turns out that this isn't a simple counter of errors.  According to this message, at least some error counts have a different encoding

  http://forums.seagate.com/stx/board/message?board.id=ata_drives&message.id=10709#M10709

 

If the encoding is similar, the bottom 32 bits are a count of reads and the top 16 bits are the actual error count.  So my count would mean no errors.
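
To make that arithmetic concrete, here is a small Python sketch of the split. It assumes the encoding guessed at above (low 32 bits = read count, bits above that = error count), which nobody in this thread has confirmed; split_raw and READ_BITS are names invented for the example.

```python
from typing import Tuple

READ_BITS = 32  # assumed split point; not confirmed by any vendor documentation


def split_raw(raw: int, low_bits: int = READ_BITS) -> Tuple[int, int]:
    """Return (error_count, read_count) under the assumed encoding."""
    errors = raw >> low_bits                # same as raw // 2**low_bits
    reads = raw & ((1 << low_bits) - 1)     # same as raw % 2**low_bits
    return errors, reads


errors, reads = split_raw(170255562)
print("errors:", errors, "reads:", reads)
```

With the raw value quoted above, the upper field comes out as 0, because 170255562 is smaller than 2^32.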

Kilobyte
cowtub
Posts: 14
Registered: 01-23-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

Nice one, HughR.

 

The theory would seem to hold; I tend to buy pairs of discs and have a RAID mirror for the Linux system, but boot another OS without any RAID mirroring. Therefore, the first drive tends to perform slightly more reads than the second, and sure enough, the ECC Recovered value is higher on the first drive of two pairs of otherwise identical Seagate discs.

 

This perfectly illustrates my earlier point about no guarantees that raw SMART attributes can be compared between manufacturers, or indeed between models.

Byte
ChicagoLawyer
Posts: 6
Registered: 01-03-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

So, if that is correct, what I need to do is convert to binary, take top 16 bits, and convert back to decimal to get actual error rate?

 

I still think something is going on.  Will try that and see. 

Petabyte
HughR
Posts: 421
Registered: 01-01-2009

Re: Reallocated Sector Count increasing (ST31500341AS )


ChicagoLawyer wrote:

So, if that is correct, what I need to do is convert to binary, take top 16 bits, and convert back to decimal to get actual error rate?

 

I still think something is going on.  Will try that and see. 


We don't know (1) if this is correct (it does seem likely) and (2) if the division point is the 32-bit mark.

 

But if it is, the easiest thing is to divide the count by 2^32 and take the integer part of the result. Since none of the numbers reported has been over 4294967296 (that is, 2^32), if I remember correctly, I think that you will see 0 as the result.

 

Just for fun, someone could drive the read count up until it overflowed.  That would show the actual modulus.  I don't actually know what counts as a read.  Does it have to come from the platter or are transfers from the buffer counted?  Does a multi-block read count as one or as one for each block?  I'm too lazy to do that research at this time.

Byte
rardin
Posts: 8
Registered: 01-27-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

It sounds like we may be on the way to understanding the high error counts.  But this doesn't explain the reallocated sectors.

Regular Visitor
olafd
Posts: 1
Registered: 04-06-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

i have 5 of these drives in a 5-bay NAS running as RAID5 array.

tray 1 has currently 25 reallocated sectors

tray 2 has 1 reallocated sector (just noticed that now, 2 hours ago it was still 0)

tray 4 has 1 reallocated sector

 

the other 2 drives show 0 reallocated sectors.

 

total hours to date are about 970.

 

while searching the net for possible causes i found this kinda funny clip:

http://channelsun.sun.com/video/shouting+in+the+datacenter/6160269001

showing how vibration can affect drive performance.

 

the article that linked to the video actually mentioned that vibrations might be the cause of failing raid disks.

it seems that the 1.5TB drives have such a high data density that even light vibrations may cause the drive head to fail to properly position itself sometimes, leading to time-outs which can cause the RAID system to drop the disk.

if that is really the case, i wonder if a drive that can't cope with the vibration it's exposed to might just give up on sectors and keep reallocating them...

 

the 1 TB server-rated drives (ST31000340NS) are supposed to handle these vibrations much more gracefully and are hence better options for RAID systems. but ditching my 5 current 1.5 TB drives for lower-capacity 1 TB drives would really hurt, not to mention the sunk cost... :(

 

the actual article is here (sorry, it's in German):

http://www.heise.de/newsticker/Ausfaelle-bei-Seagate-Festplatten-durch-Firmware-Probleme-Update--/meldung/121822 

Byte
rardin
Posts: 8
Registered: 01-27-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

It's possible that vibration may cause latency issues, but I'm more skeptical about it being the source of the reallocated sector problem.  I have two ST31000333AS drives that have non-zero Reallocated_Sector_Ct.  One of the drives was purchased in retail packaging, the other in OEM packaging.  The retail drive was installed in a fanless external case, so the drive itself would've been the source of any vibration.  The other ST31000333AS was installed as the lone hard drive in a Power Mac which seems to be a relatively vibration-free environment, as well.

 

I'm still testing the 1.5 TB WD drives I mentioned earlier.  It appears that the story won't be perfect in this case, either, but probably much better than my experience with the ST31500341AS.  I hope to have more details to pass along next week.

Byte
Jay.1
Posts: 14
Registered: 04-06-2009

Re: Reallocated Sector Count increasing (ST31500341AS )

Should I be worried?  Initially I had 2 pending bad sectors:

 

 

ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           2
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           2

 

SeaTools failed, too.  Is it normal for SeaTools to fail with bad sectors, or does it give some kind of helpful error message when it encounters them?

 

 

Sequential Test 55 % complete on drive /dev/sg0
Sequential Test 60 % complete on drive /dev/sg0
VERIFY failed on block 199871488 Sense data = 11/00/00
TEST FAILED at block 134574880 on drive /dev/sg0

 


"Defective drive" and "Corrupted format" errors (sometimes called 03/31 errors) can often be repaired with a data-destructive zero fill data pattern or a low level format. Current disk drives contain thousands of spare sectors which are automatically reallocated if the drive senses difficulty reading or writing. Since SeaTools is read-only (data safe), occasionally a problem sector that has not reallocated to a spare sector can be forced to do so by writing to that sector. Spare sector reallocation is a normal intelligent drive operation.

 

Then, after I completely wiped the drive (writing to every sector), the 2 pending sectors disappeared, and the reallocated count did not go up.  Now SeaTools passes.  Does this just mean the drive decided those sectors were OK after all?  Do I have anything to worry about?

ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033 100   100   036    Pre-fail Always  -           0
197 Current_Pending_Sector  0x0012 100   100   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010 100   100   000    Old_age  Offline -           0
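
To compare the two readings side by side, a sketch like the following could pull the raw counts out of the pasted lines. It assumes the raw value is the last whitespace-separated field and is a plain integer, which holds for the lines in this post but is not guaranteed for every attribute or vendor; raw_counts is a name invented here.

```python
def raw_counts(report: str) -> dict:
    """Map attribute name -> raw value (last column), integer raw values only."""
    counts = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[-1].isdigit():
            counts[fields[1]] = int(fields[-1])
    return counts


before = """\
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 2
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 2
"""
after = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
"""

old, new = raw_counts(before), raw_counts(after)
for name in sorted(new):
    # 'n/a' marks attributes that were not in the first paste.
    print(f"{name}: {old.get(name, 'n/a')} -> {new[name]}")
```

This only lines the numbers up; whether a pending count dropping to 0 with no new reallocations is reassuring is still a judgment call.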

 

Message Edited by Jay.1 on 04-10-2009 07:44 AM