Comment 167 for bug 438136

cowbutt (cowbutt6) wrote:

(In reply to comment #1)

> The reason I picked log2() here is simply that we do want to allow more bad
> sectors on bigger drives than on small ones. But a linearly related threshold
> seemed to increase too quickly, so the next choice was logarithmic.
>
> Do you have any empirical example where the current thresholds do not work as
> they should?

According to http://www.seagate.com/ww/v/index.jsp?locale=en-US&name=SeaTools_Error_Codes_-_Seagate_Technology&vgnextoid=d173781e73d5d010VgnVCM100000dd04090aRCRD (which I first read about 18 months ago, when 1.5TB drives were brand new), "Current disk drives contain *thousands* [my emphasis] of spare sectors which are automatically reallocated if the drive senses difficulty reading or writing". I therefore believe your heuristic is off by between one and two orders of magnitude: it allows only 30 bad sectors on a 1TB drive, whereas Seagate's article implies such a drive has at least 2000 spare sectors (and maybe more), of which 30 is only 1.5%.
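For concreteness, here is a minimal Python sketch of a log2-scaled threshold of the kind described above. The exact formula (the factor of 3, capacity measured in GB) is my guess, reverse-engineered from the 30-sectors-on-1TB figure; the real code may well differ.

    import math

    def bad_sector_threshold(capacity_gb: int) -> int:
        # Hypothetical log2-based threshold, reconstructed from the
        # "30 bad sectors on a 1TB drive" figure quoted above; the
        # actual formula in the code under discussion may differ.
        return round(3 * math.log2(capacity_gb))

    # 3 * log2(1000) ~= 29.9, i.e. roughly 30 sectors for a 1TB drive,
    # versus the 2000+ spare sectors Seagate's article implies.
    print(bad_sector_threshold(1000))   # -> 30
    print(bad_sector_threshold(1500))   # -> 32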

As you say, though, this is highly manufacturer- and model-dependent; Seagate's drives might be designed with far more spare sectors than other manufacturers' drives. The only sure-fire way to interpret the SMART attributes is to compare the cooked value with the vendor-set threshold for that attribute.
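To illustrate what I mean, here is a rough Python sketch that compares the cooked (normalized) value of each attribute against its vendor threshold by parsing smartctl -A output. It assumes smartctl's usual ATA attribute table layout (VALUE in the fourth column, THRESH in the sixth); parsing human-readable output like this is fragile, so treat it as a sketch only.

    import subprocess

    def failing_attributes(device):
        # Yield (id, name, value, thresh) for attributes whose cooked
        # (normalized) value has fallen to or below the vendor threshold.
        out = subprocess.run(["smartctl", "-A", device],
                             capture_output=True, text=True).stdout
        for line in out.splitlines():
            fields = line.split()
            # Attribute rows start with a numeric ID; this skips the
            # preamble and the column header line.
            if len(fields) >= 6 and fields[0].isdigit():
                attr_id, name = int(fields[0]), fields[1]
                value, thresh = int(fields[3]), int(fields[5])
                if thresh > 0 and value <= thresh:
                    yield attr_id, name, value, thresh

    for attr in failing_attributes("/dev/sda"):
        print("FAILING:", attr)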

If you insist on doing something with the raw reallocated sector count attribute, I believe it would be far more useful to alert when it changes at all, or when it grows by a large number of sectors in a short period of time.
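Something along these lines is what I have in mind: persist the raw value of attribute 5 between checks and alert on any increase, or on a fast increase. The state file path and the "large number in a short period" rate below are placeholders I made up for the sketch, not values anyone has agreed on.

    import json, os, time

    STATE_FILE = "/var/lib/smart-watch/realloc.json"  # placeholder path
    RATE_PER_DAY = 10   # placeholder for "a large number in a short period"
    DAY = 86400

    def check_reallocated(device, raw_now):
        # raw_now is the current raw value of attribute 5, obtained e.g.
        # by parsing smartctl output as in the sketch above.
        state = {}
        if os.path.exists(STATE_FILE):
            with open(STATE_FILE) as f:
                state = json.load(f)
        prev = state.get(device)
        now = time.time()
        if prev is not None:
            delta = raw_now - prev["raw"]
            elapsed = max(now - prev["time"], 1.0)
            if delta > 0:
                print("%s: reallocated sector count grew by %d" % (device, delta))
            if delta / elapsed * DAY > RATE_PER_DAY:
                print("%s: growing faster than %d sectors/day" % (device, RATE_PER_DAY))
        state[device] = {"raw": raw_now, "time": now}
        os.makedirs(os.path.dirname(STATE_FILE), exist_ok=True)
        with open(STATE_FILE, "w") as f:
            json.dump(state, f)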