when to decommission HDD?

VGER
Posts: 31
Joined: 2010.08.18. 17:35


Post by VGER »

Hello,

back in the "old days", when you first noticed a single defective sector on an HDD, you did your best to replace the drive as soon as possible. The reasoning was that a single defect on the surface could well mean loose particles inside the case, bouncing around and generating a lot more defects very soon.

Is this rationale still valid with today's disks? For example, I have here a standard SATA disk that reported one weak sector via SMART data. HD Sentinel promptly notified me about the issue, whereafter I took the disk offline and performed a "Surface Test". The test returned the "pending sectors" count back to zero.

What do you think: Is my instinct to still discontinue using the disk justified, or am I overreacting?

Or to put it differently: Did you ever continue using a drive in this condition, and what were your experiences from it?

Regards
hdsentinel
Site Admin
Posts: 3019
Joined: 2008.07.27. 17:00
Location: Hungary

Re: when to decommission HDD?

Post by hdsentinel »

Generally yes: mechanical damage can cause more and more problems over time.
However, a single problem does not necessarily mean that replacement is required. A small number of issues can be acceptable, as described in Support -> Frequently Asked Questions:
https://www.hdsentinel.com/faq.php#health
( I have bad sectors and my disk health is 90%. Do I need to worry or ask for replacement drive? )

With a higher number of bad sectors (hundreds, thousands or even more) then yes, there is a very good chance that the surface will degrade further, resulting in an even lower Health %.

We all need to consider
- the value of our data
- the current Health % (the number of problems)
- the age/power on time of the drive, as it may reach the end of the designed lifetime

Generally, the Estimated remaining lifetime (which is designed to take both the status and the power on time into account) can help to estimate when we should plan to replace the drive in "normal" (not mission critical) conditions, assuming that there is a backup and that some minor downtime for the exchange / restore is acceptable.

In mission critical environments, yes, I can imagine that the drive is replaced immediately when the first problem is detected (some of our clients do that), if the value of the data is high and/or unplanned downtime is not acceptable.
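
As a rough illustration only, the considerations above could be written down as a small decision helper. This is a sketch, not how Hard Disk Sentinel calculates anything: the names `DriveStatus` and `replacement_advice`, the 5-year designed lifetime and the exact cut-off of 100 bad sectors are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class DriveStatus:
    health_percent: float   # Health % as shown by the monitoring tool
    bad_sectors: int        # number of bad (reallocated) sectors
    power_on_days: int      # power on time of the drive
    mission_critical: bool  # is unplanned downtime unacceptable?
    has_backup: bool        # is the data backed up elsewhere?


def replacement_advice(drive, designed_lifetime_days=5 * 365, many_bad_sectors=100):
    """Return a coarse recommendation (hypothetical thresholds)."""
    has_problems = drive.bad_sectors > 0 or drive.health_percent < 100

    # Mission critical: replace as soon as the first problem is detected.
    if drive.mission_critical and has_problems:
        return "replace now"
    # Hundreds of bad sectors or more: the surface is likely to degrade further.
    if drive.bad_sectors >= many_bad_sectors:
        return "replace now"
    # Valuable data without a backup: secure the data first.
    if has_problems and not drive.has_backup:
        return "back up, then plan replacement"
    # The drive may be reaching the end of its designed lifetime.
    if drive.power_on_days >= designed_lifetime_days:
        return "plan replacement"
    # A small number of issues can be acceptable: keep watching.
    if has_problems:
        return "monitor"
    return "keep"
```

For example, a backed-up, non-critical drive with 90% health and 3 bad sectors would get "monitor", while the same drive in a mission critical system would get "replace now".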

However, weak/pending sector(s) are usually not as serious - and usually not really related to the disk drive itself: as described on
https://www.hdsentinel.com/hard_disk_ca ... ectors.php
these may be more related to cables / connections, or even caused by a sudden power failure / reset. In this case, the weak/pending sector(s) can be repaired: usually easily by the Disk menu -> Surface test -> Disk Repair, or more quickly by the Disk menu -> Surface test -> Quick Fix test. The drive can then continue working without issues, assuming that the original problem has been eliminated. Otherwise a bad cable/connection/power supply can cause more and more weak sectors in the future, even on a new/replacement disk drive.
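
To make the weak vs. bad distinction concrete, here is a minimal sketch that compares two snapshots of the SMART counters, taken before and after a repair or surface test. The function name `classify_after_repair` and the dictionary keys are assumptions for this example; the two counters correspond to what is commonly reported as SMART attribute 197 (pending sectors) and attribute 5 (reallocated sectors), and reading them from the drive is left out here.

```python
def classify_after_repair(before, after):
    """Compare SMART raw counts before/after a repair or surface test.

    Both arguments are dicts with the (hypothetical) keys
    'pending' and 'reallocated'.
    """
    cleared = before["pending"] - after["pending"]
    newly_reallocated = after["reallocated"] - before["reallocated"]

    # Sectors are still waiting to be tested or repaired.
    if after["pending"] > 0:
        return "still pending"
    # The drive had to swap sectors for spares: real bad sectors.
    if newly_reallocated > 0:
        return "bad sector(s): reallocated"
    # Pending count dropped with no reallocation: the sectors were only weak.
    if cleared > 0:
        return "weak sector(s): repaired in place"
    return "no change"
```

A drive that goes from 1 pending sector to 0 with no new reallocations falls into the "weak sector(s): repaired in place" case, which is the harmless outcome described above.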
VGER
Posts: 31
Joined: 2010.08.18. 17:35

Re: when to decommission HDD?

Post by VGER »

Hello,

That reference about the "weak sector" workings was very interesting. I didn't know that a "weak sector" was not the same as a "bad sector" (which would lead to an inevitable "reallocation event".)

This appears to be my case exactly: I had a report of a single weak sector (SMART page in HDS: 1 "pending sector"). An HDS "surface test" restored the disk health to 100%, still with zero "reallocation events".

At about the same time, I got an alert about another drive, but in this case not via SMART, but Windows Event Log "block error on disk x" (Event ID 7, source "Disk".) At the time, the computer was running with its case open and cables running in all directions. It's easy to believe that some poorly shielded cable just had a random data error.

The disks are all part of a RAID 6 array (i.e. two drive failures are tolerated). All things considered, I think I can afford the luxury of waiting to see whether the error recurs before replacing the drive.

Thanks for the tips!

Regards