HDSentinel seems overly cautious on high endurance SSDs

nabsltd
Posts: 4
Joined: 2016.10.31. 22:38

Post by nabsltd »

I have a few 480GB Intel DC S3610 drives with about 650TB written. The Intel (now Solidigm) utility (https://www.solidigm.com/us/en/support-page/drivers-downloads/ka-00085.html) reports the drive health at around 86%, which seems reasonable, since the drive is rated at 3.7PB lifetime writes (https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ssd-dc-s3610-spec.pdf).

However, because there are 10-20 reallocated blocks on a few of the drives, HDSentinel reports the "health" at 27%. But, 15 reallocated blocks on these drives is only about 1% of the total spare blocks (which can be seen from the SMART data). With the total drive writes at about 18% of lifetime, plus the reallocated blocks well below 18% of the total, I think the overall "health" would be much closer to 80-85% than the 27% reported by HDSentinel. OTOH, one of the drives has only 2 reallocated blocks, and on that drive HDSentinel reports 86% health, even though it has around the same 650TB written.
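The arithmetic behind the "about 18% of lifetime" figure can be sketched as below. The numbers are the ones quoted in this post (650TB written, 3.7PB rated endurance). The function name is made up for illustration, and the plain linear wear model is an assumption, not necessarily how the Intel utility or HDSentinel compute health.

```python
def endurance_used_percent(written_tb: float, rated_tb: float) -> float:
    """Share of the rated write endurance consumed, in percent (linear model)."""
    if rated_tb <= 0:
        raise ValueError("rated_tb must be positive")
    return 100.0 * written_tb / rated_tb


written_tb = 650.0   # total writes reported for the drive, in TB
rated_tb = 3700.0    # 3.7PB rated lifetime writes, expressed in TB

used = endurance_used_percent(written_tb, rated_tb)
remaining = 100.0 - used
print(f"endurance used: {used:.1f}%, remaining: {remaining:.1f}%")
# prints "endurance used: 17.6%, remaining: 82.4%"
```

Under this simple model the remaining endurance lands in the 80-85% range mentioned above.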

When calculating "health", does HDSentinel use the absolute value of reallocated sectors, or the relative percentage compared to the total amount of initial spare sectors?
hdsentinel
Site Admin
Posts: 3019
Joined: 2008.07.27. 17:00
Location: Hungary

Re: HDSentinel seems overly cautious on high endurance SSDs

Post by hdsentinel »

Sorry to say, but no, Hard Disk Sentinel is not overly cautious, unless you specifically configure it to work that way.

While Hard Disk Sentinel is generally designed to detect and report every single problem (precisely to draw attention to it), 15 reallocated blocks would never result in such a low Health value: 15-20 reallocated sectors (or blocks, depending on the SSD) would result in a Health % of 80% or higher.

From what you wrote, Hard Disk Sentinel was probably configured to work in a very strict way: to count and report every problem (including the reallocated sectors) so seriously that the Health % decreases this dramatically.

Please check the Configuration -> Advanced options page. Probably the Health calculation method option is now set to
Analyse data field (more strict, recommended to servers)
instead of the default
Analyse data field (default).

Please set that option back to the Analyse data field (default) setting. The Health % value will then immediately increase back to the high range. The graph at the bottom will still show the lower value today (as it is designed to show the daily lowest value), but from tomorrow it should increase as well.

The purpose of this option is EXACTLY to dramatically decrease the Health % on any (even minor) problems - so Hard Disk Sentinel probably did exactly what you configured.

As always, I suggest using the Report menu -> Send test report to developer option.
That would confirm the above and make it possible to check the actual situation. Developer reports (not images, screenshots, or txt/HTML user reports) of SSDs with
- high amount of written data in its lifetime
- lower Health % caused by wearout
- bad sectors
always give details about specific models / firmware versions, and always help further development and possible fine-tuning of the Health % calculation.


> When calculating "health", does HDSentinel use the absolute value of reallocated sectors
> or the relative percentage compared to the total amount of initial spare sectors?

This depends on lots of factors: the SSD interface/type/model/manufacturer/firmware and so on.
Some SSDs report the actual count of bad sectors, but not the spare size. Sometimes it is possible to (at least) estimate the spare size, but sometimes we have no such information.
Other types of SSDs do not count/report the actual number of bad sectors, just the percentage of the available/used spare area.
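To illustrate the three reporting situations described above, here is a minimal sketch. It is not Hard Disk Sentinel's actual code: the `SpareReport` class, its field names and the `spare_used_percent` function are hypothetical, and the spare size of 1500 in the example is only an illustrative value chosen so that 15 blocks equal 1%, matching the rough figure from the first post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpareReport:
    """Hypothetical container for what a drive exposes about reallocation."""
    reallocated_count: Optional[int] = None      # absolute count, if reported
    spare_total: Optional[int] = None            # spare size, if known or estimated
    spare_used_percent: Optional[float] = None   # percentage, if reported directly


def spare_used_percent(report: SpareReport) -> Optional[float]:
    """Return the used share of the spare area in percent, or None if unknown."""
    if report.spare_used_percent is not None:
        return report.spare_used_percent
    if report.reallocated_count is not None and report.spare_total:
        return 100.0 * report.reallocated_count / report.spare_total
    return None  # only an absolute count (or nothing) is available


# Count plus known spare size: a relative figure can be derived.
print(spare_used_percent(SpareReport(reallocated_count=15, spare_total=1500)))
# Count only: no relative figure is possible.
print(spare_used_percent(SpareReport(reallocated_count=15)))
```

The `None` case is the one where only the absolute count can be evaluated, because there is nothing to compare it against.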

This is exactly why developer reports can help: to check whether any further changes or adjustments are possible.
nabsltd
Posts: 4
Joined: 2016.10.31. 22:38

Re: HDSentinel seems overly cautious on high endurance SSDs

Post by nabsltd »

It was the "more strict" setting that caused the dramatic difference in the health report. This was a fresh install (on a new machine) of portable version 6.01, and I didn't change any settings, so it's odd that it was enabled. Still, I think that HDSentinel could benefit from a change to the way health for SSDs is calculated.

HDSentinel is absolutely the best program available for giving human-readable information about the health of a hard disk. When it shows yellow (or worse, red), the user knows to pay attention and do something fairly quickly. This is because HDSentinel understands that a few bad/relocated sectors on a spinning rust disk is a sign that the disk is starting to degrade, and once it starts, it can happen fairly quickly. And, when a sector is bad, because of the sequential nature of the storage on the disk, it can lead to nearby sectors going bad.

But, SSDs are completely different. Unlike a spinning disk, SSDs don't actually have "spare sectors". Yes, they have extra flash beyond the usable size of the drive, but that flash is used from day one, allowing the drive to spread writes across more cells to allow wear leveling to avoid one cell being written a lot more than another. And, it is completely expected that a cell will wear out and be retired. This is displayed as a "reallocated" block in SMART, but it isn't, really. It's just not using the worn out block any more.

And, unlike a spinning disk, a retired block on an SSD doesn't generally lead to nearby blocks also becoming bad. It can happen, but usually retired blocks are generally just blocks that are on the statistical lower end of the estimated number of writes. Unless the number of retired blocks is very high compared to the total number of extra blocks available when the drive was built and the total number of written blocks is low compared to the total estimated lifetime writes (which can usually be calculated by the estimated remaining life provided by SMART), the drive is functioning perfectly and should display as green, even with a more stringent "server mode" algorithm.

Although I haven't done the research into how much "spare" space spinning disks contain, I suspect the percentage is far lower than even the most consumer of SSDs. Spare blocks on an SSD range from about 3% to 20%, depending on the kind of endurance that the drive is rated for.

Last, on a spinning disk, if you are using less than the full drive by limiting the partition(s) to not use the full space, it doesn't make the drive last any longer. In fact, by using the same sectors over and over again, it might make it wear out faster. OTOH, using only 50% of an SSD for active data can increase the lifespan by a factor of 10 or more. I'm pretty sure HDSentinel doesn't take into account the total partition size on the disk when calculating SSD health.

Essentially, the 15 or so retired blocks I saw on my 480GB SSDs were of no consequence regardless of how careful the user is, while 15 bad/relocated sectors on a 480GB spinning disk should cause the kind of change in displayed health that I saw.
hdsentinel
Site Admin
Posts: 3019
Joined: 2008.07.27. 17:00
Location: Hungary

Re: HDSentinel seems overly cautious on high endurance SSDs

Post by hdsentinel »

Thanks for your message, good to hear that after the option was switched back to the default, the Health % greatly improved.
If I'm correct, you now see 80% or higher Health, which means the SSD is probably reported as GOOD (displayed as green). So it does not indicate that the drive is about to fail, but the value is still "low" enough to show that the drive had minor issues and that further degradation is possible.

Thanks for your opinion. This is exactly why I wrote that developer reports can help: to check whether any further changes or adjustments are possible for specific models.

Generally, seeing the status of
- SSDs with bad sectors
- SSDs where the usage is relatively high
always helps a lot, so I recommend using the Report menu -> Send test report to developer option; it would help to check. Thanks!

Thanks also for your kind words ;) Really appreciated ;)


> SSDs don't actually have "spare sectors".

Sorry to say, but in general this is not really true. Many SSDs still have spare sectors, and some of them even report the amount: for some SSDs we may notice

Unused Reserved Block Count (SSD Total)

or a similar attribute on the S.M.A.R.T. page. And (similarly to hard disks) there are not too many of these, usually in the range of some hundreds to some thousands (I mean very few compared to the capacity of the SSD), so they can be used up relatively quickly.


> Yes, they have extra flash beyond the usable size of the drive, but that flash is used from day one,
> allowing the drive to spread writes across more cells to allow wear leveling to avoid one cell being written a lot more than another.

Yes, that's true.


> And, it is completely expected that a cell will wear out and be retired.

Yes, that's true. But if all cells wore out at the same rate, we'd see SSDs with perfect status, and then many sectors would fail at the same time (at the end of the lifetime).
But as you can see with your SSDs, some sectors can fail while MOST of the SSD is still working perfectly.


> This is displayed as a "reallocated" block in SMART, but it isn't,

Sorry, but it IS a reallocated block. This is why the name of the attribute is "Reallocated", "Reassigned" or similar, which means that the spare area is now being used. This is different from the over-provisioning area you may mean.


> a retired block on an SSD doesn't generally lead to nearby blocks also becoming bad.

Sorry, but this is not really true. Generally, on SSDs sectors are grouped into bigger units (blocks/pages), and if one sector fails, a complete block (so generally a bigger unit) may fail too, and the amount of data that may be damaged/lost is generally higher. This is why we can't simply ignore reallocated sectors on SSDs.


> It can happen, but usually retired blocks are generally just blocks that are on the statistical lower end of the estimated number of writes.

No... A block can fail even if it has not yet reached the end of its lifetime (no need to wait 1000's of write cycles for that).


> the drive is functioning perfectly and should display as green, even with a more stringent "server mode" algorithm.

Maybe yes, the "server mode" algorithm could be changed to show a slightly "higher" Health, but generally that "server mode" setting is designed exactly to treat any (even minor) problem seriously in order to draw attention to it.


> Although I haven't done the research into how much "spare" space spinning disks contain,
> I suspect the percentage is far lower than even the most consumer of SSDs.

No... The spare area is generally similarly low on hard disks and SSDs, compared to the total capacity of the drive itself.


> Spare blocks on an SSD range from about 3% to 20%, depending on the kind of endurance that the drive is rated for.

Sorry, but no, this is not true...


> I'm pretty sure HDSentinel doesn't take into account the total partition size on the disk when calculating SSD health.

You're absolutely correct: the partition size is NEVER ever used to calculate / determine the SSD health.
That would be a TERRIBLE idea as then re-partitioning would cause different Health % display for no reason.

Generally, the Health % of any SSD is determined by two different factors:

- amount of problems (for example but not limited to bad sectors)
- wearout of the memory cells (calculated by the SSD itself, not by Hard Disk Sentinel)

Generally, on a hard disk drive you can create a partition to "avoid" using a problematic area, for example sectors near an original bad sector (I agree that this is not the best option in all cases, especially if the drive has lots of problems and so the Health % is generally low, but that is a different question). But with an SSD, you can't control WHERE the data is actually written. So even if you know that the SSD has bad sectors, you simply can't make a partition to avoid using a particular memory cell / block.

Even if the SSD were otherwise perfect (assume there is no bad sector at all), a Health % decrease is still normal, caused by the wearout of the memory cells. This wearout is NOT calculated by Hard Disk Sentinel but by the SSD itself, based on utilization (mostly the amount of data written, but the "type" of writes is also important: frequent modification of very small files causes more wear due to the more frequent program/erase cycles). Hard Disk Sentinel "just" reads and displays the Health and mentions that in the text description, as explained at

https://www.hdsentinel.com/kb/category/16/solid-state-drives-ssds/why-my-ssd-shows-98-health-if-no-problems-reported.html


A side note: on the S.M.A.R.T. page, you can use the Offset value of the corresponding attribute to "acknowledge" the existing problems and be notified about new problems only.
The page
https://www.hdsentinel.com/faq_repair_hard_disk_drive.php
explains that for a hard disk drive, but it is generally similar for the SSD too.
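The idea of acknowledging known problems with an offset can be sketched like this. It is only an illustration, not Hard Disk Sentinel's implementation: the function names are hypothetical, and it assumes the offset is entered as a negative number that is added to the raw attribute value.

```python
def effective_count(raw_count: int, offset: int) -> int:
    """Raw attribute value adjusted by a user-set offset, never below zero."""
    return max(0, raw_count + offset)


def has_new_problems(raw_count: int, offset: int) -> bool:
    """True if problems remain after the acknowledged ones are subtracted."""
    return effective_count(raw_count, offset) > 0


# 15 reallocated blocks acknowledged with an offset of -15: nothing new to report.
print(has_new_problems(15, -15))
# Two more blocks are reallocated later: the count exceeds the acknowledged value.
print(has_new_problems(17, -15))
```

With this approach the acknowledged blocks no longer trigger notifications, but any further reallocation still does.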