HDD Surface Test

smallisland
Posts: 4
Joined: 2013.01.30. 21:00

HDD Surface Test

Post by smallisland »

Can you provide a little more information regarding HDsentinel and the different surface tests? I like to verify new disks and recertify used disks before placing them in operation. I was using a utility that came with Soft Raid on a Mac before recently purchasing a few HD sentinel licenses so I could have a Windows-based utility.
According to Soft Raid, that utility writes random patterns on the disk surface and then performs a read. Each pass consists of a write/read operation. I assume in a sequential manner but they do not specify and there is no map. The last pass always writes zeros so that any future recovery attempt will supposedly be easier to accomplish. For new disks I always ran 3 passes and for old disks I ran 8 because that is what they suggested in their documentation.
I am a little confused about which write-read test I should use in HDsentinel to certify new disks and recertify old disks. I would think random patterns on all passes except the last one would be the best surface test but I have no real knowledge on the subject.
I recently ran a reinitialize test on a new disk at level 3. I thought it would make three separate write/read passes but it appeared to write to each block 3 times and read it once.
I then looked for a way to run a 3 pass write/read test where the first 2 passes wrote random data and the last one wrote zeros but I could not find a way to set that up. I just now started a 1 pass random write/read and will follow that with a one pass write/read that writes zeros.
I read through the documentation a couple of times but maybe you can explain which options are best for certifying disks and why. Any help and clarification would be appreciated.
hdsentinel
Site Admin
Posts: 3128
Joined: 2008.07.27. 17:00
Location: Hungary

Re: HDD Surface Test

Post by hdsentinel »

Thanks for your message and question.

The test you mentioned, Disk -> Surface test -> Reinitialise Disk Surface, is the best method for this purpose.
It uses a special initialisation pattern and its inverse during the initialisation passes (the number of passes is set by the "Surface reinitialisation level", exactly as you saw; the default is 3). After these passes, the test overwrites the sector with zeroes and then reads it back to verify its integrity.
The complete test performs its overwrites without using the disk cache, to make sure that every sector is really overwritten n+1 times (where n = Surface reinitialisation level) before it is read back.
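
The per-sector sequence described above can be sketched in a few lines of Python. This is only an illustration on an in-memory buffer, not Hard Disk Sentinel's code: the 0xAA fill byte, the 512-byte sector size and the `reinitialise` name are assumptions, since the actual initialisation pattern is not given here.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes
PATTERN = 0xAA     # hypothetical fill byte; the real pattern is not documented here


def reinitialise(disk: bytearray, level: int = 3) -> tuple[list[int], int]:
    """Simulate the reinitialise sequence on a buffer whose length is a
    multiple of SECTOR_SIZE.

    Each sector is overwritten `level` times with the pattern and its
    inverse in turn, then with zeroes, then read back once.
    Returns (sectors that failed verification, total sector writes).
    """
    zeros = bytes(SECTOR_SIZE)
    bad = []
    writes = 0
    for sector in range(len(disk) // SECTOR_SIZE):
        start, end = sector * SECTOR_SIZE, (sector + 1) * SECTOR_SIZE
        for n in range(level):
            # Alternate between the pattern and its bitwise inverse.
            fill = PATTERN if n % 2 == 0 else PATTERN ^ 0xFF
            disk[start:end] = bytes([fill]) * SECTOR_SIZE
            writes += 1
        disk[start:end] = zeros          # final overwrite with zeroes
        writes += 1
        if bytes(disk[start:end]) != zeros:  # single read-back per sector
            bad.append(sector)
    return bad, writes
```

With the default level of 3, every sector receives 4 writes (n+1) and 1 read.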

As described in the Help, this corresponds to the NAVSO P-5239-26 data destruction standard, which prevents any kind of data recovery.
If you prefer, you can of course increase the "Surface reinitialisation level" to overwrite 8 or even more times. Eight overwrite passes (purely for data destruction) may not be required on modern hard disks, as the level 3 reinitialisation plus the final clearing with zeroes already prevents any kind of data recovery. For older MFM hard disks you may find recommendations of as many as 35 (!) overwrite passes.

Please note that you can click any time on the disk surface map to verify the contents of any disk sector. This way you can check the original contents (which will be overwritten) and the already overwritten sectors.

As you could see, when you open Disk -> Surface test, you can choose various other tests and configure how they should work. For example, you may use Disk -> Surface test -> Write + Read test to overwrite the hard disk (first pass) and read the sectors back (second pass) to verify the surface integrity. This performs the overwrite only once, but you can enable the "Repeat test" option on the Configuration page (shown after selecting Disk -> Surface test, just before starting the test). There you can also specify the overwrite pattern to use (zero, sector numbers, random data).

Also on the Configuration page in the new window (after selecting Disk -> Surface test) you can specify how the surface should be processed: as a linear overwrite or by testing the surface in random blocks. The small images there visualize how each test would run.

Some companies prefer the following to re-certify drives:
open Disk -> Surface test, enable "Random test" on the Configuration page (in addition to the "Sequential test") and start the Reinitialise Disk Surface test.
This immediately performs a "double" test (8 overwrites in total if the "Surface reinitialisation level" is the default 3).
It also stresses the drive further, as the 2nd pass tests in random order - this verifies not only the surface but also the seek/servo elements.
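
The pass counts for this procedure follow from the n+1 rule above. A minimal sketch of the arithmetic, assuming each enabled sector order performs its own full set of overwrites and one read-back (`pass_counts` is a hypothetical helper, not a function of the program):

```python
def pass_counts(level: int, sector_orders: int) -> tuple[int, int]:
    """Return (total overwrites, total reads) per sector.

    Each enabled sector order writes `level` patterns plus one zero fill
    and then reads the sector back once.
    """
    writes = sector_orders * (level + 1)
    reads = sector_orders
    return writes, reads


print(pass_counts(3, 1))  # plain reinitialise at the default level: (4, 1)
print(pass_counts(3, 2))  # sequential + random at level 3: (8, 2)
```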

Usually drives run better after this procedure as the drive re-allocates and stabilizes all sectors (if required) and any kind of data recovery is impossible.

Hope this helps ;)

Re: HDD Surface Test

Post by smallisland »

Thank you for the clarification.

I now understand that the Disk -> Surface test -> Reinitialise Disk Surface level adjusts the number of overwrites but not the number of reads. However, a separate read is performed for each sector order test type (sequential, random, etc) along with the specified number of writes per level plus 1 for the last zero write sequence.

So in the example you gave that is used by some companies to recertify disks, 8 total write operations would be performed (6 with a special pattern and 2 with zeros) along with 2 read operations (1 for each sector order test type).

That really does explain things better than your help file.

I still see no way to perform 2 random write passes with random data, and one sequential pass with zeros using the normal write-read test because the data type selection for the write operation can only be set globally for each test. Reinitialise surface is obviously very close to the same thing without extra read passes. For the write - read tests it would be nice to be able to set up the data type for each write pattern and specify which test will occur first, second, and last. That would make customised test sequences easier without having to run separate tests although maybe that level of customization is not needed.

I suppose I could adjust the reinitialise level to 1 but repeat reinitialise disk surface 2 times with sequential and random boxes checked. Or even set the level to 1 with sequential, random, and butterfly sector order boxes checked for a total of 6 writes and 3 reads leaving the disk zeroed at the end of the operation. That would be close to what I was doing on the Mac although it sounds like your suggestion would produce a more than acceptable level of trust in any disk as well. So if I select sequential, random, and butterfly sector order tests what order are they performed in? The order listed?

When I mentioned data recovery being easier, it was suggested by the makers of Soft Raid that a zero pattern written on the whole disk surface prior to placing it in operation would make it easier for data recovery services like Drive Savers in the US to recover the data in the event of a drive failure. I understand that data previously saved on the disk would be nearly impossible to recover after your 3 pass reinitialise surface has been performed.

Thanks again for responding. I really like HDsentinel so far and I like how you appear to be constantly moving forward with improvements and updates not to mention your quick response to any questions.

Re: HDD Surface Test

Post by hdsentinel »

> I now understand that the Disk -> Surface test -> Reinitialise Disk Surface level adjusts the number of overwrites but not the number of reads.

Yes. The "Reinitialise Disk Surface" test performs one read, after all the overwrites with patterns (their number is defined by the level) plus the final overwrite with zeroes.

> However, a separate read is performed for each sector order test type (sequential, random, etc) along with the specified
> number of writes per level plus 1 for the last zero write sequence.

Yes, exactly as you wrote.

> So in the example you gave that is used by some companies to recertify disks, 8 total write operations
> would be performed (6 with a special pattern and 2 with zeros) along with 2
> read operations (1 for each sector order test type).

Yes, exactly as you wrote.


> I still see no way to perform 2 random write passes with random data, and one sequential pass
> with zeros using the normal write-read test because the data type selection for
> the write operation can only be set globally for each test.

Yes, this is true. However, you can launch the tests manually in any combination you prefer, for example:
Disk -> Surface test -> Write test, with "Repeat test" set to 2 and "Random data" selected on the Configuration page before starting the test, and then
Disk -> Surface test -> Write + Read test (to fill the disk with zeroes and then read the contents back to verify that all sectors are both readable and writable)
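
For illustration, the manual sequence above (two random-data write passes, then a zero-fill Write + Read pass) behaves roughly like this on an in-memory buffer. The function names and the 512-byte sector size are assumptions made for the sketch; it does not call Hard Disk Sentinel, and the buffer length is assumed to be a multiple of the sector size.

```python
import random

SECTOR_SIZE = 512  # assumed sector size in bytes


def write_pass(disk: bytearray, data_type: str, rng: random.Random) -> None:
    """Overwrite every sector with random data ("random") or zeroes."""
    for start in range(0, len(disk), SECTOR_SIZE):
        if data_type == "random":
            payload = bytes(rng.getrandbits(8) for _ in range(SECTOR_SIZE))
        else:
            payload = bytes(SECTOR_SIZE)
        disk[start:start + SECTOR_SIZE] = payload


def read_pass(disk: bytearray) -> list[int]:
    """Return the numbers of the sectors that do not read back as zeroes."""
    zeros = bytes(SECTOR_SIZE)
    return [
        start // SECTOR_SIZE
        for start in range(0, len(disk), SECTOR_SIZE)
        if bytes(disk[start:start + SECTOR_SIZE]) != zeros
    ]


def manual_sequence(disk: bytearray, seed: int = 0) -> list[int]:
    """Two random write passes, then a zero write and a read-back."""
    rng = random.Random(seed)
    write_pass(disk, "random", rng)  # Write test, repeat 1
    write_pass(disk, "random", rng)  # Write test, repeat 2
    write_pass(disk, "zero", rng)    # Write + Read test: write half
    return read_pass(disk)           # Write + Read test: read half
```

Unlike the reinitialise test, each step here is a pass over the whole surface rather than a sequence applied sector by sector.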

As you can see, the testing methods are really flexible to configure and run - and the "Sector order" adds another level of flexibility.

> Reinitialise surface is obviously very close to the same thing without extra read passes.
> For the write - read tests it would be nice to be able to set up the data type for each
> write pattern and specify which test will occur first, second, and last. That would
> make customised test sequences easier without having to run separate tests although
> maybe that level of customization is not needed.

Thanks for the tip!
Yes, currently you can start different tests one after another to achieve this (as mentioned previously). To be honest, I personally do not feel this customization would improve the efficiency of testing, data destruction or forcing sector reallocation in any way, so I'm not sure it is required.


> I suppose I could adjust the reinitialise level to 1 but repeat reinitialise disk surface 2 times
> with sequential and random boxes checked. Or even set the level to 1 with sequential, random,
> and butterfly sector order boxes checked for a total of 6 writes and 3 reads leaving the disk
> zeroed at the end of the operation.

Yes, this may be a very good combination!

Personally, to re-certify drives I'd prefer to increase the write passes and reduce the read passes (this is what the simple "Reinitialise disk surface" test does without any additional options), or to enable one additional "Sector order", for example the "random order" (as suggested), or even to enable all of the sequential, random and butterfly options as you wrote - that would be a very intensive testing method.

> That would be close to what I was doing on the Mac although it sounds like your suggestion would produce a more
> than acceptable level of trust in any disk as well.

Yes ;)

> So if I select sequential, random, and butterfly sector order tests what order are they performed in? The order listed?

Yes: these tests run in this order. After the _complete_ sequential test has finished, the random order test starts immediately, and then the butterfly test starts immediately after that. You will see that once the complete surface has been tested (the surface map turns green), the map switches back to white, indicating that a new pass has started. There you can see how the progress indicator (the small dark blue box) advances in random / butterfly order after the sequential test completes.
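
To make the three sector orders concrete, here is a small Python sketch of how block indexes could be visited. The butterfly order shown (alternating between the start and the end of the disk and moving inward) is a common definition and an assumption here; the exact order the program uses is not specified in this thread.

```python
import random
from typing import Iterator


def sequential_order(blocks: int) -> Iterator[int]:
    """Visit blocks from first to last."""
    yield from range(blocks)


def random_order(blocks: int, seed: int = 0) -> Iterator[int]:
    """Visit every block exactly once, in shuffled order."""
    order = list(range(blocks))
    random.Random(seed).shuffle(order)
    yield from order


def butterfly_order(blocks: int) -> Iterator[int]:
    """Alternate between the lowest and highest untested block (assumed)."""
    low, high = 0, blocks - 1
    while low <= high:
        yield low
        if high != low:
            yield high
        low += 1
        high -= 1


print(list(butterfly_order(6)))  # [0, 5, 1, 4, 2, 3]
```

The random and butterfly orders force long head movements between consecutive blocks, which is why they also exercise the seek/servo mechanism.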

> When I mentioned data recovery being easier, it was suggested by the makers of Soft Raid that
> a zero pattern written on the whole disk surface prior to placing it in operation would make
> it easier for data recovery services like Drive Savers in the US to recover the data in the event of a drive failure.

Yes, data recovery is easier that way than with old random data left on the drive.
Imagine this situation:
- there are two hard disks completely filled with data
- a RAID array is created on these drives
- the user starts to fill the RAID array. When it is half filled, the RAID array breaks.
Then the data recovery company may need to analyse the sector contents and work out which sector belongs to which file. The old (random) data stored before the RAID array was created may make this harder, as the company would not know what to search for and may even try to recover such "garbage" instead of the important data stored on the RAID.
So if the last overwrite is a complete zero fill after testing with different patterns (as Disk -> Surface test -> Reinitialise Disk Surface does), this may help.

But some other data recovery companies recommend a different technique: filling each sector with its own sector number. This is why this option is also available in Hard Disk Sentinel: you can use Disk -> Surface test -> Write test to fill all sectors with the corresponding sector number (you can check the result by clicking on the disk surface map after the write has completed).
This would help data recovery even more, as (especially with special RAID controllers) the data recovery company would not only see which sectors are important (those containing important data vs. all zeroes) but would also immediately get some help in reconstructing the RAID structure, as they would see how the sector numbers follow each other on the logical drive.
You may want to ask your data recovery company about this option - I am not sure, but it may be interesting for them too.
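
A rough Python sketch of the idea, for illustration only. The on-disk layout used here (the sector number as a 64-bit little-endian value repeated across a 512-byte sector) is an assumption; the format Hard Disk Sentinel actually writes is not described in this thread, and all function names are hypothetical.

```python
import struct
from typing import Optional

SECTOR_SIZE = 512  # assumed sector size in bytes


def sector_number_payload(sector: int) -> bytes:
    """Fill pattern for one sector: its own number, repeated (assumed layout)."""
    return struct.pack("<Q", sector) * (SECTOR_SIZE // 8)


def decode_sector_number(data: bytes) -> Optional[int]:
    """Return the stored sector number if `data` still holds the fill
    pattern, or None if it does not look like one."""
    word = data[:8]
    if len(data) == SECTOR_SIZE and data == word * (SECTOR_SIZE // 8):
        return struct.unpack("<Q", word)[0]
    return None


def sectors_with_data(image: bytes) -> list[int]:
    """List the sectors whose content differs from their own fill pattern,
    i.e. the sectors that were written after the fill."""
    return [
        n
        for n in range(len(image) // SECTOR_SIZE)
        if image[n * SECTOR_SIZE:(n + 1) * SECTOR_SIZE] != sector_number_payload(n)
    ]
```

In this sketch a sector that still decodes to a number was never overwritten by the array, and the decoded number shows where on the original disk it came from. Note that any sector consisting of one repeated 8-byte value decodes as a number, so real tools would need a stricter check.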


> I understand that data previously saved on the disk would be nearly impossible to recover after your 3 pass reinitialise surface has been performed.

Yes, this is absolutely true.


> Thanks again for responding. I really like HDsentinel so far and I like how you appear to be constantly moving forward
> with improvements and updates not to mention your quick response to any questions.

Thank you too for your kind words ;) If possible, please share your thoughts with other users on www.facebook.com/HDSentinel.

Re: HDD Surface Test

Post by smallisland »

I took your advice and started running the quick self test and extended self test along with the Disk Reinitialise routine before placing disks in service.

I have just finished 2 days of surface testing on 4 disks (Disk Reinitialise process, sequential and random sector orders). All 4 disks passed with no issues. I did not run the extended self test on one of the disks, a Seagate 1 TB drive, before running the reinitialise process. I decided to go back and run the extended self test. To my surprise it came back showing 95 weak sectors and the drive health dropped to 14%.

Why would the extended self test identify weak sectors that the reinitialise surface process could not?

I can understand the opposite happening if there was an issue with the cables or controller but I am really surprised at this result.

Any ideas?

Re: HDD Surface Test

Post by hdsentinel »

Yes, this is an interesting situation.

The Disk -> Extended self test and Disk -> Surface test -> Read test are designed to reveal problems: they show whether the hard disk status is unstable and whether problems may be present.

The Disk -> Surface test -> Reinitialise disk surface test is then used to repair these problems: it fixes the weak sectors and improves the hard disk health and its usability in general.

In most cases, weak sectors do not indicate a big problem with the hard disk itself. Weak sectors may be caused by different factors, and most of them are independent of the actual hard disk, for example:

- drive failure (error with internal memory, problem with drive head or surface)
- power loss (the write operation was not finished because of power loss)
- power failure (weak power supply or not stable power line)
- data cable failure or improper connection
- system memory or motherboard problem
- overclocking


A new article about weak sectors and the tests, results and further information is under construction and will be released soon at Support -> Knowledge base -> Hard Disk Cases.

Re: HDD Surface Test

Post by smallisland »

Thank you for responding but I am still confused by the result.

The hardware has been extremely stable and I just used a regular PSU tester as well as a break out board combined with a very high end FLUKE branded digital multimeter to test the power supply. The PSU is a Seasonic 650X model and all voltages were great even under load. My testing machine is connected to a pure sine wave UPS. The memory has never shown any issues even with extensive testing. The cabling is short and fairly new and the other disks I tested using the same connection did not have any issues.

Why would the reinitialise surface process in HD sentinel not have discovered and reallocated the weak sectors? Isn't that a more extensive test of the disk surface than the internal extended test in the drive? Since the extended test is an internal drive test that just reports information back to HDsentinel wouldn't the only issue be power or the disk controller? And if it were the disk controller why weren't the issues revealed during reinitialisation?

I ran another 12 hour reinitialise disk surface routine and the result stated:

Block
Good: 10000
Damaged: 0
Bad: 0

Hard disk test details
0 new reallocated sectors found
0 new spin retry errors found
95 pending sectors fixed
95 new off-line uncorrectable sectors found

The disk surface map was completely green again.

The last test had 0 for all items.

Disk health is back up to 100%!

Could you explain further? What do the results listed above tell me about the disk?

Thanks

Re: HDD Surface Test

Post by hdsentinel »

> Why would the reinitialise surface process in HD sentinel not have discovered and reallocated the weak sectors?

Because at that point there were no weak sectors.
If there had been any weak sectors, the "reinitialise disk surface" test would not only have revealed them but fixed them immediately, exactly as they have been fixed now.

Personally I completely agree that the situation is interesting: after the reinitialise disk surface test had confirmed that the whole disk surface was stable, the hardware self test still detected some weak sectors.
In other cases I'd say the issue is related to the working environment (power supply, cables, connections etc.) but as you wrote, these should be fine. Weird.

Personally I always recommend starting with the hardware self tests (Disk -> Short self test, Disk -> Extended self test), as they can diagnose the most important components (heads, servo, cache, etc.) in general. After these tests (especially if they report problems, stop with an error and/or the status changes) it is recommended to move on to the Disk -> Surface test functions.

If possible, I'd also try to perform the complete testing (Disk -> Extended self test, Disk -> Surface test -> Read test and Disk -> Surface test -> Reinitialise disk surface) in a different environment: a completely different computer, power supply, cables, controller and so on. Just to verify the status (and possible issues) there.

> Isn't that a more extensive test of the disk surface than the internal extended test in the drive?
> Since the extended test is an internal drive test that just reports information back to HDsentinel wouldn't the only issue be power or the disk controller?
> And if it were the disk controller why weren't the issues revealed during reinitialisation?

The internal hardware extended self test and the software-based reinitialisation work differently and perform different levels of testing.
For example, the hardware extended self test can check hard disk components in general (which would not be accessible for testing in other ways).
The Disk -> Surface test -> Reinitialise disk surface test forces the drive to test the sectors and reallocate / fix problematic areas if required.

If there were an issue with the controller, it may be reported, because the Disk -> Surface test -> Reinitialise disk surface test also verifies the written data and would show if the data were corrupted (for example because of a failed cache memory of the controller).


> The disk surface map was completely green again.
> The last test had 0 for all items.
> Disk health is back up to 100%!

This is exactly what we'd expect from the Disk -> Surface test -> Reinitialise disk surface test: it corrects the problems. (The Disk -> Surface test -> Read test could show these problems _before_ the Reinitialise disk surface test, while they could still affect the stored data.)

Personally I'd perform the complete testing again (as described above) before using the drive in its actual environment.
(maybe I'm too careful - for personal storage, I usually test hard disk drives for 15-30 days with multiple passes of all tests in Hard Disk Sentinel :) )