  • 0

MISMATCH Checksum Hash Values with DrivePool


HTWingNut

Question

After much angst, I realized that READ STRIPING can affect the calculated checksum hash value of a file. Turning off read striping enables consistent hash checksum values.

IS THERE ANY WAY TO DISABLE AND ENABLE READ STRIPING FROM COMMAND LINE?

I run a batch script for my backups every evening and recently added a hash check. It looks like I need to disable read striping; otherwise, the hash checksum values are affected.

If read striping could be toggled from the command line, it could be turned off before the backup and back on afterwards. Otherwise I have to leave read striping disabled if I want to use any kind of integrity check.

 


4 answers to this question


  • 0

Ok, I will open a ticket.

The problem is that it's not consistent and it's somewhat difficult to reproduce. It seems to happen more often with large files. I ended up with a few dozen mismatched files while checksumming over a couple hundred thousand files. This occurred with SHA-1, MD5, and BLAKE2.

The strange thing is this: say I get a SHA-1 hash of a file on a known good drive (not in DrivePool) and, for simplicity, the checksum equals "123456ab". If I copy that file to DrivePool with read striping off, the hash equals "123456ab". If I enable read striping and this anomaly occurs, the hash is now, say, "412453bf". But if I copy that file to another external drive, even with read striping on, the hash is "123456ab" again. So it's not as if the file itself changes. After turning off read striping, I ran another checksum of all the files and everything was fine.
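
The comparison described above can be scripted. The following is only a minimal sketch using Python's standard `hashlib`; the function names `file_digest` and `digests_match` and the example paths in the comments are made up for illustration and are not part of DrivePool or of any script mentioned in this thread.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def digests_match(path_a, path_b, algorithm="sha1"):
    """Return True if both paths hash to the same value."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)


# Hypothetical usage: compare the copy read through the pool drive letter
# with a copy on a known good drive outside the pool.
# digests_match(r"P:\backup\big.iso", r"E:\known_good\big.iso")
```

A mismatch here only says that the two reads differed; it does not by itself tell which side delivered the wrong data.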


  • 0

Hi everyone.

I wanted to pick up on this topic since I have experienced the same issue. When read striping is enabled and hashes are calculated or verified for a large number of files in a row, errors occur. When I then re-check the individual files that were flagged as corrupt, the hashes are identical. The affected file sizes vary, starting from around 3 MB. With read striping turned off, the hashes are all correct. (I scanned roughly 50,000 files, and the errors are more or less random.)

I think there is an issue with the block-based algorithm, because I checked all copies of the files on the duplicated disks and they were identical.

I have used the program "checksum" (corz.org) to create and verify the hashes.


  • 0

Ok, all of this reminds me a lot of an issue that I reported a few months ago (5570628). Sorry, Christopher, I don't want to be a troublemaker, but I think it's a severe issue and it seems to be closely related to the one I reported, so I have to repeat it here.

All the symptoms mentioned here are shared with my issue:

  • Wrong hash, regardless of which hash algorithm
  • Only happens with read-striping enabled
  • "it's not consistent and it's somewhat difficult to reproduce"
  • "It seems to happen more often with large files"
  • Checking files again after they had erroneous hashes makes them actually appear fine, on second try

I'm using a custom-written Python script for applying the hash algorithms, so I already did a lot of research on the details of when this occurs.

When does it occur?

It only occurs when all of the following conditions come together, though when they do I can reliably reproduce it:

  • It only occurs when read-striping is enabled. It never occurs when read-striping is disabled.
  • It only occurs when reading data via the drive pool letter. Directly accessing the data on the underlying disks is always fine.
  • It only occurs when my Antivirus software's on-access file scanning is enabled on the pool drive letter. All other options of the software (Bitdefender) seem unrelated. Also on-access scanning of the raw disks' letters/mount paths will not lead to the problem.
  • It only happens if the file is not in Windows read cache (so the first read after cache flush or after a reboot, that is).
  • It only happens to larger files (my guess: >128 MiB?).
  • Which exact file is affected appears to be somewhat random.

What is it doing to the data?

  • The error is not affecting the actual data on disk, but "just" delivering wrong data on (first, uncached) read.
  • The abnormal hash value (would be "412453bf" in WingNut's example) is always the same; that means: if a file is affected, it is always changed in the same way.
  • I also tried to capture the corrupted data during hashing to see how it is actually changing and found out the following:
    The data read is correct up to offset 0x8000000 into the file; at that offset, the content starts again from the beginning, as if the read pointer in the actual file had been reset to 0 and the file was then read "again" from start to EOF. This means that the wrongly read data is exactly 0x8000000 bytes = 128 MiB larger than it should be, with the excess part being a duplicate of the first 128 MiB of the file itself. I only did this check once, so I cannot tell whether it is always the same; still, the 128 MiB figure seems quite suspicious.
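
To check whether a captured bad read shows the same pattern, something like the following could be used. This is a sketch under assumptions: it expects a known good copy and a captured dump of the bad read as two seekable binary files, and the names `matches_restart_pattern` and `RESTART_OFFSET` are invented here. It tests only the single pattern described above (correct data up to the offset, then the whole file again from the start).

```python
import io

RESTART_OFFSET = 0x8000000  # 128 MiB, the offset observed in the one capture


def _next_bytes_equal(a, b, length, chunk_size):
    """Compare the next `length` bytes of two binary streams."""
    remaining = length
    while remaining > 0:
        n = min(chunk_size, remaining)
        chunk_a, chunk_b = a.read(n), b.read(n)
        if len(chunk_a) != n or chunk_a != chunk_b:
            return False
        remaining -= n
    return True


def matches_restart_pattern(good, bad, restart_offset=RESTART_OFFSET,
                            chunk_size=1 << 20):
    """True if `bad` equals good[:restart_offset] followed by all of `good`.

    `good` and `bad` are seekable binary file objects.
    """
    good_size = good.seek(0, io.SEEK_END)
    bad_size = bad.seek(0, io.SEEK_END)
    # The bad read must be exactly restart_offset bytes longer.
    if good_size < restart_offset or bad_size != good_size + restart_offset:
        return False
    good.seek(0)
    bad.seek(0)
    # Part 1: both streams agree up to the restart offset.
    if not _next_bytes_equal(good, bad, restart_offset, chunk_size):
        return False
    # Part 2: the rest of the bad read is the good file from offset 0.
    good.seek(0)
    return _next_bytes_equal(good, bad, good_size, chunk_size)
```

With real files, both would be opened with `open(path, "rb")`; the `restart_offset` parameter is there so other offsets can be tried if 128 MiB turns out not to be constant.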

So, in conclusion, it seems to me that the combination of large file + read-striping + on-access virus scanner sometimes delivers wrong data on first read, which makes the hash function return a wrong value. What's so scary about this issue is that it is a totally silent data corruption that could happen on any read of a file, not just by hashing software (in fact, I used the normal Python library functions for reading files, the same ones every other Python program uses, which in turn use OS functions that probably every other program in any language uses). This could even mean that, in theory, the corruption could be made "permanent" if you move a file from the pool to somewhere else and the issue occurs during that read. For me, this was reason enough to completely disable read-striping until this issue is fixed.

Please, kuba4you and HTWingNut, feel free to double-check for similarities with your issues! Could you also let us know whether you are using antivirus software, and if so, which one?
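
One way to guard against a bad read becoming permanent during a copy is to verify the destination against a hash that is already known to be good (for example, one computed with read-striping disabled or directly from an underlying disk). The following is a rough sketch only: `copy_and_verify` and `sha1_hex` are hypothetical helper names, and it does not fix the underlying problem, it just refuses to keep a copy whose hash is wrong.

```python
import hashlib
import os
import shutil


def sha1_hex(path, chunk_size=1 << 20):
    """SHA-1 of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_verify(src, dst, expected_sha1):
    """Copy src to dst and keep dst only if its SHA-1 matches expected_sha1.

    expected_sha1 is assumed to be a known good digest obtained earlier.
    """
    shutil.copyfile(src, dst)
    actual = sha1_hex(dst)
    if actual != expected_sha1.lower():
        # Discard the suspicious copy; the source is left untouched.
        os.remove(dst)
        raise OSError(
            f"hash mismatch for {dst}: expected {expected_sha1}, got {actual}"
        )
    return actual
```

The source is never deleted here, so a failed verification can simply be retried.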

I hope by putting information together, we will find a solution for this. If it's not the same issue, then I'm sorry for bloating this thread.

Edit: I just realized that the original question by WingNut was posted last year. Anyway, as the issue is still present and kuba4you revived that thread, I guess having the information here is still a good idea.

