
DrivePool file corruption


mhendu

Question

I have DrivePool installed on two computers - one where I've had it installed for quite a while with no issues, and another where I just finally migrated from FlexRaid pooling. I have the SSD Optimizer plugin installed but am using it to cache files on a regular 3 TB hard disk before moving them to my array (the other drives, but not this disk, are still set up with FlexRaid transparent RAID to provide some protection via parity).

Although the speed of this pool is much quicker than with FlexRaid, I'm running into some disturbing file corruption issues. I've compressed a few movies with StaxRip and the resulting file, when placed on the pool, will have corrupt sections (note this is on the 3TB cache drive that is not set up with FlexRaid). Some of the movies worked fine when muxed to the cache drive, but then when file balancing runs they'll get corrupted when they get moved to a different drive in the array. This makes the product unusable.

Any help would be appreciated.

 


16 answers to this question


  • 0

I have not experienced your specific problems, but I have a few things I might try.

You might want to turn on the Verify After Copy option: go to Settings Cog > Troubleshooting > Verify After Copy. This feature is normally turned off because it will slow down your system. However, in your case, with suspected bad copies, it might be worth trying.
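For anyone wondering what a verify-after-copy step amounts to in general, here is a minimal Python sketch. It is not DrivePool's implementation (how DrivePool does it internally is not described in this thread); the function names `sha256_of` and `copy_and_verify` are made up for illustration. It simply copies a file, re-reads both copies, and compares SHA-256 hashes, leaving the source untouched either way.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large video files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> bool:
    """Copy source to destination, then re-read both and compare hashes.

    Returns True when the copy matches. The source is never deleted here
    (hypothetical policy), so a failed check leaves the original intact.
    """
    shutil.copyfile(source, destination)
    return sha256_of(source) == sha256_of(destination)
```

One caveat: a re-read immediately after a copy may be served from the operating system's file cache rather than from the disk, so a check like this can miss problems that only show up once the data is read back from the drive itself.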

I use the free program MultiPar to create .par2 files for file verification and rebuild. In your case, you would tag the movie file, or folder, and create .par2 files. When you run MultiPar on the .par2 file, it will verify if the original file is intact or if it has been damaged. If damaged, it will attempt to repair the file(s) with the .par2 files you created. You can adjust the % of .par2 files for rebuild anywhere from 0% (index only) to 100% complete blocks for total rebuild. I have my MultiPar set to 10% .par2 files which is enough to verify the file(s) in the folder and will rebuild some damaged and lost files.

Of course, I would recommend a good offline backup plan. If your files become corrupt, and MultiPar cannot rebuild the damaged files with the .par2 files you created, then you can pull the backup files and reload them.

I really don't understand why DrivePool would be corrupting your files when it balances the pool, but maybe some of these ideas will help you narrow down the cause of your problem.


  • 0
In reply to gtaus (5 hours earlier):

Oh thanks - hadn't seen that option. Not sure why I'm running into this with DrivePool and not with FlexRaid pooling. After restarting my computer it largely seems to have stopped creating errors in files copied to my cache drive, but I'm still getting errors once the files are moved off the cache drive onto my array.

How does Verify After Copy work? Will it try to copy again if there's an error? Just keep the file on the original drive and notify you that the balancing failed? Or move the file, identify an error, tell you there's an error but delete the original?


  • 0
In reply to mhendu (7/27/2021, 2:29 PM):

I have never experienced your problem of files getting corrupted when moved off the cache drive to the archive drives in the pool. I have never used the DrivePool Verify After Copy feature because I have not had the issues you are reporting. I don't know how DrivePool would respond if a verification failed, but I suspect it will alert you and will probably leave the file on the cache drive and try again the next time it balances. But, I really don't know as I have never used that feature.

I once had a problem with a USB 3.0 HDD caddy and transferring files into/from DrivePool. Files were getting corrupted in that transfer and I solved that issue by plugging the caddy into a USB 2.0 port and slowing all the transfers down. For some reason, my corrupt file transfer issues went away with USB 2.0. Maybe the caddy had a buggy USB 3.0 engine? 

I would still encourage you to use something like MultiPar with verification files to narrow down your issue. If you had a test folder(s) with files and .par2 files, you could verify the folders first and then transfer those files to your DrivePool cache. You could immediately run a verification on the files in cache to see if they transferred without corruption into DrivePool. Then after the cache flushes to the archive drives, run the verification on the folder again using the .par2 files. In that way, you could verify the files on your initial device or computer, in the DrivePool cache, and in DrivePool archives.
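If you only need to detect corruption at each stage (source, cache, archive) and not repair it, the same three-stage check can be sketched with plain checksums. To be clear, this is a substitute technique, not what MultiPar does: .par2 files can also rebuild damaged data, while this only tells you which files changed. The function names below are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its SHA-256 digest."""
    return {
        path.relative_to(folder).as_posix(): file_digest(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def find_mismatches(manifest: dict[str, str], folder: Path) -> list[str]:
    """Return relative paths that are missing or whose digest changed."""
    bad = []
    for relative_path, expected in manifest.items():
        candidate = folder / relative_path
        if not candidate.is_file() or file_digest(candidate) != expected:
            bad.append(relative_path)
    return bad
```

You would call `build_manifest` once on the test folder on the original machine, then call `find_mismatches` against the copy in the pool right after the transfer, and again after the cache has flushed to the archive drives. An empty list at every stage means the files survived each hop unchanged.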

Also, you might consider trying a different interface cable on your archive drive to see if that makes a difference. Cables can go buggy and cause intermittent corruption. Just enough to drive you crazy because you probably can't verify it was a bad cable unless a different cable solves your issue.

Good luck. I hope you find the solution to your corrupted files issue. 


  • 0
In reply to mhendu (7/26/2021, 12:22 AM):

Did you ever figure this out? I’m having the same problem. Was it the SSD optimizer, read striping, the cables, bypass system filters, real time duplication, or something else? I’ve set pretty much everything to default, and I’m trying one by one to see what’s doing it, but it is sporadic so it’s tough. 


  • 0
In reply to lava890 (37 minutes earlier):

I ended up just moving to SnapRAID and this seems to have resolved the issue. The downside is there's no real-time parity, but I do like that I can configure it to scrub files routinely to detect corruption, which FlexRAID did not do.


  • 0
In reply to mhendu (36 minutes earlier):

OK, thanks for responding. I’ll experiment a bit more before I give up on DrivePool. I think it worked flawlessly until I added the SSDs a couple of years ago.


  • 0

Does the problem still appear if you disable Read Striping (Manage Pool -> Performance -> Read striping)?
And do you have an anti virus software active that checks files on access?

I have an issue with random file corruption when using the combination of both (Bitdefender + Read Striping), which is why I disabled read striping.


  • 0
In reply to Jonibhoni (8 hours earlier):

When I turned off read striping, it solved the problem. I just use the Windows antivirus, which you can’t really shut off permanently.


  • 0
23 hours ago, lava890 said:

When I turned off read striping, it solved the problem. I just use the windows antivirus

Good to hear, for your case. :) When I measured it, read striping didn't give me any practical performance increase anyway, so it's not so much a problem to turn it off.

But the fact that the file corruption problem appears for you with Windows Antivirus, too, is actually a new level of that problem! :o I think you should file a bug report with support, mentioning this! I mean, silent file corruption is something of a nightmare for people using this kind of technology to store their files safely. I had extensively documented the corruption that happened for me back then (under what conditions, which files, even the kind of corruption in the file format), but the StableBit guys more or less refused to do anything because they saw it as a problem on Bitdefender's side. That is partly disappointing but also partly understandable, because Bitdefender probably really does some borderline stuff. But if it now happens with the regular Windows Antivirus ... they should take it seriously! I mean, DrivePool + Read Striping + Windows Defender is not really an exotic combination; it's pretty much the most ordinary one. :mellow:

It's bits here at stake, StableBit! ^_^  @Christopher (Drashna)


  • 0
Just now, lava890 said:

Everything works fine with just the windows antivirus running.

Well, it also worked fine for me with either Bitdefender or read striping running, but not with both in combination, so you cannot really say that either of them causes it alone. It seemed to be the combination of read striping reading files in stripes and the antivirus checking them on the fly. Specifically, the corruption that was taking place was a duplication of exactly the first 128 MB of the file, which seems too regular to be random. I would suspect that either Bitdefender scans files in chunks of 128 MB or DrivePool stripes them in 128 MB chunks. And somewhere in the clash between on-access scanning by Bitdefender (and Windows Defender?) and read striping in the CoveFS driver (and maybe Windows file caching playing along, too), something unexpected happens.
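For anyone who wants to check whether their own corrupted files show the same pattern, here is a rough Python sketch that compares a known-good copy against a suspect copy in fixed-size chunks and flags any differing chunk that is a byte-for-byte repeat of the suspect file's first chunk. The 128 MiB default is an assumption based on the "128 MB" figure above (the exact unit is not known), and `differing_chunks` is a made-up helper name, not part of any StableBit tool.

```python
from pathlib import Path

# Assumption: "128 MB" in the post is treated as 128 MiB here.
CHUNK_SIZE = 128 * 1024 * 1024


def differing_chunks(
    good: Path, suspect: Path, chunk_size: int = CHUNK_SIZE
) -> list[tuple[int, bool]]:
    """Compare two copies of a file chunk by chunk.

    Returns (chunk_index, repeats_first_chunk) for every chunk that differs.
    repeats_first_chunk is True when the suspect chunk is identical to the
    suspect file's own first chunk.
    """
    results = []
    with good.open("rb") as good_file, suspect.open("rb") as suspect_file:
        first_chunk = None
        index = 0
        while True:
            good_chunk = good_file.read(chunk_size)
            suspect_chunk = suspect_file.read(chunk_size)
            if not good_chunk and not suspect_chunk:
                break  # both files are exhausted
            if first_chunk is None:
                first_chunk = suspect_chunk
            if good_chunk != suspect_chunk:
                repeats = index > 0 and suspect_chunk == first_chunk
                results.append((index, repeats))
            index += 1
    return results
```

With the default chunk size this holds up to three chunks (roughly 384 MiB) in memory at once, so pass a smaller `chunk_size` on a machine with little RAM. A result such as `[(1, True)]` would mean the second chunk of the suspect file is a repeat of its first chunk, which matches the pattern described above; differences with `False` point to some other kind of damage.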


  • 0

Yes, please open a ticket at https://stablebit.com/Contact

And ideally, attach drive tracing logs of this in action:
https://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection

 

Also, it's very weird that Windows Defender would do this, as we do test with it (well, it's hard not to). But issues with Bitdefender... don't really surprise me.

Also, in the meantime, it may be worth enabling the "bypass file system filters" option, as this should prevent the real-time scanner from being used when the pool reads the data from the underlying drives.


  • 0

Wish I had found this thread earlier; I have been chasing random corruption of movie files for over a year. It is so random, and affects such a small percentage of files, that I may not catch it for months, until I play back a video file and find it corrupt at a random place (not the first 128 MB, though). The issue followed me from Windows Server 2019 to 2022 and through a complete rebuild on newer hardware. Originally I was using an SSD cache but moved to 384 GB of RAM with no change. Windows Defender is in use, but the drive pool is excluded from scans. DrivePool is duplicating everything, but read striping is now disabled and verify after copy is enabled; hopefully that fixes it.


  • 0
In reply to DigitalPackrat (9 hours earlier):

Hope it works


  • 0
In reply to Christopher (Drashna) (19 hours earlier):

It was sporadic, so logging would probably have to run for several days to catch the problem, and it was pretty disruptive, so I'd rather not turn read striping back on since everything has been working well since July 30.
