
Recovery Plan after HDD failure in DrivePool used as unduplicated home media storage?


gtaus

Question

Background: DrivePool is used as my unduplicated home media storage server for Plex/Kodi. Original media files are backed up and stored on HDDs in the closet. No duplication is used on DrivePool for the media files. I recently lost a 5TB HDD in a pool of 17 drives, all of different brands, sizes, ages.... 

Problem: My disk monitoring software warned me about an imminent HDD failure. I immediately attempted to remove the disk from DrivePool, but all attempts to move data off that disk failed and the drive completely crashed in less than 12 hours. Now, like many before me, I am wondering what was actually on that failed drive and do I want to recover the lost pool files from my backups. 

Solutions considered:

SnapRAID might work for some people as adding a parity solution to the pool for recovery, but that means dedicating additional disks to the pool just to hold the parity data. Also, from what I understand, SnapRAID accesses drives via drive letters. With 17 drives in my pool, I have turned off the drive letters because DrivePool does not need them. It appears that there are ways to mount drives in empty folders so SnapRAID can bypass the Drive Letter requirement, but it looks to me that the SnapRAID solution gets more and more complicated as you add pool drives to DrivePool.

Duplication on DrivePool seems to be the easiest solution to recover from a HDD failure within DrivePool, but, who can afford to have a 1:1 solution on a 70TB+ pool? I already have backups of the data on HDDs stored in the closet. I just don't have the budget to both backup files in offline storage and then again duplicate the files in DrivePool.

WinCatalog 2020 is what I started to use just before my HDD failed. In theory, if you have DrivePool cataloged in WinCatalog 2020 and you suffer a HDD failure, you can run an update on the DrivePool volume and WinCatalog 2020 will report the "deleted" files compared to its last update. From what I understand, this still would require me to manually look up and recover lost files from my backups in the closet. However, at that point, I may or may not decide to recover some or all of those files from my backups. Problem is, with a 5TB drive failure, that gets to be a lot of lost files. As I said, I just started using WinCatalog 2020, have not learned how to use it well yet, and had not even gotten to the point of cataloging my DrivePool before I had a HDD failure in the pool.

Solution I am searching for:

I would like some kind of cataloging program, like WinCatalog 2020, that I could constantly use to catalog my DrivePool automatically, like maybe every night. If I had a HDD failure in DrivePool, and lost an entire 5TB HDD of unduplicated media files, I would like the cataloging program to tell me which files were lost and prompt me to see which files I would want to recover from my storage backups in the closet. I could select the files I wanted to recover, and the program would tell me which storage HDD(s) I needed to get online (HDD caddy in my case) and then it would automatically restore those files to DrivePool from my original storage backups, prompting me to get each backup HDD as needed for the recovery task.
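The "tell me which backup HDD to fetch" step described above can be sketched in a few lines of Python. This is only an illustration under assumptions: `plan_restore`, the catalog layout (pool-relative path mapped to a backup disk label) and labels like `CLOSET-01` are hypothetical and not part of DrivePool or WinCatalog.

```python
from collections import defaultdict


def plan_restore(lost_files, backup_catalog):
    """Group lost pool files by the backup disk that holds a copy.

    lost_files: iterable of pool-relative paths reported missing.
    backup_catalog: dict mapping pool-relative path -> backup disk label
                    (hypothetical labels such as "CLOSET-01").
    Returns (plan, unrecoverable): plan maps disk label -> sorted paths,
    unrecoverable lists files that have no known backup copy.
    """
    plan = defaultdict(list)
    unrecoverable = []
    for path in lost_files:
        label = backup_catalog.get(path)
        if label is None:
            unrecoverable.append(path)
        else:
            plan[label].append(path)
    ordered = {label: sorted(paths) for label, paths in sorted(plan.items())}
    return ordered, sorted(unrecoverable)
```

Walking the returned plan one label at a time gives the "insert backup disk X, copy these files, insert disk Y" workflow, and the second return value shows what cannot be restored at all.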

It would be great if DrivePool had such a cataloging feature built into it, but from what I have read on past threads in this forum, there does not seem to be much interest in working in that direction. Over the years my media collection has grown and now I currently have 70TB of files in DrivePool. As HDDs continue to increase in size, a loss of even one pool HDD means lots of unduplicated files are potentially lost to the pool. I don't expect DrivePool to do everything, but I can see HDDs at 12TB+ being more common and the loss of just one 12TB+ pool HDD could have a significant negative impact on unduplicated media files.

If anyone has addressed this issue and developed a working solution to recover from a HDD loss in an unduplicated media storage pool, please let me know. I have no desire to reinvent the wheel if a solution exists. Thanks for any comments.


10 answers to this question


  • 0

UPDATE: I have figured out how to use the command line prompts with WinCatalog 2020 and run nightly updates on DrivePool using Windows Task Scheduler. That will update my virtual DrivePool J: volume every night, but it does not tell me what file is in which specific HDD. If I need to go into the detail level, I would have to turn on the Drive Letters of the pool drives and also update them every night. I don't think WinCatalog 2020 can scan a drive without a Drive Letter assigned, but I'm still looking into that possibility.
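For the "which file is on which specific HDD" question, one option that avoids drive letters is to walk the hidden PoolPart folders directly. The sketch below assumes each pool disk is mounted as a sub-folder of a single mounts folder and has one top-level folder whose name starts with `PoolPart.`; the function name and folder layout are assumptions for illustration.

```python
import os


def catalog_pool_disks(mounts_root):
    """Map each pooled file to the disk(s) that physically hold it.

    Assumes every pool disk is mounted as a sub-folder of mounts_root and
    contains a top-level folder whose name starts with "PoolPart.".
    Returns dict: pool-relative path -> list of mount folder names
    (more than one name when the file is duplicated).
    """
    location = {}
    for disk in sorted(os.listdir(mounts_root)):
        disk_path = os.path.join(mounts_root, disk)
        if not os.path.isdir(disk_path):
            continue
        for entry in os.listdir(disk_path):
            part = os.path.join(disk_path, entry)
            if not (entry.startswith("PoolPart.") and os.path.isdir(part)):
                continue
            for folder, _dirs, files in os.walk(part):
                for name in files:
                    rel = os.path.relpath(os.path.join(folder, name), part)
                    key = rel.replace(os.sep, "/")
                    location.setdefault(key, []).append(disk)
    return location
```

Saving this mapping every night would show exactly which files lived on a disk that later dies, without needing to scan the pool after the failure.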

Assuming I might only lose one HDD, at least running an update report on WinCatalog 2020 on the entire DrivePool J: volume would tell me which files had been deleted from the HDD failure. That gets me closer to a recovery plan.
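The same "compare against the last update" idea can be sketched without a cataloging product: save a list of every file in the pool each night, then compare the saved list with the pool after a failure. The function names and the JSON snapshot file are assumptions for illustration, not a WinCatalog feature.

```python
import json
import os


def snapshot(pool_root):
    """Return a sorted list of pool-relative file paths under pool_root."""
    paths = []
    for folder, _dirs, files in os.walk(pool_root):
        for name in files:
            rel = os.path.relpath(os.path.join(folder, name), pool_root)
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


def save_snapshot(paths, snapshot_file):
    """Write the file list to disk as JSON (keep it outside the pool)."""
    with open(snapshot_file, "w", encoding="utf-8") as fh:
        json.dump(paths, fh)


def missing_since(snapshot_file, pool_root):
    """List files present in the saved snapshot but absent from the pool now."""
    with open(snapshot_file, encoding="utf-8") as fh:
        before = set(json.load(fh))
    return sorted(before - set(snapshot(pool_root)))
```

A nightly Task Scheduler job could call `save_snapshot(snapshot("J:\\"), ...)`, and `missing_since` would be run only after a drive loss. Files deleted on purpose since the last snapshot will also show up in the result, so the list still needs a human look.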


  • 0
On 1/2/2021 at 9:18 AM, gtaus said:

I don't expect DrivePool to do everything, but I can see HDDs at 12TB+ being more common and the loss of just one 12TB+ pool HDD could have a significant negative impact on unduplicated media files.

I think at such a point you should perhaps be re-examining your entire backup methodology to see if there's a better alternative. E.g. "Okay, if disk Q dies, it has five hundred, eighty one thousand, seven hundred and forty two files on it. Do I really want to have to go through a list to see which particular ones I'd like to recover from drives X/Y/Z in my closet, or do I want to be able to just tell the computer to restore all the missing files so I can get on with other stuff?"

On 1/2/2021 at 9:18 AM, gtaus said:

I would like some kind of cataloging program, like WinCatalog 2020, that I could constantly use to catalog my DrivePool automatically, like maybe every night. If I had a HDD failure in DrivePool, and lost an entire 5TB HDD of unduplicated media files, I would like the cataloging program to tell me which files were lost and prompt me to see which files I would want to recover from my storage backups in the closet. I could select the files I wanted to recover, and the program would tell me which storage HDD(s) I needed to get online (HDD caddy in my case) and then it would automatically restore those files to DrivePool from my original storage backups, prompting me to get each backup HDD as needed for the recovery task.

That sounds somewhat like the old enterprise tape backup software I used decades (blergh) ago. I could tell it something like "I want to restore all missing files between 01-01-87 through 31-03-1987" and it'd go "insert tape XYZ123... insert tape XYZ124... insert tape XYZ125... insert tape XYZ126..."

5 hours ago, gtaus said:

Assuming I might only lose one HDD, at least running an update report on WinCatalog 2020 on the entire DrivePool J: volume would tell me which files had been deleted from the HDD failure. That gets me closer to a recovery plan.

I seem to vaguely recall another poster on this forum some time ago also mentioning that they had a closet of backup drives, and they used the simple technique of buying drives in pairs (pool and backup) and having the ordered file placement as their only balancer; when a drive filled up they'd copy it to the backup and put the latter in the closet, and if a pool drive died they'd just pull the matching drive out of the closet (and buy a replacement). Not sure I'd trust that, but if it does the job then it does the job.

My own onsite setup, FWIW, is desktops and laptops -> backup to main server -> read-only share to backup server (using FreeFileSync configured for 30-day versioning). The backup server has one job: backups. It does nothing else, and any program I install on it has been vetted on a different computer first. Now that we have decent internet here and the budget's not so tight, I'm looking at options for automating my offsite backups too.


  • 0
9 hours ago, Shane said:

I think at such a point you should perhaps be re-examining your entire backup methodology to see if there's a better alternative.

I agree. I see shortcomings in my current backup methodology and as future HDDs increase in size, my current method of backups is totally inadequate.

9 hours ago, Shane said:

I seem to vaguely recall another poster on this forum some time ago also mentioning that they had a closet of backup drives, and they used the simple technique of buying drives in pairs (pool and backup) and having the ordered file placement as their only balancer; when a drive filled up they'd copy it to the backup and put the latter in the closet, and if a pool drive died they'd just pull the matching drive out of the closet (and buy a replacement). Not sure I'd trust that, but if it does the job then it does the job.

That would certainly be better than my current backup system. I have been rethinking my DrivePool balancing and have been considering moving to the ordered file placement. For some reason, I thought DrivePool would move an entire folder to one drive, but what I find is that files from a folder could be written to a number of different drives. I guess that makes certain sense from a balancing point of view, but it makes it harder to recover from a loss.

9 hours ago, Shane said:

My own onsite setup, FWIW, is desktops and laptops -> backup to main server -> read-only share to backup server (using FreeFileSync configured for 30-day versioning). The backup server has one job: backups. It does nothing else,

OK. I think I understand your backup strategy with FreeFileSync. What advantage do you get from having a second server with a 1:1 backup via FreeFileSync that you could not get by just setting your DrivePool to 2X duplication from the start? Do you keep your second server online all the time, or do you only turn it on for your 30 day updates?

10 hours ago, Shane said:

That sounds somewhat like the old enterprise tape backup software I used decades (blergh) ago. I could tell it something like "I want to restore all missing files between 01-01-87 through 31-03-1987" and it'd go "insert tape XYZ123... insert tape XYZ124... insert tape XYZ125... insert tape XYZ126..."

Yeah, that's what I grew up with too. Those backup tapes kept you busy, but if you had to actually recover from a loss, it was almost hit or miss and success was not guaranteed.

Fortunately, I have all my "important" financial and personal files duplicated 2X or 3X in DrivePool and on the cloud, and my original media files are stored on HDDs in the closet. So, I have not lost anything per se. It's just that this HDD failure has shown me I need a better way to plan for future problems.

As always, thank you @Shane for your helpful responses.


  • 0
1 hour ago, gtaus said:

That would certainly be better than my current backup system. I have been rethinking my DrivePool balancing and have been considering moving to the ordered file placement. For some reason, I thought DrivePool would move an entire folder to one drive, but what I find is that files from a folder could be written to a number of different drives. I guess that makes certain sense from a balancing point of view, but it makes it harder to recover from a loss.

Yeah, for keeping particular files and folders on particular drive(s) you need to use the File Placement rules (GUI: Manage Pool -> Balancing -> File Placement).

1 hour ago, gtaus said:

OK. I think I understand your backup strategy with FreeFileSync. What advantage do you get from having a second server with a 1:1 backup via FreeFileSync that you could not get by just setting your DrivePool to 2X duplication from the start? Do you keep your second server online all the time, or do you only turn it on for your 30 day updates?

The advantage is it's backup not duplication. Different measures for different risks. I have it on all the time so it can back up the main pool regularly, the versioning means it keeps the last 30 days of changes so if anyone needs to revert a file to an earlier version (e.g. because they accidentally deleted or saved over the wrong document) they can, and doing it via a read-only share of the main server makes it harder for malware to reach the backups (because plugging a backup drive in directly means risking malware having direct access to those backups). Truly critical files get further backed up to external portable drives rotated offsite.

TLDR: Duplication for drive failure + Backup for human failure. B)


  • 0
13 hours ago, Shane said:

for keeping particular files and folder on particular drive(s) you need to use the File Placement rules (GUI: Manage Pool -> Balancing -> File Placement).

Right, I was re-reading the File Placement rules for folders in the DrivePool online manual, and I think it might work better for me if I go that route.

 

13 hours ago, Shane said:

The advantage is it's backup not duplication. Different measures for different risks. I have it on all the time so it can back up the main pool regularly, the versioning means it keeps the last 30 days of changes so if anyone needs to revert a file to an earlier version

Thanks for explaining that to me. I looked up FreeFileSync after you mentioned it, but I did not realize that it had the ability for versioning as you explained. I can see where that would really be a nice feature for some uses of DrivePool. In my case, DrivePool is primarily just my media server. Also, I am not yet at a point where I want to build another 70TB+ pool to back up my current 70TB DrivePool. I could see FreeFileSync as a backup solution for the small part of my DrivePool that is not media files. I can now see what you mean by using DrivePool's duplication in case of drive failure, but using FreeFileSync as a backup solution for human error. If I accidentally deleted a duplicated file in DrivePool, I would be short on luck trying to get it back without some kind of versioning option like that in FreeFileSync. I'll be looking at that more closely. Thanks.

 


  • 0

I'm new at all this and far far far from an expert. But I set up drivepool with snapraid. It doesn't require drive letters. In your config file you just put the path to the "PoolPart" files that snapraid creates. I have 3x12TB drives for data and 1x12TB drive for parity and the only drive letter I see is what drivepool created. Mine is set up like bitlocker > drivepool > drives mounted in a folder in the C drive > snapraid. Found this forum for the first time tonight and, thanks to posts by both you and Shane, I've decided to up the duplication on one of my folders in drivepool to 2x just in case as I don't want to lose personal photos. I also have everything backed up online but want to leave that as a last resort. Next time I add drives it'll be 2x12TB with one more for data and one more for parity.

 

I spent a bit of time reading posts online before setting mine up, mostly on reddit, and -- after a few months so far -- so far so good. Example post: https://www.reddit.com/r/DataHoarder/comments/7gb7my/looking_for_advice_in_setting_up/

 

There's a bunch of sync/scrub scripts available to work off of online too, so the whole setup didn't require much brainpower on my end thankfully. 

 

That said, if anyone thinks this is a terrible idea I'm all ears. Like I said, far from an expert, and know just enough to get myself into trouble.

 

Edit to add: this is the nightly script I have running with some minor tweaks: https://zackreed.me/snapraid-split-parity-sync-script/

Edited by beepboop43
Update w/ additional info

  • 0
2 hours ago, beepboop43 said:

I set up drivepool with snapraid. It doesn't require drive letters. In your config file you just put the path to the "PoolPart" files that snapraid creates.

I watched a 3-year old YouTube video on DrivePool + SnapRAID, but there he was using drive letters for SnapRAID. I know DrivePool creates PoolPart directories, does SnapRAID create its own separate PoolPart?

In SnapRAID, what is the ratio between data drives and parity drives? For example, in Windows Storage Spaces, it would take 1 parity drive to 2 data drives of the same size. In other words, 33% overhead loss for parity. That is, of course, better than 100% overhead for a mirrored array. One big advantage of DrivePool is that you can just assign certain important folders for duplication. In my case, that amounts to less than 5% overhead.

Have you ever had to recover from a loss of a pool HDD using SnapRAID? From the YouTube video, it looked like a very long rebuild time for his small pool. My DrivePool is currently 70TB and I am wondering if it would be faster to rebuild from SnapRAID or my backup HDDs stored on a shelf in my closet.

 

2 hours ago, beepboop43 said:

I've decided to up the duplication on one of my folders in drivepool to 2x just in case as I don't want to lose personal photos. I also have everything backed up online but want to leave that as a last resort.

Same here. I have 2X duplication on some financial and personal data folders that I would hope to rebuild directly in DrivePool. But I also keep backups of those (relatively) small folders on a Cloud backup. Most of my DrivePool is just a 1X home media storage, but I can rebuild lost files from my offline HDD backups.

2 hours ago, beepboop43 said:

...know just enough to get myself into trouble

With the large data HDDs we use these days, a loss of a single HDD can get you into trouble fast. In theory, SMART should give us some warning before a HDD fails, but in my experience, 80% of my drives have a sudden death with no real warning. So I am trying to develop a recovery plan that would work for me for the next time I have a pool HDD failure. Still considering all options.


  • 0
18 hours ago, gtaus said:

I watched a 3-year old YouTube video on DrivePool + SnapRAID, but there he was using drive letters for SnapRAID. I know DrivePool creates PoolPart directories, does SnapRAID create its own separate PoolPart?

No, I just tell snapraid that the data location is inside the PoolPart directory. For example, here's what my config looks like:

# Defines the data disks to use
# The name and mount point association is relevant for parity, do not change it
# WARNING: Adding here your boot C:\ disk is NOT a good idea!
# SnapRAID is better suited for files that rarely changes!
# Format: "data DISK_NAME DISK_MOUNT_POINT"
data d1 C:\mounts\HD_MODEL - HD_SN\PoolPart.99266457-9bf0-46da-8285-163b8b5fe9f8\
data d2 C:\mounts\HD_MODEL - HD_SN\PoolPart.9886f8bd-626d-4d6b-96c5-5f7633461856\
data d3 C:\mounts\HD_MODEL - HD_SN\PoolPart.9a0f11e0-8e4d-4022-a338-c814de2a4f87\

I chose to name my mounting points by the drive's model and then serial number so I can tell which is which. For my parity drive I changed the folder slightly to 'HD_MODEL - HD_SN - Parity' so, from a glance in that folder, I know which one is the parity drive.
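With many pool disks, typing those `data` lines by hand gets tedious. A small Python sketch like the one below could draft them by looking for the `PoolPart.` folder inside each mount point. The function name and the assumption that all pool disks are mounted under one folder (and that the parity disk has no PoolPart folder) are illustrative only. Because the config warns that the name and mount point association must not change, this is meant for drafting an initial config that is then reviewed by hand, not for regenerating an existing one.

```python
import os


def snapraid_data_lines(mounts_root):
    """Build 'data NAME PATH' lines for each PoolPart folder under mounts_root.

    Disks are numbered d1, d2, ... in alphabetical order of their mount
    folder names. Mount folders without a PoolPart folder are skipped.
    """
    lines = []
    for disk in sorted(os.listdir(mounts_root)):
        disk_path = os.path.join(mounts_root, disk)
        if not os.path.isdir(disk_path):
            continue
        for entry in sorted(os.listdir(disk_path)):
            part = os.path.join(disk_path, entry)
            if entry.startswith("PoolPart.") and os.path.isdir(part):
                # Trailing separator matches the style of the config above.
                lines.append(f"data d{len(lines) + 1} {part}{os.sep}")
    return lines
```

Printing the result of `snapraid_data_lines(r"C:\mounts")` would give lines in the same shape as the three shown above, ready to paste into the config after checking them.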

 

18 hours ago, gtaus said:

In SnapRAID, what is the ratio between data drives and parity drives?

https://www.snapraid.it/faq#howmanypar -- "As a rule of thumb you can stay with one parity disk (RAID5) with up to four data disks, and then using one parity disk for each group of seven data disks". I have three data drives and one parity so in theory I could add another data drive to the mix before adding an additional parity, but when I look at the parity file on the parity drive, it's already using 10TB of the 12TB so I think next time I add it'll be at least two drives so one of them can be an additional parity drive.

Also - apparently in older versions of snapraid you had to use your largest drive for parity so you may see that referenced on older posts. That's no longer the case though I don't understand it well enough to explain it further.  

 

18 hours ago, gtaus said:

Have you ever had to recover from a loss of a pool HDD using SnapRAID?

No, but I need to look into "safe" ways of testing this. I know a backup that hasn't been validated via restoration isn't a backup. I have 12TB drives so I'm imagining it will be very time consuming. Seems to me, based solely on forum posts I've read, most users trust snapraid's parity to cover them in a drive failure. Haven't verified so can't speak from experience, but that's also why I changed one folder to 2x duplication in drivepool.

 

Assuming this is all working like I think/hope it's working, I was really surprised at how easy it was to set all of this up and am super happy with drivepool and scanner. I also have clouddrive but haven't set that up yet.


  • 0
8 hours ago, beepboop43 said:

Seems to me, based solely on forum posts I've read, most users trust snapraid's parity to cover them in a drive failure. Haven't verified so can't speak from experience, but that's also why I changed one folder to 2x duplication in drivepool. 

I have become really cynical about any system that has not been verified after failure(s). Like I said, my Windows Storage Spaces had 26 HDDs in the pool and was set up to recover from a 2 drive failure. I had 1 small HDD fail and it took down my entire Storage Spaces. Since Storage Spaces uses striping, all data was lost. That was the third time Storage Spaces let me down in that respect, and then I moved to DrivePool. I have had much better results with DrivePool.

Yes, I have suffered 2 HDD failures (but not at the same time) in DrivePool. The first time I was able to offload all but 2 corrupt files from a 5TB drive. The second drive failure resulted from me following MS Windows' recommendation to run a chkdsk on a failing drive - that resulted in a corrupted directory on that 5TB drive and I lost all access to the data on that drive. If I had just offloaded the data from the failing drive, I would have probably saved almost all my data on that drive too. But for some reason, chkdsk corrupted the directory and it was game over.

I only have a few folders set to 2X duplication, but the files were indeed safe after my HDD failures. I don't quite understand the way DrivePool knows where the good copy of the file is, but the system seems to work. All I know is that 2X duplication means DrivePool will write the file(s) on 2 separate drives, ditto for 3X on 3 different drives, etc...

 

8 hours ago, beepboop43 said:

am super happy with drivepool and scanner

The more I use DrivePool and understand how it works, the more I like it. I decided to get Hard Disk Sentinel instead of Stablebit Scanner because Hard Disk Sentinel has HDD repair features in the paid version. I figured if I can "fix" one HDD with that software, then it more than paid for itself. I am happy to state that I had 7 failed HDDs in a box that had failed over the years in Storage Spaces, but I was able to "fix" 2 of the 3TB drives with Hard Disk Sentinel and put them back into service. So the program has already paid for itself.

My latest 5TB HDD failure is sitting in my USB caddy and Hard Disk Sentinel is running a "Reinitialize disk surface" program on it to see if it can renew the drive. It is a very in-depth testing program, testing every single block; the current estimated completion time is 50 hours, and I have already been running it for the past 47 hours. So maybe later in the morning I will get a final report if the drive can be put back into service. So far, all blocks have passed the test, but the graph does show some portions of the drive are slower than others. I think Hard Disk Sentinel "magically" tells the drive not to use weak or failing sectors and that is how it is able to renew some of the drives (but not all).

9 hours ago, beepboop43 said:

Also - apparently in older versions of snapraid you had to use your largest drive for parity so you may see that referenced on older posts. That's no longer the case though I don't understand it well enough to explain it further. 

The YouTube video I watched did mention that the parity drive had to be the largest drive. I guess that would not be a problem. However, if you only need 1 parity disk to cover the first 4 data disks, and then only 1 more disk for the next 7 data disks, I may look into SnapRAID some more. I currently have 17 disks in my pool, so I guess that would require 3 parity disks, right? I was thinking back to Storage Spaces where I would need 9 parity disks. I don't want to do that, but DrivePool with SnapRaid (using only 3 parity disks) might be a good option. Thanks.  
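The arithmetic behind "17 data disks means 3 parity disks" can be checked with a short Python sketch. It encodes one reading of the rule of thumb quoted earlier in the thread from the SnapRAID FAQ (one parity disk for up to four data disks, then one per group of seven); the function name and the handling of 5 to 7 disks as needing two parity disks are assumptions of that reading, not an official formula.

```python
import math


def parity_disks_needed(data_disks):
    """Rule-of-thumb parity count for a given number of data disks.

    Up to four data disks: one parity disk. Beyond that: one parity disk
    per group of seven data disks, and never fewer than two, since the
    rule only recommends a single parity disk for up to four data disks.
    """
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    if data_disks <= 4:
        return 1
    return max(2, math.ceil(data_disks / 7))
```

With this reading, `parity_disks_needed(17)` gives 3, which matches the estimate above. The parity disks are in addition to the 17 data disks.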


  • 0

Note that the number of parity disks is a statistical risk thing - 1 parity disk can recover from 1 other disk failure, 2 from 2, etc. The risk of more disks failing increases the more disks you have but if you're happy with the risk, you don't have to have multiple parity disks.

 

The parity will always be equal in size to the largest data disk you're protecting - so while it doesn't have to be the biggest disk you have, it can't be smaller.

 

Note that you can have multiple different SnapRAID sets configured at once with different numbers of parity etc if you want to get complicated about it - a similar concept to having some more valuable things duplicated more than other less important things in DrivePool I suppose.
