
File Pool Duplication


WingedAngel

Question

I have a quick question about the file duplication process used in DrivePool.

I currently have 3x 6TB drives and I have 2x file duplication on. If, for instance, two of my drives suddenly failed, would the last drive have all of the information, or would that only happen if I had 3x file duplication on?

I don't know whether the number of file copies means that DrivePool disperses them across all of those drives or just keeps copies in random spots on each drive, so I'm curious what would best protect me from the above situation, though having StableBit Scanner does help.


5 answers to this question


On 5/26/2023 at 3:56 AM, Shane said:

Yes, to completely prevent any chance of data loss from 2 drives suddenly failing at the same time you'd need 3 times duplication. [...]

Appreciate the confirmation! I'm in the process of setting up a 3-2-1 system, with this being the first step; then I'll be using CloudDrive and have another drive ready to be loaded with data. I'll probably switch my pool to 3x duplication overnight, since it'll probably take a while to update.



If you have X times duplication then DrivePool will try to keep each file on X drives. The default behaviour when saving a file to the pool is to put it on whichever drive(s) have the most free space at the time.
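A toy sketch of that default placement rule (the drive names and sizes here are made up for illustration; DrivePool's real balancer is more sophisticated, but the "most free space" behaviour looks roughly like this):

```python
# Hypothetical pool of three equal drives, sizes in GB.
free_space_gb = {"drive1": 6000, "drive2": 6000, "drive3": 6000}

def place_file(size_gb, copies=2):
    # Pick the `copies` drives with the most free space right now,
    # then deduct the file's size from each of them.
    targets = sorted(free_space_gb, key=free_space_gb.get, reverse=True)[:copies]
    for drive in targets:
        free_space_gb[drive] -= size_gb
    return targets

for _ in range(6):
    print(place_file(100))  # with 2x duplication, each file lands on 2 drives
```

Because each placement drains the two fullest drives, files end up rotating across the pool rather than piling onto one drive.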

So if you had 2x duplication and 3 drives, any given file would be on 2 of those 3 drives. That means if 2 of your drives suddenly failed at random, then (assuming a bunch of equally sized files in a pool with default behaviour) in theory, on average, you'd have a 2-in-3 chance of keeping any given file.
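That 2-in-3 figure can be checked by brute force over every possible pair of failed drives (the drive letters here are just placeholders):

```python
from itertools import combinations

drives = {"A", "B", "C"}
file_copies = {"A", "B"}  # one file, duplicated on 2 of the 3 drives

failure_pairs = list(combinations(sorted(drives), 2))
# The file survives a two-drive failure if at least one copy
# sits on a drive outside the failed pair.
survivals = sum(1 for failed in failure_pairs if file_copies - set(failed))
print(f"{survivals} of {len(failure_pairs)} two-drive failures leave the file intact")
# → 2 of 3 two-drive failures leave the file intact
```

The only fatal case is the failed pair being exactly the two drives holding the copies, which is 1 of the 3 possible pairs.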

Basically if you're using DrivePool just by itself, to completely eliminate the risk of losing any files if N drives simultaneously fail you need to use N+1 times duplication.

If you're using something like SnapRAID to provide protection for your DrivePool, you can (assuming a few details) instead dedicate N drives to parity to protect against N simultaneous drive failures. If I remember rightly, this becomes more storage efficient once the total number of drives exceeds N+1. The tradeoff is that SnapRAID's parity drives need to be updated after any files in the pool change, rather than DrivePool's real-time duplication, so if your files are changing all the time it may not offer enough protection by itself. You could of course use both duplication and parity if you had enough drives.
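A back-of-envelope comparison of the two approaches, using the asker's 3x 6TB pool as the example (N is the number of simultaneous failures to survive; all drives assumed equal size):

```python
def usable_tb(total_drives, n, drive_tb):
    """Usable capacity: (N+1)x duplication vs N SnapRAID parity drives,
    assuming every drive in the pool is the same size."""
    duplication = total_drives * drive_tb / (n + 1)  # N+1 copies of everything
    parity = (total_drives - n) * drive_tb           # N drives given over to parity
    return duplication, parity

dup, par = usable_tb(total_drives=3, n=1, drive_tb=6)
print(f"2x duplication: {dup:.0f} TB usable; 1 parity drive: {par:.0f} TB usable")
# → 2x duplication: 9 TB usable; 1 parity drive: 12 TB usable
```

Plugging in other drive counts shows the parity approach pulling ahead whenever the total number of drives exceeds N+1.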



I currently don't use SnapRAID; my only protection at the moment is DrivePool. I'm not too worried about the drives suddenly failing, since I have Scanner, which should help with recognizing if one is going bad. I'll have to think on the 3x duplication but will probably end up doing it. It doesn't hurt to get a bit more protection.
 

P.S. Want to make sure I'm reading this correctly: the N+1 in this situation would just be 3?

Edited by WingedAngel
Added something extra cause am dumb


Yes, to completely prevent any chance of data loss from 2 drives suddenly failing at the same time, you'd need 3 times duplication. Note that Scanner doesn't protect against sudden failures; that's why they're called "sudden". Scanner protects against the type of failure that takes longer to kill your drive than you/DrivePool will take to rescue any data you want off it.

Basically there are what I'd consider to be four types of drive failure:

  1. Sudden - bang, it's dead. This is what things like Duplication and RAID are meant to protect against.
  2. Imminent - you get some warning it's coming. Duplication and RAID also protect against this type, and Scanner tries to give you enough time for pre-emptive action.
  3. Eventual - no drive lasts forever. Scanner helps with this too by notifying you of a drive's age, duty cycle, etc, so you can budget ahead of time for new ones.
  4. Subtle - the worst but thankfully rarest kind, instead of the drive dying it starts corrupting your data. Scanner can sometimes give clues, otherwise you need some method of being able to detect/repair it (e.g. some RAID types, SnapRAID, PAR2, etc) or at least having intact backups elsewhere. DrivePool might help here, depending on whether you notice the corruption before it gets to the other duplicate(s).

If it helps any, I suggest following the old 3-2-1 rule of backup best practice, which means having at least three copies (production and two backups), at least two different types (back then it was disk and tape, today might be local and cloud) and at least one of those backups being kept offsite, or some variant of that rule suitable for your situation.

For example, my setup:

  • DrivePool with 2x duplication (3x for the most important folders) to protect against sudden mechanical drive failure on the home server.
  • Pool is network-shared; a dedicated backup PC on the LAN takes regular snapshots to protect against ransomware and for quick restores.
  • Pool is also backed up to a cloud provider to protect against environmental failures (e.g. fire, flood, earthquake, theft).
