
Drives dedicated to MECE file set vs duplicates file set?


blueman2

Question

I changed the configuration of my system so that one (slower) drive is dedicated to duplication.  My rationale: with Read Striping, I should get better access times, since one copy of every file (either the single copy if no duplication, or the 'original' of a duplicated pair) will always be on a faster drive, with only duplicates on the slower drive.  Also, the slower drive has a few re-allocated sectors, and if it does die, I will not lose any unduplicated material as long as it holds only duplicates.

 

However, the system is not balancing correctly for one of the drives.  Here is my setup:

 

[screenshot: current drive configuration]

 

 

And my Balancer settings:

 

[screenshot: Balancer settings]

 

 

And here is where I am stuck:

 

 

[screenshot: balancing result]

 

 

No matter how many times I tell it to re-balance, it will not do it.  It spends about 10 minutes doing the "Bucket Lists" pass, but then just stops in the "Pool not optimal" state.

 

I tried deleting all the program data and re-installing, but after much time re-indexing, it just returned to the same non-optimized state.

 

Any ideas what is going on?  

 

 


11 answers to this question

Recommended Posts


It's because you have Data1 set for unduplicated files only.  Your settings tell DrivePool that you want only duplicated files on Data3, but the duplicated files have to go somewhere, and that's why your pool isn't "balanced".  Don't think of "duplicated" as marking a second copy; it just means the file is duplicated and should have two copies at all times, not that the specific file on Data3 is the duplicate of some original.

To fix your issue, set Data1 to allow both duplicated and unduplicated files; you just need to check the Duplicated box as well.  You might have to re-measure afterwards.



 

AH!!!  Of course!  Thanks so much.  That explains why one drive (F) could be cleared but the other could not: by definition, you cannot have duplication with just one drive.  So, yeah.  Embarrassed.  My error was in misunderstanding the terminology of duplicated and unduplicated.  I was thinking that "duplicated" referred only to the second file, and "unduplicated" referred to single-copy files AND the other copy of a duplicated file set.

 

I still have to think about how to accomplish what I was trying to do, however.  I want drives E and F to hold a copy of every file in my pool, but never more than one copy.  Drive G would be for the copies of files that are duplicated on drives E and F.  Does that make sense?  I am close with the settings I have, but all files that are duplicated are forced to reside on either E or F, not mixed across both.  Perhaps that is not a problem if I balance E and F on a regular basis, though.



You can get your desired effect easily if you set up some rules.  Make a rule that forces all files to go to your G drive.

 

To do this, you make a custom rule that looks like this: \*

Then click on the rule and select only G.  You'll also need to go back to your balancer settings and make sure that duplicated and unduplicated are checked for all three drives.

 

What this rule does is tell DrivePool that you want every file placed in the pool to be stored on G, and because you have duplication on, it will make a copy of the file on either E or F.

 

I just realized that with this setup you will also have unduplicated files on G; to remedy that, just make a custom rule for each folder.  If you want, we can use TeamSpeak to figure out the best way.



I have about 100 rules set up to place my files.  The file placement rules are the best thing since sliced bread.

 

I have 72 rules just for my TV shows.

 

\ServerFolders\Videos\TV Shows\The A*

\ServerFolders\Videos\TV Shows\A*

 

I have these 2 rules set up for every number and letter.  These rules make sure that shows like The Americans (2013) don't get placed on the drive for shows beginning with T.
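As I understand DrivePool's file placement rules, they are evaluated top-down and the first matching pattern wins, which is why the "The A*" rule must come before the broader patterns.  The ordering can be sketched with Python's fnmatch (the rule list and target labels here are hypothetical, just to illustrate the matching behavior):

```python
from fnmatch import fnmatch

# Hypothetical rule list, checked top-down; first match wins (my
# understanding of how DrivePool evaluates file placement rules).
# Targets are illustrative labels, not real drive letters.
rules = [
    (r"\ServerFolders\Videos\TV Shows\The A*", "drive for 'The A' shows"),
    (r"\ServerFolders\Videos\TV Shows\A*",     "drive for 'A' shows"),
    (r"\ServerFolders\Videos\TV Shows\The T*", "drive for 'The T' shows"),
    (r"\ServerFolders\Videos\TV Shows\T*",     "drive for 'T' shows"),
]

def place(path):
    """Return the target of the first rule whose pattern matches the path."""
    for pattern, target in rules:
        if fnmatch(path, pattern):
            return target
    return "default placement"

# 'The Americans (2013)' hits the 'The A*' rule before it could fall
# through to the 'T*' rule, so it lands with the other 'The A' shows.
print(place(r"\ServerFolders\Videos\TV Shows\The Americans (2013)\s01e01.mkv"))
# → drive for 'The A' shows
```

Without the two "The *" rules on top, that same path would match `T*` and end up on the T drive, which is exactly the misplacement the paired rules prevent.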



I just ran across this active thread: Server Backup and duplication question.  While I am coming at it from another perspective, the feature we need is really the same one.  The linked post comes at it from a backup perspective.  I am coming at it from a desire to use a slow disk (or disks) of somewhat questionable health only for duplication, and to keep only a single copy of any file on all other disks.

 

In both cases, we want to designate a set of drives (a sub-pool) that will never contain duplicate copies of a file.  Every file in the top-level pool will exist once and only once within that sub-pool (a MECE set of files).  Then another sub-pool of drives will contain only duplicate copies of files.  The number of drives in this 'duplicates' sub-pool also defines the maximum number of copies possible (1 drive = 2X, 2 drives = 3X, etc.).

 

With 3 disks, I am able to do this.  I set #1 to Duplicated and Unduplicated (the default), #2 to Unduplicated only, and #3 to Duplicated only.  Drive #3 is my slow, questionable-health drive that I want to hold all my duplicates.  This gives me just what I want: within drives #1 and #2, each file exists once and only once, and every file exists somewhere on these two disks.  A nice MECE set.  Drive #3 contains only duplicates for those folders where duplication is turned on.
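As a sanity check on why this 3-disk arrangement works, here is a toy model of the constraint (just an illustration of which drives are eligible for each kind of file, not DrivePool's actual balancing algorithm):

```python
# Per-drive flags from the 3-disk setup above: may the drive hold
# unduplicated files, and may it hold (either copy of) duplicated files?
# Toy model of the placement constraint, not DrivePool's real logic.
drives = {
    "Disk1": {"unduplicated": True,  "duplicated": True},   # default
    "Disk2": {"unduplicated": True,  "duplicated": False},  # unduplicated only
    "Disk3": {"unduplicated": False, "duplicated": True},   # duplicates only
}

def eligible(kind):
    """Drives allowed to hold a file of the given kind."""
    return sorted(name for name, flags in drives.items() if flags[kind])

# A single-copy file may live on Disk1 or Disk2, never Disk3.
print(eligible("unduplicated"))   # → ['Disk1', 'Disk2']

# Both copies of a duplicated file must land on duplicated-capable drives,
# i.e. one on Disk1 and one on Disk3.  So Disk1 + Disk2 always hold exactly
# one copy of every file (the MECE set), and Disk3 holds only duplicates.
print(eligible("duplicated"))     # → ['Disk1', 'Disk3']
```

With 4 or more drives there are multiple drives in the duplicated-capable set besides the duplicates-only drive, so the flags alone can no longer pin each copy down, which is exactly where it gets complicated.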

 

However, if I have 4 or more drives, things get more complicated.    

 

So to sum it up, I want to be able to designate a MECE sub-pool and a duplicates-only sub-pool.  I can do the latter with existing functions.  I cannot do the MECE pool in any way I know of, unless I have just 3 drives.

 

Thoughts?



However, if I have 4 or more drives, things get more complicated.    

Definitely. 

 

However, the File Placement Rules can still help with this.  You could use them to isolate the contents of any given folder to just two disks.  It would take a lot more micromanaging to accomplish successfully, but it would work.

 

Another option is multiple pools: either an internal and an external pool, with software to sync them periodically (instead of backup software, though some sync software does support versioning, or at least date variables for the destination).

That, or a separate pool for each segment.



File Placement, I think, is a way to do this, but it is not elegant or easy at all, and ongoing maintenance of the rules will be required as my storage changes.  I will play with that.

 

I would love it if this feature could be added to DP in the future.  It could even be as simple as adding a rule option for every drive that says:

 

"Use this drive exclusively to keep a copy of every file that is targeted for duplication."  Doing this for 1 drive would allow 2X duplication; for 2 drives, 3X.  And in all cases, the selected drives would contain only duplicate data, and the other drives would not contain multiple copies of any single file.

 

I have to wonder if this functionality might be available in "Product 3"?  Since that product apparently plans to enable the use of pools located on remote (i.e. cloud) devices, it would be logical to assume you would need to be able to mark some of your data as local-only and some as local + remote.  You would use the remote pool only for duplication, always keeping one copy local for quick access.

 

What is the process for making such a feature request?  



Consider it requested. :)

(https://stablebit.com/Admin/IssueAnalysis/6317)

 

And the process outlined definitely isn't simple. Hopefully, we can find a better way to let you do this.

Thanks.  And yes, my choice of words was bad.  Nothing simple at all about this, as I think more deeply about how to implement it (without causing other issues).



Well, the choice of words wasn't bad, but it's definitely a complicated idea, especially if you're already very familiar with how DrivePool works.

 

And unfortunately, implementing this would be very difficult at best.  It would take a considerable rewrite of the entire codebase.

Because we don't have a clear definition of "original" and "duplicate", we would have to completely rewrite the balancing code and the file placement code to take these into account.

 

This has been requested a lot, because it does make backups easier.  But because of the sheer amount of rewriting involved, we've avoided doing so.

 

 

And it's for these reasons that I generally recommend a secondary pool and a file-based sync utility.  It's not perfect, but it's considerably easier.

