  • 0

Rules for duplication and 'balancing'


PoolBoy

Question

Disk-A = Big speedy disk I want to contain all of my files.
Disk-B = Small speedy disk. Should be used as DrivePool sees fit.
Disk-C = Small speedy disk. Should be used as DrivePool sees fit.
Disk-D = Slow disk. Likely also the most unreliable disk. Should only be used when other disks are full.

So my rules are:
1 copy of all my files on Disk-A
1 copy of my files spread over the other 3 disks.
Keep Disk-D as empty as possible at all times.

Not sure if it qualifies as balancing, but when space gets freed up on Disk-B and/or Disk-C, DrivePool should move files from Disk-D back to those disks.
 


12 answers to this question


  • 0

You could create a Pool from disks B, C and D, no duplication; let's call this Pool Q. Then you can create a Pool P that consists of Disk A and Pool Q, duplication x2. That ensures duplication with one copy on disk A. Personally, I would create a Pool R with disk A so that it is easy to add another speedy disk. I would want each of Pools Q and R to have enough space so that the failure of one can easily be corrected by DP.
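
As a rough illustration of what this layout means for capacity, here is a small arithmetic sketch. It is not DrivePool's API; the sizes of the small disks are made up, and only the layout (disk A on one side, B + C + D on the other) comes from the post.

```python
# Hypothetical sizes in TB; only the layout (A versus B+C+D) comes from the post.
disks = {"A": 8.0, "B": 1.0, "C": 1.0, "D": 2.0}

def pool_size(member_sizes):
    """A pool without duplication offers the sum of its members."""
    return sum(member_sizes)

def duplicated_capacity(member_a, member_b):
    """With x2 duplication over two members, every file needs one copy on
    each member, so the smaller member limits how much data fits."""
    return min(member_a, member_b)

pool_q = pool_size([disks["B"], disks["C"], disks["D"]])  # Pool Q: B + C + D
pool_p = duplicated_capacity(disks["A"], pool_q)          # Pool P: A + Q, x2

print(f"Pool Q: {pool_q} TB, usable in Pool P: {pool_p} TB")
```

With these made-up sizes the three small disks hold 4 TB together, so Pool P can hold 4 TB of duplicated data even if disk A is larger.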

I think Stablebit has an add-in that prioritises certain disks in a Pool, that might help for allocation between disks B, C and D.


  • 0

Thanks for the advice, Umfriend!

I now have the following:

Everything on a single disk
+
A copy on a pool which consists of 3 disks.

 

Now I want to join them in a new Pool that supports duplication.
I do know how to create a pool, but I am a bit afraid of what DrivePool does if I put the disks in one pool.
It will most likely notice all files are duplicated but at that point I haven't had the chance of configuring duplication.
Will DrivePool automatically start wiping one copy?

 


  • 0

None of the drives are in a Pool yet? Then it is simple: No. In fact, DP will not delete any data on any HDD when you add it to the Pool and such data will not show up in the Pool(s).

Imagine a disk D:\ with data on it in the root and a folder D:\Data. When you add that disk to a Pool, DP will create a hidden PoolPart.* folder. So you would have:

D:\
D:\Data\
D:\PoolPart.*\

Only what is in the PoolPart.* folder is part of the Pool and everything that has to do with duplication and balancing only applies to the contents of that folder. Root and D:\Data will not be affected in any way.
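
To make the distinction concrete, here is a minimal Python sketch that checks whether a path lies inside a PoolPart.* folder of a drive. The folder name "PoolPart.example" is a placeholder (the real suffix is generated by DP), and the demo runs on a throwaway directory that stands in for D:\.

```python
import tempfile
from pathlib import Path

def find_poolparts(drive_root):
    """Return the PoolPart.* directories directly under a drive root."""
    return sorted(p for p in Path(drive_root).glob("PoolPart.*") if p.is_dir())

def is_pooled(path, drive_root):
    """True if `path` is a PoolPart.* folder of `drive_root` or lies inside one."""
    path = Path(path).resolve()
    for part in find_poolparts(drive_root):
        part = part.resolve()
        if path == part or part in path.parents:
            return True
    return False

# Demo: a temporary directory stands in for D:\ (placeholder names throughout).
with tempfile.TemporaryDirectory() as drive:
    (Path(drive) / "Data").mkdir()
    (Path(drive) / "PoolPart.example" / "Movies").mkdir(parents=True)
    pooled = is_pooled(Path(drive) / "PoolPart.example" / "Movies", drive)
    unpooled = is_pooled(Path(drive) / "Data", drive)

print(pooled, unpooled)
```

Anything for which is_pooled returns False is left alone by duplication and balancing, as described above.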

So let's assume you have disks D:\ to G:\ with D:\ being the big fast disk. What I would do is:

1. Create Pool Q, add only disk D:\
2. Through explorer, _move_ the data on D:\ that you want to be in the Pool to the hidden D:\PoolPart.* folder.
3. Create Pool R, add disks E:\ F:\ and G:\
And here it becomes tricky. If you now create a Pool P by adding "disks" Q and R, with x2 duplication, DP will think that R does not have a duplicate (because the PoolPart.* folders are empty) and copy everything from Q to R. You *could*, I guess, first move the data on E:\ through G:\ (disk by disk) into their PoolPart.* folders. However, I am not sure how DP will react to potential differences in time stamps on the files (date created/modified).

So, personally, I would copy all data from E:\ through G:\ to some other disk (external perhaps) and then format them first. Then do step 3 and then
4. Create Pool P using Q and R and set duplication to x2.
5. Let DP do its magic (which may run for quite a bit of time).

If there is no possibility to back up the data but the data on Q is smaller than the free space on R, then you could still do steps 4 and 5 and then, after you are satisfied that everything is in order, delete the data on E:\ through G:\ that is not in the PoolPart.* folder (but was present on D:\ or Q:\; don't delete data that you do not have elsewhere).

Failing that space, you could consider simply formatting and doing steps 4 and 5 with the idea that you still have a copy on D:\ (=Q:\) and will have duplicates again soon. Alternatively, you could move the data on those disks into the PoolPart.* folders, but again, I am not sure what issues you might run into. As an example, say there is a file called "MyFile.doc". If on Q:\ it is in folder "MyFiles" and on R:\ it is in folder "MyBackupFiles", then DP will not know that these are the same file. Rather, it will create duplicates of both, causing 4 copies to exist.
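
Before seeding both sides, one could check for exactly this mismatch. The following is a sketch only: it compares relative paths and file names in two folder trees (not file contents), and the demo folders stand in for the PoolPart.* folders behind Q:\ and R:\.

```python
import tempfile
from pathlib import Path

def relative_files(root):
    """Return the set of file paths below `root`, relative to `root`."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}

def compare_trees(root_q, root_r):
    """Split files into those at the same relative path in both trees and
    those that share a name but sit in different folders."""
    q, r = relative_files(root_q), relative_files(root_r)
    same_path = sorted(q & r)
    by_name_r = {}
    for path in r - q:
        by_name_r.setdefault(Path(path).name, []).append(path)
    relocated = {path: sorted(by_name_r[Path(path).name])
                 for path in sorted(q - r) if Path(path).name in by_name_r}
    return same_path, relocated

# Demo on throwaway folders that stand in for the two PoolPart.* folders.
with tempfile.TemporaryDirectory() as q_root, tempfile.TemporaryDirectory() as r_root:
    for root, folder in ((q_root, "MyFiles"), (r_root, "MyBackupFiles")):
        (Path(root) / folder).mkdir()
        (Path(root) / folder / "MyFile.doc").write_text("same content")
        (Path(root) / "Shared").mkdir()
        (Path(root) / "Shared" / "notes.txt").write_text("also the same")
    same_path, relocated = compare_trees(q_root, r_root)

print(same_path)   # files that line up on both sides
print(relocated)   # files that would likely end up as 4 copies
```

Files listed in `relocated` are the candidates to rename or move on one side before letting DP duplicate.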

One last thing: If you ever use the move files trick to "seed" a Pool, remember to stop the DrivePoolService before you do that and restart it when finished.
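
A minimal sketch of that seeding sequence in Python, under some assumptions: the service name is taken from the post as written, `net stop` / `net start` need an elevated prompt on Windows, and service control is off by default so the demo can run anywhere. "PoolPart.example" is a placeholder for the real hidden folder.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

SERVICE_NAME = "DrivePoolService"  # as named in the post; verify on your system

def seed_folder(folder, poolpart, control_service=False):
    """Move `folder` into `poolpart`, optionally stopping the service first
    and starting it again afterwards. Returns the new location."""
    folder, poolpart = Path(folder), Path(poolpart)
    if folder.anchor != poolpart.anchor:
        # Seeding only makes sense as a move within one volume.
        raise ValueError("folder and PoolPart folder must be on the same drive")
    if control_service:
        subprocess.run(["net", "stop", SERVICE_NAME], check=True)
    try:
        return Path(shutil.move(str(folder), str(poolpart)))
    finally:
        if control_service:
            subprocess.run(["net", "start", SERVICE_NAME], check=True)

# Demo: a temporary directory stands in for D:\ (no service is touched here).
with tempfile.TemporaryDirectory() as drive:
    data = Path(drive) / "Data"
    data.mkdir()
    (data / "file.txt").write_text("hello")
    poolpart = Path(drive) / "PoolPart.example"
    poolpart.mkdir()
    new_location = seed_folder(data, poolpart)
    seeded = (poolpart / "Data" / "file.txt").exists()
    source_gone = not data.exists()

print(new_location.name, seeded, source_gone)
```

The `finally` block is there so the service is started again even if the move fails halfway.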

Hope this helps.


  • 0

So your suggestion is:
Build a pool with 3 small disks.
Build a pool with 1 big disk.
Put both pools in a new pool. (MasterPool)
Then fill that pool.

Then the MasterPool consists of just 2 disks and duplication always is done using both pools.
That's good, because that is what I want.
Right?


Right now I have 2 non-duplicating pools.
The pools aren't related in any way.

It's getting too long-winded to explain in detail, but I did build the pool consisting of the small disks in such a way that one disk is 70% empty (as I want).
I don't know how that can be done if I give DP full control, because it will most likely spread the files evenly over all small disks.

But safety before all!
If my method carries any risk I'll just wipe the pool with small disks.
Build a MasterPool consisting of 2 pools.
Then let DP just fill and duplicate.
That's gonna take a whole day....
It won't place the files as I like them to be placed but data is secured.
Then later I may figure out file placement rules that force one disk to be as empty as possible.


BTW all my 3TB source files are on the big 8TB disk that will be part of the pool.
So following your advice.
Set up everything but leave it empty.
Stop service.
Move files.
Start service.
Configure duplication.
Wait.....

 

 

I found a rule that sets how full disks may preferably become.
If I could set that on per disk basis my problems would be instantly solved.
I would set one disk to 1% and the others to 100%.
That would force balancing as I want it.
But the option doesn't work per disk but is global :-(


  • 0
1 hour ago, PoolBoy said:

I found a rule that sets how full disks may preferably become.
If I could set that on per disk basis my problems would be instantly solved.
I would set one disk to 1% and the others to 100%.
That would force balancing as I want it.
But the option doesn't work per disk but is global :-(

Rather than setting the slow faulty drive to 1%, why not limit the placement of the folders to not include that drive...

...however, along with having duplication enabled & setting the File Placement of folders to 99% on the 3 primary drives...

...set the Balancers to prioritise the "Drive Usage Limiter" above the "Duplication Space Optimizer" - whilst disabling the "Prevent Drive Overfill" option...

…& enable the "Allow files to be placed on other disks if the selected disks are full" option in the File Placement.

 

Well, this should only allow data in the selected folders to be placed on the 4th drive if there's not enough space to duplicate to either the 1st or the 2nd & 3rd drives (it's not clear which is smaller) - & then move the data off of the 4th drive once there's space to do so...

...whereas your 1% would tend to result in the 4th drive always having some data on.
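
As a toy model of the behaviour I'm describing (this is not DrivePool's actual algorithm, just a sketch with made-up numbers), placement with a 99% limit on the preferred drives and the 4th drive as overflow only could look like this:

```python
def place(size, used, capacity, preferred, overflow, limit=0.99):
    """Return the name of the disk that receives a file of `size`.
    Preferred disks are tried first, up to `limit` of their capacity;
    overflow disks are used only when no preferred disk has room."""
    for name in preferred:
        if used[name] + size <= capacity[name] * limit:
            return name
    for name in overflow:
        if used[name] + size <= capacity[name]:
            return name
    raise RuntimeError("no disk has room for this file")

# Made-up sizes in GB: B is nearly full, C has a little room, D is empty.
capacity = {"B": 100, "C": 100, "D": 100}
used = {"B": 98, "C": 92, "D": 0}

targets = []
for size in (5, 5):
    disk = place(size, used, capacity, preferred=["B", "C"], overflow=["D"])
    used[disk] += size
    targets.append(disk)

print(targets)
```

Here the first file still fits on C under the 99% limit, while the second only fits on D, which is the "only when the others are full" behaviour.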

 

As to creating a global setting ttbomk there's 2 potential options.

1. Either you could create a single folder that you place all of the other data folders on the pool into - & then do the placement limitation & whatnot from above...

...this would cover all of the data & folders - but would use up some of the max path length.

(you could also map that main folder to a drive letter)

2. OR you could create a "*.*" Rule in the File Placement - & then do the placement limitation & whatnot from above...

...which would cover all of the data, not the folders - & would save a bit of the path length.


  • 0
3 hours ago, PoolBoy said:

So your suggestion is:
Build a pool with 3 small disks.
Build a pool with 1 big disk.
Put both pools in a new pool. (MasterPool)
Then fill that pool.

Then the MasterPool consists of just 2 disks and duplication always is done using both pools.
That's good, because that is what I want.
Right?

 

Yes, that is the idea.

I don't think that file placement rules are the best way to deal with this for you. Rather, I would try the Ordered File Placement balancer. It is not installed by default; you'll need to download and install a plug-in (https://stablebit.com/DrivePool/Plugins). Read the notes carefully, as the default behaviour is not what you want, but I think it has the options to make it suit you. Caveat: I have not used this plug-in.

Oh, and on:

So following your advice.


Set up everything but leave it empty.
Stop service.
Move files.
Start service.
Configure duplication.
Wait.....

I would configure duplication first, so set-up, set duplication, stop service, move files, start service, wait.

I have to say, I am rather curious about the exact HDDs you are using. And, to be frank, if a disk is suspect I would at the least ensure I have enough disks in any Pool to ensure that, should it fail, DP has the space available to reduplicate to other disks (and as such I would have a second large fast HDD in the Pool with the 1 big disk). Sure, you would already have duplication but in case of a disk failure, say the big fast one, you would still suffer downtime as the Pool will be in read-only mode and you would not actually have duplication until the matter is resolved.

Also, got scanner? I would advise it.


  • 0

Revelation time for me!
I didn't even know there was a plugin repository...
Yeah the plugin you suggested sounds exactly what I need.
"StableBit DrivePool will place new files in a way that will fill one disk at a time"

Yeah, I have scanner. Without it I wouldn't even have mentioned drive problems because it never gave me any problems. But now I know I want to play safe.

Why I even use all that junk...? :-)
Over the years I've gathered a lot of disks. Mostly replaced by SSD's and bigger and faster HDD's. I could just trash them, but with DrivePool I can give them a second life.
The data isn't irreplaceable. I can just redownload it. But that will take time and trouble.
So by using my old disks I add security without spending $, and I also add some speed.
If a disk fails I still can add another disk to the pool so DrivePool can duplicate the files that were on the crashed disk.
Those other disks I do not want to use permanently because I have a certain disk usage strategy based on disk speed.
Something like: SSD for system, programs and frequently used data. Big speedy HDD for secondary storage. The pool we are discussing is tertiary storage.

BTW, what's really important to me, no matter which disk it is on, is backed up hourly and daily and is stored on 3 different disks that are not part of any pool.

 


  • 0

Creation is already done. If I have to redo it I'll use my trick again.
I just remove the 'faulty' disk from the pool. When the other disks are full I add that disk again.
That does what the plugin does.

I hope the plugin, in cooperation with the balancer actively tries to keep the 'faulty' disk empty.
Meaning if 100GB is deleted from another disk, 100GB is moved to that disk from the 'faulty' disk.
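
What I'm hoping for, written down as a small Python sketch. This only models the wish, not what the plugin or the balancer really does, and the file names and sizes are made up.

```python
def plan_moves(files_on_d, free_space):
    """Pick files to move off disk D, largest first, into whatever free space
    the preferred disks have. Returns a list of (file, target_disk) pairs."""
    free = dict(free_space)  # work on a copy so the caller's data is untouched
    moves = []
    for name, size in sorted(files_on_d.items(), key=lambda item: -item[1]):
        for disk, available in free.items():
            if size <= available:
                free[disk] = available - size
                moves.append((name, disk))
                break
    return moves

# Made-up example in GB: 100GB was just deleted from disk B, disk C is full.
files_on_d = {"a.mkv": 60, "b.mkv": 40, "c.mkv": 30}
free_space = {"B": 100, "C": 0}

moves = plan_moves(files_on_d, free_space)
moved_gb = sum(files_on_d[name] for name, _ in moves)

print(moves, moved_gb)
```

In this example exactly 100GB goes back to disk B and only the 30GB file stays on the 'faulty' disk.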


  • 0
11 hours ago, Umfriend said:

Yes, that is the idea.

I don't think that file placement rules are the best way to deal with this for you. Rather, I would try the Ordered File Placement balancer. It is not installed by default; you'll need to download and install a plug-in (https://stablebit.com/DrivePool/Plugins). Read the notes carefully, as the default behaviour is not what you want, but I think it has the options to make it suit you. Caveat: I have not used this plug-in.

Whilst this would certainly work as an alt method, my reading of the plugin is that this benefit -

"Files copied at the same time will tend to be on the same disk. Because those files were copied at the same time, it stands to reason that they might be related. It can be beneficial, in terms of file recovery, to have related files be placed on the same disk."

- would only actually work long term if you were only ever adding files to the pool; never deleting them.

(now it 'may' be the case that this is going to be the OP's actual usage, but I can see nothing that states that it is)

 

Well, as soon as you've filled the 1st drive & are onto a 2nd drive, by deleting something from the 1st then either, assuming there's space for one or more (but not all of) the files the next set of data you write will end up being split between the 1st & 2nd drive - or the plugin will randomly move data from the 2nd to the 1st to fulfil the rule...

…& either way then, whilst you'd have a lovely arrangement on day one, this would deteriorate over time.

&, at least in theory, you 'could' end up in a situation where it was spending inordinate amounts of background time balancing & rebalancing to move data from the 2nd drive to the 1st... …& the 3rd drive to the 1st or 2nd... …&... ...as you deleted &/or replaced data over time.

I mean it's not even the case that the OP would have pairs of drives with the same data on - since they're using 1 big drive for one half of the duplication & (up to) 3 small ones for the other.

 

Yeah, the only way I'm aware of to actually keep data fixed to a drive within an environment where you are deleting/replacing files, is to not fill a/the drive/s - either by having spare capacity on a single drive or to have different placement rules for different folders/data types to keep spare capacity on all of the drives.


  • 0

I finally got it working the way I like it.

@PoolDemon

Files copied at the same time ending up on the same disk certainly is a bonus. But I must admit I never considered that when I started this thread.
 

Quote

this would deteriorate over time.

That must be the case unless DP is really smart and looks at the filenames when placing the files.


  • 0

The thing is, the question was not to keep files together on a disk, it was to not have them on disk D to the extent possible. File Placement Rules may do that for you but I am not sure once disks B & C are full. And keeping files on the same HDD using FPR will not work with a *.* rule. You would actually need to work out which folders to place where etc. and it could cause folder X, targeted for C, being split between Disk B and D when C is full.


  • 0
2 hours ago, Umfriend said:

The thing is, the question was not to keep files together on a disk, it was to not have them on disk D to the extent possible. File Placement Rules may do that for you but I am not sure once disks B & C are full. And keeping files on the same HDD using FPR will not work with a *.* rule. You would actually need to work out which folders to place where etc. and it could cause folder X, targeted for C, being split between Disk B and D when C is full.

Right, addressing these in a way that hopefully makes sense...

Firstly, this was looking at using the plugin you'd suggested to fulfil the OP's initial requirement that as little data as possible should ever be stored on the shonky 3rd drive. 

So they'd need the 'move existing data' bit enabled, otherwise it wouldn't move anything off of that drive if space became available on either of the other 2... ...which would be worse than the OP's idea of setting up a 99%/99%/1% rule, as that would at least move some data if more than 1% of the capacity was filled.

 

Secondly, part of the comment about the plugin was thinking aloud about how what it says it does vs how it'd practically work overall - as I think part of the description for it is misleading outside of an add data only environment - as it can't reasonably keep each set of data written together for all & ever whilst simultaneously placing data onto earlier drives where deletion's occurred &/or rebalancing to have the drives filled in order.

(& using folder placement rules appropriate to the various drives' capacities, so that no disks were ever filled, was the only way that I could think of to keep most of the data that would be written together on specific drives in the way that the plugin description described - which had nothing to do with using the *.* rule)

So, whilst I fully accept that this specific element of what I'd written may have been OT, depending on what the OP actually saw as being valuable about that plugin of course, I think it was still worth thinking about...

(Thirdly)

...however, x amount of it was also needed to explain why someone 'could' end up in a situation where, having filled the first drive (obviously I'm talking about one half of the duplication in the OP's situation) & had data on one or more other drives in the sequence, the thing was rebalancing semi-constantly to move data towards the lowest ordering of the drives. (see the first point)

 

Now, the key point for the OP's initial requirement was that this 'could' end up being a significant disadvantage of using that added plugin depending on their usage...

...whereas, using the standard balancers & adding rules akin to how I'd suggested to achieve the OP's original task then the only time DP would need to rebalance stuff in normal usage would be if BOTH disks 1 & 2 were filled to 99% & more data was added; which would obviously hit the shonky 3rd drive...

…& then some data was deleted so that there was space for that extra data to be moved off of the 3rd drive & onto either or both of the other two.

 

Anyway, it's obviously just a discussion - but I believe that my approach would be less likely to cause any unforeseen consequences, due to the way the plugin you're suggesting would have to be set up to fulfil the OP's initial requirements.
