
Subpools or Drive Groups Within a Pool


Guynamedbilly

Question

Hello again.  I'm trying to figure out if there is any way to subdivide drives within a single pool so that I can prioritize which ones duplicate data first. 

 

My scenario is this: I've got two external USB 3.0 JBOD drive carriages that hold 4 disks each. If I just add all of the disks to a DrivePool, the data is spread across the drives fairly evenly, no matter which carriage they are in. I'd like the duplicated data to be written to each separate carriage whenever there is enough available space to do so. This would provide better performance, since the copies are not all going through the same USB interface, and also better security, since if one carriage were damaged the other would have a better chance to survive.

 

I know that you can limit which individual drives receive duplicated data; is there any way to do that for groups of drives?


Recommended Posts


No, we don't have anything like that yet. 

 

Specifically, we do plan on adding a "Duplication Grouping" feature to StableBit DrivePool in the near future, especially as it would work VERY well with StableBit CloudDrive, but also for external storage.

 

However, we are a small company and have limited resources. Once StableBit CloudDrive has been released, Alex (the developer) plans on going through the backlog of requests (including this one) and addressing as many of them as we can.

 

Unfortunately, we don't have an ETA on this. 


It's in Alpha. :)

http://dl.covecube.com/DrivePoolWindows/beta/download/changes.txt

 

  • [D] Added hierarchical pooling support:
    - Pools are now made up of either disks or other pools.
    - Each pool handles its own separate folder duplication settings.
    - Balancing can work over a pool part, but file placement settings are limited to the entire pool part only.
    - Circular pooling is not allowed (e.g. Pool A -> Pool B AND Pool B -> Pool A). Contrary to popular belief, this does not lead to infinite disk space.
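The hierarchical-pooling rules above (pools made up of disks or other pools, no circular membership) can be sketched as a small graph check. This is an illustrative Python model only — the `Pool` class and its methods are hypothetical, not DrivePool's actual code:

```python
# Illustrative model of hierarchical pooling (hypothetical; not DrivePool's code).
# A pool is made up of disks and/or other pools; circular membership is rejected.

class Pool:
    def __init__(self, name):
        self.name = name
        self.members = []  # disk names (strings) or other Pool objects

    def _reachable(self):
        """All pools reachable from this one by following membership downward."""
        seen = set()
        stack = [m for m in self.members if isinstance(m, Pool)]
        while stack:
            p = stack.pop()
            if p not in seen:
                seen.add(p)
                stack.extend(m for m in p.members if isinstance(m, Pool))
        return seen

    def add(self, member):
        # "Circular pooling is not allowed": adding a pool that can already
        # reach us (or is us) would create a cycle -- and no, it would not
        # lead to infinite disk space.
        if isinstance(member, Pool):
            if member is self or self in member._reachable():
                raise ValueError(f"circular pooling: {self.name} <-> {member.name}")
        self.members.append(member)

# Pools are made up of either disks or other pools:
pool_a, pool_b, pool_c = Pool("A"), Pool("B"), Pool("C")
pool_a.add("disk1")
pool_b.add("disk2")
pool_c.add(pool_a)  # Pool C is a pool of pools
pool_c.add(pool_b)
```

Attempting `pool_a.add(pool_c)` after this would be rejected, since Pool C already contains Pool A.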

Created an account here just to let you know this is the only missing feature that I would love to see :-)

 

It's in the beta version, but this feature is very, very new. It may still have bugs (it probably does). And we know that the UI needs an overhaul to handle things better, as this will cause the UI to get complicated, fast.

 

But if you want to go ahead and test it out, by all means, do so.  Just let us know if you run into any issues with it. 

http://dl.covecube.com/DrivePoolWindows/beta/download/?C=N;O=D

 

Grab the 2.2.0.744 or 2.2.0.745 build, and give it a spin.


Not sure if I am using this new feature correctly, but are there any recommendations as to which features should be enabled/disabled?

 

My setup

Sata card 1, 7 disks, Pool A, no duplication

Sata card 2, 7 disks, Pool B, no duplication

 

Pool A + Pool B = Pool C, with duplication enabled so two copies of all files on each sata card.

 

I have changed a disk in Pool B and it is balancing at the moment. I think this is causing issues with duplication on Pool C, as it starts, runs, finishes, and starts again. Should I turn balancing off on Pools A & B, since it really isn't required, or just let it all work itself out over time?



Understood.  Retaining elegance as flexibility increases is, um, hard.


I don't have any answer for you because I'm new here.

 

But I'm curious: is your intent to have one copy of each file in Pool A and one copy in Pool B? That's why I want the feature. I want my duplication to happen across interfaces for speed.

 

This statement is unclear to me: "two copies of all files on each sata card".


I'd recommend letting it sit, as over time, yes it should work itself out. 

 

That said, I'm filing a bug for this, because it looks like we'll need to make sure that the service is much more aware of what is going on, to prevent conflicts like this.

 

Could you get tracing logs from when this is happening, though? 

http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection

 

 


 

Yeah, and this is far from a simple thing.

 

From the sounds of it, he has two controller cards.  He wants one set of files on each controller. That way if one controller "goes out", all of the data is stored on the other controller, so all of the data is still up and running. 

 

There are others that want this for internal vs external, multiple external enclosures, local vs cloud, etc.   Basically, it's a way to create "discrete units" to manage the duplication better. 



 

 

Yes, sorry: one copy of each file on each SATA card. Two reasons for this: the obvious one is duplication of data on separate cards in case of failure; the second is that sending and receiving data from two cards at the same time should be faster than duplicating on the same card. The duplication seems to have settled down after the balancing, so I will leave the logging. But it seemed to be balancing Pool A or B and duplicating Pool C at the same time. It would be better if those tasks ran separately.

 

An idea for the future: detect the pool within a pool and automatically disable features that aren't required, or prioritize the tasks based on user preferences. E.g. a disk replacement or addition will start a balancing process; in the case of a replacement disk, where data is not duplicated in Pool C, it would be better to duplicate Pool C first and balance Pool A after a successful duplication. This would also create less balancing work, as the duplication process should use the empty drive in Pool A, therefore requiring less balancing.
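The prioritization idea above could be modeled as a simple task ordering. This is a hypothetical sketch, not how the DrivePool service actually schedules work; the task kinds and pool names are made up:

```python
# Sketch of the suggested prioritization (hypothetical; not DrivePool's
# scheduler). Duplication of the top-level pool is ordered before
# balancing of the sub-pools.

PRIORITY = {"duplicate": 0, "balance": 1}  # lower value runs first

def run_order(tasks):
    """tasks: list of (kind, pool) tuples; returns them in execution order.
    sorted() is stable, so equal-priority tasks keep their submission order."""
    return sorted(tasks, key=lambda task: PRIORITY[task[0]])

# A replaced disk queues a balance on Pool A while Pool C still needs duplication:
order = run_order([("balance", "Pool A"), ("duplicate", "Pool C")])
```

Here the Pool C duplication would run first, then the Pool A balance, matching the suggested behavior.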

 

At the moment, I have Pool A and B configured with all performance options enabled, balancing for drive overfill and volume equalization, no rules.  Pool C has performance options enabled except bypass system filters, pool duplication enabled, no balancing and no rules.


Hmm monitoring a problem at the moment.

 

Pool duplication is enabled on Pool C. I had a look at folder duplication on Pool C, set the $Recycle.bin to 1x, and then noticed the unduplicated data growing. I rechecked the folder duplication and random folders were set to 1x. I tried changing a single folder back to 2x and got a "task failed" error. I disabled pool duplication, enabled it again, rebooted the server, and checked that folder duplication is all now 2x. It now seems to be duplicating the data again.



A simple solution for what you want would be to disable all duplication within DrivePool and use a separate program to sync the two pools, thereby achieving the redundancy.

 

I'll be doing this with a program I've been using for years to back up PCs to a NAS: http://www.smartsync.com/

 

The only downside I can see here is that you won't get the read boost if DrivePool isn't doing the duplication.



 

If you see this happen again, enable drive tracing on all of the pools and reproduce. 

http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection


Wow! Knowing that you guys are working on this really makes me happy. I've been waiting for over 1.5 years now, I think.

 

From the changes.txt, I gather that I could do something like this:

1. Create Pool1  from HDD1 & HDD2 - no duplication

2. Create Pool2 from HDD3 & HDD4 - no duplication

3. Create Pool3 from Pool1 & Pool2 - x2 duplication?

 

Could I also add HDDs to Pool3? It would seem weird, but it might be nice for SSD caches. Although I think I'd rather add an SSD cache to Pool1 and Pool2 separately.

 

Had to laugh about "Circular pooling is not allowed (e.g. Pool A -> Pool B AND Pool B -> Pool A). Contrary to popular belief, this does not lead to infinite disk space." - :D


Yeah, it's something that Alex has been wanting to add for a while (or something to allow "grouping" in general). 

 

As for the pools, yes, you could do that, exactly.  You could set the duplication status on the underlying pools as well.

 

And yes, you should be able to add drives to "Pool 3" (the pool of pools) as well. Just remember that the file systems have to match, so the "pool of pools" will be NTFS, as that is what the pool registers as.

 

And yeah... You notice how it skipped two builds? I believe that Alex didn't add a check for circular pooling in the first build, so this was a bit of self-deprecating humor. :)



 

Have you seen this again, or are you able to replicate it at will?

 

 

And how does DPCMD behave when you want to get a dump of the MOAP (Mother Of All Pools)?

 

It dumps one pool at a time, as per normal. You specify which pool you want to run this on, and it grabs the info from THAT pool only.

 

That's how it's always worked.  So, if you want a "master list", I guess you'd want to run it on the main pool, and then run it on the sub-pools.

 

Otherwise, the pool of pools will treat the underlying pools as normal disks.
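Since dpcmd reports one pool at a time, assembling a "master list" means merging the per-pool output yourself. A rough sketch of that merge step in Python — the listings here are made-up stand-ins for dpcmd output that has already been parsed into `{pool: [paths]}` form:

```python
# dpcmd reports one pool at a time, so a "master list" has to be merged by
# hand. Sketch only: the listings below are hypothetical stand-ins for
# per-pool dpcmd output, already parsed into {pool: [paths]} form.

def master_list(per_pool):
    """Merge per-pool file listings into {path: [pools holding a copy]}."""
    merged = {}
    for pool, paths in per_pool.items():
        for path in paths:
            merged.setdefault(path, []).append(pool)
    return merged

listing = master_list({
    "Pool A": [r"\Media\movie.mkv", r"\Docs\notes.txt"],
    "Pool B": [r"\Media\movie.mkv"],  # the duplicate copy on the other sub-pool
})
```

A path that maps to both sub-pools is duplicated; a path that maps to only one is not.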



I have just upgraded to 746. I changed the $Recycle.bin to 1x duplication and checked that the pool and folder duplication was still 2x for everything else. Also, the "task failed" error has gone, as per the 746 change log. I am doing some logging at the moment, so I will post it through if I see any odd behavior.

 

Edit: Actually, something I am seeing a lot is that the real-time duplication doesn't seem to work. This may sound more drastic than it actually is. I am seeing large files copied, moved, or overwritten in the pool, and then the master pool will say, as an example, that 8.81GB is not duplicated, so it kicks off a duplication run; when it is finished I have 5xMB unduplicated. I am wondering if the <random file names>.copytemp files from Pool A and Pool B are being picked up when and if those pools balance themselves, and if a balance in Pool A or Pool B will affect the duplication in the master pool. I will post some logs when I have the time to work on this.

 

Edit 2: Logs uploaded for continual duplication runs.


So DPCMD on the POP will show two devices, which are in fact Pool1 and Pool2? Not the actual underlying physical devices?

 

Yes, I believe that is the case. I haven't had a chance to really play around with this, but based on how I know the software works, this should be the case (basically, it treats the pools like any other normal disk).

 


 

If you're able to reproduce this, could you run "dpcmd X:\$RECYCLE.BIN" (where "X:" is the top-level pool) both before and after, to verify the duplication level?

Ignore the list of folders, and look at the "expected number of copies" value.

 

 

If you can reproduce this, then enable file system logging, reproduce, and then turn off logging. Upload the logs.

 

As for the real time duplication, I'll see about reproducing this later. 

But could you use the "dpcmd list-pool-fileparts" command on the specific file, to verify whether it's at both locations?

If that's having issues, then please do the above (enable logging, reproduce the issue, turn off logging, and upload the logs).

 

 

As for "copytemp", this is what we use for balancing and duplication. This way, if something happens (bSOD, crash, sudden power loss, etc), that there isn't a partial file on the destination, leaving you with a corrupted file.  So if you're seeing this, it means "it's doing something". 


Ok, I am about to upload some more logs. This is the process I have been using to cause the pool issues. My setup again: Pool A and Pool B, 7 disks each, no duplication; Pool C contains Pool A and B, with duplication enabled.

 

There have been files in the pool that, according to DrivePool, have not been duplicated. I thought the directory structure might be too long, so I renamed a folder, say from Stuff to St; note the contents of the folder were around 100k+ small files ranging from 1 KB to 500 KB. This seems to kick off some events. The other time I saw this process, I was deleting a large directory structure full of 500k metadata files: txt, xml, jpg, etc.

 

My visual monitoring suggests:

- Pool A and B are fine.

- Pool C has issues, a duplication run starts.

- I see the unduplicated space growing.

- I check the folder duplication; even though I use pool duplication, random folders are set to 1x, not 2x.

- I wait and eventually the unduplicated space stops growing.

- Checking the folder duplication again I see that all folders are back to 2x.

- Then duplication starts again after DrivePool checks and says the pool is not consistent.

- I have manually checked the files and folder structure and see that there are missing files and folders.

- After a day or so the data is duplicated again.

 

So it looks like large delete, rename, or move operations cause strange things to happen with the duplication settings. Hopefully someone can reproduce this issue, as I think it has happened about 4 or more times now.


Well, after leaving the server for a few days, it seems to have stopped the repeated duplication runs. There is only 4kb that is unduplicated now.

 

Does DrivePool support the Windows Server 2016 long path names feature? They have finally got rid of the 260-character limit.

 

Can I use DPCMD to find the "other" files in a pool or on a pooled drive?


For the long file names, StableBit DrivePool doesn't care. We use UNC paths (\\?\Volume{GUID}\PoolPart.GUID2\path\to\content.txt), which have a 32k character limit.

 

The 260-character path limit is a limitation of the Win32 API, which Explorer relies on.
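To illustrate the two limits being contrasted here — the classic 260-character Win32 MAX_PATH versus the roughly 32K extended-length (`\\?\`-prefixed) form — a small Python sketch; the `extended` helper is purely illustrative, not part of any real API:

```python
# The two limits contrasted above: classic Win32 MAX_PATH (260 characters,
# what Explorer enforces) versus extended-length "\\?\" paths (~32K). The
# `extended` helper below is purely illustrative.

MAX_PATH = 260          # classic Win32 limit
EXTENDED_LIMIT = 32767  # approximate NT-level path-length ceiling

def extended(path):
    r"""Prefix an absolute drive-letter path with \\?\ (no-op if already there)."""
    if path.startswith("\\\\?\\"):
        return path
    return "\\\\?\\" + path

# A path deep enough to break the classic limit but fine in extended form:
deep = "C:\\" + "\\".join(["folder"] * 60) + "\\file.txt"
long_form = extended(deep)
```

`deep` here is over 400 characters, so Explorer's classic limit would reject it, while the `\\?\` form stays far under the NT ceiling.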

 

 

As for the 4k remaining... 

http://community.covecube.com/index.php?/topic/1587-check-pool-fileparts/

 

This should pick up what file is not duplicated.


So I moved to 2.2.0.746; all good. I now have two unduplicated pools of single 4TB HDDs, partitioned as 2x2TB, and a third pool that consists of the two, with duplication.

 

Big upside for me: the issue where duplicates can be stored on the same physical HDD is now "solved" (well, worked around). Because this now allows me to continue to use 2TB volumes (given the limitation of WHS2011 Server Backup) with larger pools, I can keep WHS2011 for quite some time. This is a real money saver for me. Thanks!

 

I would have chosen a different implementation, as I now need three pools to accomplish this. I would rather have had the option to define groups within a single pool. However, the current implementation has the benefit that it uses, I would think, sort of the same code as opposed to something new, and the additional overhead (I assume that for each I/O, the service now needs to make three calls) is presumably very small anyway.


Well, glad to hear it! 

 

And yeah, the groups within the pool would be easier in a number of ways.  But this allows for a lot more flexibility without adding massive complexity to the balancing code.   

 

As for the I/O, it's less a matter of "calls", but yeah: the I/O calls are redirected to the underlying disks (much like a reverse proxy for a web server).

