Confused about Duplication


mcrommert

Question

I started duplicating my 40 TB DrivePool and read this answer, which confused me and seems to contradict how I'm going about it.

 

I made a 40 TB CloudDrive and thought that if I set the original drives to unduplicated, set the cloud drive to duplicated, and enabled duplication for the entire pool, it would put a copy of every file in the original pool on the cloud drive. If a drive failed, I could easily rebuild from the cloud drive. Here are some settings:

 

 

[screenshot attached]

In that other thread you told the user to make another pool containing the original pool and the cloud drive, with both duplicated and unduplicated allowed on both. I am missing the (probably obvious) issue with my setup. Also, if I do need to do that, what is the most efficient way to start without losing anything or wasting the time already spent uploading to the cloud drive?




21 hours ago, mcrommert said:

I made a 40 TB CloudDrive and thought that if I set the original drives to unduplicated, set the cloud drive to duplicated, and enabled duplication for the entire pool, it would put a copy of every file in the original pool on the cloud drive. If a drive failed, I could easily rebuild from the cloud drive. Here are some settings:

That's not what will happen here.

This would put files that are not duplicated on the local disks.

But duplicated data would only be allowed to be placed on the CloudDrive, and since there is no second valid target elsewhere, the duplication can't actually be satisfied.

 

You need two or more drives for duplicated data, because by "duplicated" we mean any data that is duplicated. We don't differentiate between original and copy: both copies are valid targets and are handled identically.
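To put that as a rough sketch (illustrative pseudologic only, not DrivePool's actual placement code; the disk names are made up): xN duplication needs N distinct disks that are allowed to hold duplicated data.

```python
def place_duplicated_file(allowed_disks, copies=2):
    """Return the disks the copies would land on, or fail if there aren't enough."""
    if len(allowed_disks) < copies:
        raise RuntimeError(
            f"x{copies} duplication needs {copies} valid targets, "
            f"but only {len(allowed_disks)} disk(s) allow duplicated data"
        )
    return allowed_disks[:copies]  # any N distinct allowed disks will do

try:
    place_duplicated_file(["CloudDrive"])                  # the original setup
except RuntimeError as err:
    print(err)                                             # only one valid target -> stuck

print(place_duplicated_file(["LocalPool", "CloudDrive"]))  # ['LocalPool', 'CloudDrive']
```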

 

The settings you've configured say that you want unduplicated data on the local disks (i.e. files with only one copy in the pool), and that you want "both" copies of duplicated data on the CloudDrive disk.

 

 

I'm not sure exactly what you want here, so it's hard to tell what would be best. But assuming I'm reading this right, you want to keep one copy local and one on the CloudDrive, or have some data duplicated to the CloudDrive and all of the unduplicated data stored locally?
If so, then remove the CloudDrive disk from the existing pool.
Then create a new pool.
In this new pool, add the existing pool and the CloudDrive disk (it should have only two "disks" in it).

Then, in the Balancing settings for this "top level" pool, uncheck ONLY the "Unduplicated" option for the CloudDrive disk, and save.
This should place all of the duplicated data on both "disks", but only place unduplicated data on the local pool (sketched below).
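As a rough sketch of what those balancer settings amount to (illustrative only; the names are made up, not actual DrivePool identifiers):

```python
# Top-level pool after the steps above: the existing local pool plus the
# CloudDrive disk, with "Unduplicated" unchecked for the CloudDrive.
top_pool = {
    "LocalPool":  {"unduplicated": True,  "duplicated": True},
    "CloudDrive": {"unduplicated": False, "duplicated": True},
}

def allowed_targets(is_duplicated):
    """Which top-level 'disks' may hold this file."""
    key = "duplicated" if is_duplicated else "unduplicated"
    return [disk for disk, rules in top_pool.items() if rules[key]]

print(allowed_targets(is_duplicated=True))   # ['LocalPool', 'CloudDrive'] -> one copy on each
print(allowed_targets(is_duplicated=False))  # ['LocalPool'] -> unduplicated data stays local
```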

 

If this isn't what you want to do, please explain exactly what you're trying to do.


Apologies for being so confusing. I have a 40 TB pool that I have used for a long time, but I now want to duplicate the data to a 40 TB Google cloud drive. After the initial post, I realized some of my errors by reading through some of your older responses to other users. So what I need is for the local pool to be prioritized for writes and reads, but duplicated to the cloud drive so I can restore if a drive fails. I have set it up after reversing my old mistake, and I wonder if there is anything I may have still gotten wrong. From what you have said, I do need to turn off unduplicated content for the CloudDrive, which may explain why the drive thinks it's 80 TB instead of 40.

Even after making these changes, though, my two big concerns are that it still shows 80 TB as the size of the drive (instead of the 40 TB that are really available), and that writes still seem to initially go to the cloud drive. I have gigabit internet, so that's not a big deal, but any advice on my setup would be very welcome.

 

 

 

[screenshots attached]

 


Okay, so, you do want to do what I've posted above.

19 hours ago, mcrommert said:

So what I need is for the local pool to be prioritized for writes and reads, but duplicated to the cloud drive so I can restore if a drive fails

Actually, the RC version of StableBit DrivePool will automatically prefer local disks (and "local" pools) over "remote" drives for reads, so there's nothing you need to do here.

As for writes, if real-time duplication is enabled, there isn't anything we can really do: both copies are written out at the same time, in this case to the local pool and to the CloudDrive disk.

But the writes to the CloudDrive go to its local cache first and are then uploaded, and there are some optimizations there to help prevent inefficient uploads.
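Very roughly, the write path looks something like this (a simplified sketch with my own naming, not CloudDrive's actual implementation):

```python
import queue
import threading

upload_queue = queue.Queue()  # stand-in for the CloudDrive cache's upload backlog

def write_to_local_pool(data):
    pass  # copy 1: lands directly on a local pool disk

def write_to_clouddrive_cache(data):
    upload_queue.put(data)  # copy 2: committed to the local cache, upload deferred

def background_uploader():
    while True:
        chunk = upload_queue.get()
        # stand-in for the actual (batched/optimized) upload to the provider

threading.Thread(target=background_uploader, daemon=True).start()

def duplicated_write(data):
    # real-time duplication: both copies are written "at the same time",
    # but the cloud copy only has to reach the cache, not the provider
    write_to_local_pool(data)
    write_to_clouddrive_cache(data)
```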

19 hours ago, mcrommert said:

From what you have said, I do need to turn off unduplicated content for the CloudDrive, which may explain why the drive thinks it's 80 TB instead of 40.

No, this is to make sure that unduplicated data is kept local instead of remote.

As for the drive size, the pool will ALWAYS report the raw capacity, no matter the duplication settings, so this pool WILL report 80 TB. We don't do any processing of the drive size because there is no good way to do it: each folder can have a different duplication level (off, x2, x3, x5, x10, etc.), and the contents may vary drastically in size, so there is no good way to compute an "effective" size other than showing the raw capacity.
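As a back-of-the-envelope check (this assumes everything in the pool is uniformly x2, which is exactly the assumption the software can't make for you):

```python
raw_capacity_tb = 40 + 40          # local pool + CloudDrive, as the pool reports it
duplication_level = 2              # uniform x2 assumed for this estimate only
usable_tb = raw_capacity_tb / duplication_level
print(raw_capacity_tb, usable_tb)  # 80 40.0 -> ~40 TB of unique data despite the 80 TB figure
```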

20 hours ago, mcrommert said:

also writes still seem to initially go to the cloud drive

There isn't a (good) way to fix this.  

You could turn off real-time duplication, which would do this. But it means that new data wouldn't be protected for up to 24 hours, or more. Also, files that are in use cannot be duplicated in this configuration.

So it leaves your data more vulnerable, which is why we recommend against it.

 

The other option is to add a couple of small drives and use the "SSD Optimizer" balancer plugin. You would need a number of "SSD"-marked drives equal to the highest duplication level, and they don't actually need to be SSDs. New files would be written to the drives marked as "SSD" first, and then moved to the other drives later (depending on the balancing settings for that pool).
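A loose sketch of the idea (simplified, not the plugin's actual logic; drive names are made up):

```python
ssd_drives = ["SSD_1", "SSD_2"]              # one landing drive per copy; x2 here
archive_drives = ["HDD_1", "HDD_2", "HDD_3"]

def initial_placement(copies=2):
    """New writes land only on the 'SSD'-marked drives, one drive per copy."""
    if len(ssd_drives) < copies:
        raise RuntimeError("need at least one 'SSD' drive per duplicate copy")
    return ssd_drives[:copies]

def later_balancing_pass(current_disks):
    """A later balancing pass migrates the copies off to the archive drives."""
    return archive_drives[:len(current_disks)]

landing = initial_placement()                # ['SSD_1', 'SSD_2']
print(later_balancing_pass(landing))         # ['HDD_1', 'HDD_2']
```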


Thanks for the info. I'm also realizing that while the new pool doesn't show the writes going to it when they first happen, if I go to the old pool it does show the writes there.

 

With the SSD option, what is meant by "need a number of drives equal to the highest duplication level"? Do you mean I would need 40 TB of SSDs, or just SSD-designated drives? Sorry, I'm missing this.

 

Thanks for the help. It seems to be working very well now, and I'm noticing no speed issues at all with reads and writes.

 

And I have moved to the RC release. My only complaint is the inability to see disk and pool performance at the same time.


If you have duplication enabled and no "higher" levels, then it's x2, so you'd need two "SSD" drives. If you have folders set to x4 duplication, then you'd ideally need four.

 

 

As for the UI, yeah, that's been a common complaint. It's been flagged already, so maybe in the next release.

