Covecube Inc.

Most Efficient Way to Move Duplicated Data From *Slow* Drive?


Question

Hello,

I have an exceptionally (unacceptably) slow CloudDrive (I know this is the wrong product, not looking for a solution here, bear with me). For the sake of argument, we could also consider this to be an extremely slow drive in general.

I'd like to remove it, but it has a lot of data (3TB+), so it's going to take a *very* long time. I can wait, but I'd rather not.

Assuming all data on it has been properly duplicated (the All-In-One Plug-in has the drive set to "Duplicated" data only), what is the fastest way to re-duplicate the data to other drives in order to remove the CloudDrive?

Specifically, can I tell DrivePool to use the other (mostly physical) drives to be the primary source for re-duplicating the data since it'll be significantly faster? If the re-duplication/removal process is multi-threaded maybe this is not an issue; it's unclear to me if that's the case.

I'd rather not nuke my options if this goes sour (i.e., if for some reason data hasn't been duplicated, I want to still be able to recover it from the CloudDrive before I destroy it). However, I believe most of my sensitive data resides on other drives or is backed up elsewhere.

Thanks!

PS: Love the products. Actually possibly looking at abandoning UnRAID / FreeNAS / TrueNAS / ZFS in favor of it.


2 answers to this question

1 hour ago, Xioustic said:

I have an exceptionally (unacceptably) slow CloudDrive (I know this is the wrong product, not looking for a solution here, bear with me). For the sake of argument, we could also consider this to be an extremely slow drive in general.

I'd like to remove it, but it has a lot of data (3TB+), so it's going to take a *very* long time. I can wait, but I'd rather not.

In theory, cloud-based storage seems like a great idea. In practice, I have only experienced extremely slow upload times and have never considered it a viable option for mass storage or backup, such as for my home media server with ~60TB of data sitting in DrivePool. Worse yet, my internet provider throttles connections to such cloud-based services unless you pay for their extremely expensive business connection accounts. So the only data I back up to the cloud is relatively small financial and document files that I might want to share across mobile devices.

I expect that any solution I might offer for removing your cloud data may not be fast enough. But there are some options I'll mention just the same.

1 hour ago, Xioustic said:

Assuming all data on it has been properly duplicated (the All-In-One Plug-in has the drive set to "Duplicated" data only), what is the fastest way to re-duplicate the data to other drives in order to remove the CloudDrive?

When you click Remove on a drive in the DrivePool GUI, one of the options is "Duplicate files later (faster drive removal)", which might reduce the number of files DrivePool has to offload. If I understand that option correctly, DrivePool will skip over the cloud files that it already has a copy of locally. Then, after all non-duplicated data has been removed from your CloudDrive (slow drive), it will recheck the duplication requirements for the folders on your local drives and re-duplicate in the background.

If you are worried about data loss during the slow removal of your CloudDrive files, I think there is another option to consider. Let's say you have a 2X duplicated Movie folder whose data you want to protect while you remove the CloudDrive. Maybe one copy is on your local (fast) drives and the other copy is on your cloud-based storage, for 2X copies in total. What I would suggest is that you make a local copy of your 2X data in a separate 1X temp folder in DrivePool, if you have enough room. DrivePool can use Read Striping on 2X duplicated folders to increase read speeds, and it will write the copy to your local drives, assuming you have already told DrivePool to remove the CloudDrive, which should block any data from being written back to that drive.

Copying your 2X folders to a separate 1X folder should be a lot faster than offloading data from the Cloud, and if things go bad with the Cloud offload, you still have a backup of all your 2X data in your local 1X temp backup folder.
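If you want some reassurance that the temp copy is complete, the copy step can be scripted with a checksum comparison. The following is only a minimal sketch in Python, not anything DrivePool provides: the function names `sha256_of` and `copy_and_verify` are made up for illustration, and it re-reads every source file to hash it, which roughly doubles the read work.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large media files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, target: Path) -> list[Path]:
    """Copy every file under source into target, keeping the folder layout.

    Returns the relative paths whose copy did not hash the same as the source.
    The target folder is assumed (hypothetically) to be a 1X folder that sits
    outside the source folder.
    """
    mismatches = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        dst_file = target / relative
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        if sha256_of(src_file) != sha256_of(dst_file):
            mismatches.append(relative)
    return mismatches
```

You would call it with your own pool paths, for example `copy_and_verify(Path(r"P:\Movies"), Path(r"P:\TempBackup\Movies"))`, where both paths are placeholders. An empty list back means every copied file matched its source.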

2 hours ago, Xioustic said:

Specifically, can I tell DrivePool to use the other (mostly physical) drives to be the primary source for re-duplicating the data since it'll be significantly faster? If the re-duplication/removal process is multi-threaded maybe this is not an issue; it's unclear to me if that's the case.

Once you have told DrivePool to remove your (slow) CloudDrive, it should block any data from being written back to it. At that point, if you copy your 2X folder to a separate 1X temp folder, DrivePool could use Read Striping to read from the fastest drives while writing the copy to your local drives.

One of the Balancers is the Drive Usage Limiter, which allows you to designate each drive for either duplicated or unduplicated data, neither, or both. I have not yet switched over to the All-In-One Balancer, but I think the concept is similar. That might present you with additional options as you move/duplicate data around.

In any case, your CloudDrive is probably showing up as only 1 drive in DrivePool, and therefore it should not hold more than 1 copy of any of your 2X files. When you copy data from a 2X folder with Read Striping, DrivePool seems to pull data off the fastest drive possible and uses less data from the slower drive. If your cloud data transfer rate is really slow, then DrivePool will be pulling almost all the data from your faster local drives.

2 hours ago, Xioustic said:

PS: Love the products. Actually possibly looking at abandoning UnRAID / FreeNAS / TrueNAS / ZFS in favor of it.

I have been using DrivePool for less than 1 year, but the more I learn about using it the more I like it. There are many options available with DrivePool that I did not have using hardware RAID controllers or Windows Storage Spaces. The biggest selling point for me was data integrity and recovery in DrivePool. I just had too many problems with Storage Spaces and my life is much better with DrivePool. Whatever software you decide to go with, I hope you find what works best for you.


Well, if it's an emergency, and you're okay with not keeping the files on the drive... there is a command line tool that can IMMEDIATELY eject the drive from the pool. 

The caveat here is that it does not move any of the data over, so if it's not duplicated, it will no longer be in the pool.  But you can copy it from the drive, at your leisure, at least. 
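Before ejecting, it may be worth checking which files exist only on the slow drive. The sketch below is an assumption-laden illustration, not a DrivePool feature: it assumes each pooled drive keeps its files under its hidden PoolPart folder with the same relative paths, and it compares paths only (not sizes or contents). The names `relative_files` and `files_only_on` are hypothetical.

```python
from pathlib import Path


def relative_files(poolpart: Path) -> set[str]:
    """Relative paths of all files under one PoolPart folder, lowercased
    because Windows file names are normally case-insensitive."""
    return {
        p.relative_to(poolpart).as_posix().lower()
        for p in poolpart.rglob("*")
        if p.is_file()
    }


def files_only_on(slow_part: Path, other_parts: list[Path]) -> list[str]:
    """Files present under the slow drive's PoolPart folder but under none
    of the other drives' PoolPart folders (path comparison only)."""
    elsewhere: set[str] = set()
    for part in other_parts:
        elsewhere |= relative_files(part)
    return sorted(relative_files(slow_part) - elsewhere)
```

Anything this returns would leave the pool when the drive is ejected, so those are the files to copy off the drive first. Listing a large cloud-backed drive can itself take a while, even though no file contents are read.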

It's "dpcmd ignore-poolpart".  You'll need to specify the "pool ID", which is just the name of the hidden "poolpart.xxxx" folder on the drive in question. 

Once you've done this, it will show up as "missing" in the UI, and you can then remove it from the pool.  
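To look up the hidden folder name mentioned above without enabling hidden items in Explorer, something like the following Python sketch could be used. The helper name `find_poolpart_folders` is hypothetical, and it only lists candidate folder names at the root of a drive. It does not run dpcmd, and the exact argument order for "dpcmd ignore-poolpart" is not shown here, so check the tool's own help output before using it.

```python
from pathlib import Path


def find_poolpart_folders(drive_root: Path) -> list[str]:
    """Return the names of folders at the drive root that look like
    DrivePool's hidden "poolpart.xxxx" folder (case-insensitive match)."""
    return sorted(
        entry.name
        for entry in drive_root.iterdir()
        if entry.is_dir() and entry.name.lower().startswith("poolpart.")
    )
```

For example, `find_poolpart_folders(Path("E:\\"))` would list the matching folder on a drive mounted as E:, where the drive letter is a placeholder for your own slow drive.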
