  • 0

Cloud Drive Using up my Local Storage


Carlo

Question

I'm presently using CloudDrive with ACD. I have the cloud drive as drive D:, currently set up as a 50TB drive, and have added it to my DrivePool (F:).

I set drive D up to hold DUPLICATE files ONLY and set all other drives for NON-Duplicate files.

 

So in a nutshell the cloud drive will be the only place to receive the duplicate data.

All so far is fine.

 

HOWEVER, when I create a cloud drive it asks for the local cache size, and I choose NONE, but I'm still forced to choose a local drive letter.

My first go-round I chose my local 1TB SSD drive, with 640GB still free on it, thinking it would be used for a file or two during transfers, which I'm OK with.

However, an hour or two into testing my C: drive (SSD) is full, and looking at the drive I find a new hidden folder starting with "CloudPart." that holds 640GB of files.

 

WHY?  I chose NONE for local caching, so I'd think it would not use any space and would transfer directly to the cloud.  I can understand a large local cache using that space if I had chosen one, but I'd prefer it not use any local space beyond a reasonable working area, and I certainly don't want it to duplicate my data locally first before uploading, as that defeats the purpose and absorbs "free space" better served for local media.

 

I killed the previous cloud drive and created a new one, this time pointing the local "cache" (even though it's set to NONE) at a spindle HDD.  This too seems to be doing the same thing: it just keeps growing and growing.  This is/was a brand new 4TB drive that I added to DrivePool for more storage, so I'm curious to see how much of it (if not all) it will use before throwing errors that the drive is full (which is what happened before).

 

Regardless, it is using up space that I'd rather have dedicated to local storage (i.e. DrivePool F:) than to act as a cache for cloud uploads.

 

Am I missing something in my setup or is this a bug?

 

Carlo

 

PS It's going to try and duplicate around 25TB. 

 


12 answers to this question


  • 1

The local cache uses "sparse files".  These files can take up a very large amount of space, or none at all, and they'll still report the same size.

This is part of how we keep track of which chunks are used where, etc.

 

And if you right-click the file/folder and check Properties, you'll notice that it reports both "Size" and "Size on disk", and these should be very different values. In fact, the "Size on disk" should be closer to the local cache size.
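If you'd rather check this from a script than through the Properties dialog, here's a minimal Python sketch that compares a file's logical size with its allocated size on NTFS via the Win32 GetCompressedFileSizeW call (the CloudPart path shown is just a hypothetical placeholder):

import ctypes
import os
from ctypes import wintypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
GetCompressedFileSizeW = kernel32.GetCompressedFileSizeW
GetCompressedFileSizeW.argtypes = [wintypes.LPCWSTR, ctypes.POINTER(wintypes.DWORD)]
GetCompressedFileSizeW.restype = wintypes.DWORD

INVALID_FILE_SIZE = 0xFFFFFFFF

def size_on_disk(path):
    # Bytes actually allocated on disk; much smaller than the logical size
    # for sparse files.
    high = wintypes.DWORD(0)
    low = GetCompressedFileSizeW(path, ctypes.byref(high))
    if low == INVALID_FILE_SIZE and ctypes.get_last_error() != 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return (high.value << 32) | low

chunk = r"D:\CloudPart.xxxx\chunk-0001.dat"   # hypothetical chunk file path
print("logical size :", os.path.getsize(chunk))   # Explorer's "Size"
print("size on disk :", size_on_disk(chunk))      # Explorer's "Size on disk"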

 

However, the cache size isn't a hard limit on what can be stored on the disk.  We grow past it if the cache gets filled up completely.

 

And if the cache is 'overfilled', then as the data is uploaded, it will shrink back down until it hits the specified cache size.

 

 

 

As for the deadlock issue, we are looking into it, as it is definitely a serious one. However, it is a very complicated problem, so we don't have a quick fix (we want to fix it properly, rather than half-assing it).


  • 0

I believe this is by design. I've flagged Alex for verification.

 

 

However, I believe this is what is going on here:

To ensure data reliability, the files are cached locally until we are able to upload them (and verify the upload, if that setting is enabled), since the internet/data connection may not be reliable or fast enough to fully handle the stream.
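As a rough, simplified illustration of that flow (my sketch here, not CloudDrive's actual pipeline, which works on chunks rather than whole files): stage the data locally, upload it, optionally verify it, and only then drop the local copy. The upload() and download() callables below are hypothetical stand-ins for the provider calls.

import hashlib
from pathlib import Path

def flush_to_cloud(cache_file: Path, upload, download, verify_upload=True):
    # The data already sits in the local cache file.
    data = cache_file.read_bytes()
    upload(cache_file.name, data)              # may be slow, throttled, or retried
    if verify_upload:
        echoed = download(cache_file.name)     # read back what the provider stored
        if hashlib.sha256(echoed).digest() != hashlib.sha256(data).digest():
            raise IOError("verification failed for " + cache_file.name)
    cache_file.unlink()                        # only now is the local copy dropped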

 

 

Also, having a cache may be a good idea, even if it's relatively small. This allows us to "pin" certain data (namely file system data, and optionally, directory data), so that we're not constantly downloading from the Cloud Provider.

Edited by Christopher (Drashna)
https://stablebit.com/Admin/IssueAnalysis/17682

  • 0

Yes, but what I'm finding is that it's completely filling up whatever drive I assign to the cloud drive, since my pool is over 25TB but any single drive is at most 6TB in size.

In my last case I assigned an empty 4TB drive I had just set up for DrivePool use, since I needed more local storage.

The CloudDrive "sync" feature then started building its local cache of the 25TB of data.  It of course filled up the 4TB drive.  After it fills up I get a slew of errors/retries, and of course this 4TB of storage is removed (in practical terms) from my pool use since it has no empty storage left.

 

Considering that during setup I selected NONE as the cache size, this seems to be a bug by any logical account.

Otherwise what is the cache setting used for?

 

I've set up limits for each drive, and all my drives are set to hold non-duplicated content ONLY except for the cloud drive, which is set to duplicated content only. I know CloudDrive is a separate product, but it should honor the settings used by DrivePool, since the cloud drive has in fact been added to the DrivePool in order to duplicate files.

 

Many of us are going to use the cloud drive as nothing more than a "backup", or, put another way, a repository for an off-site set of duplicated files.

But if this doesn't honor the limits we put on drives, then it's a game changer and we might be best off just using xcopy or similar, where we retain complete control.

 

I personally don't care if it wants to use "some" local storage space, but it shouldn't just take over the local drive and suck up every bit of storage space (then start to error out).

 

Does this make sense?

 

Carlo


  • 0

Carlo,

 

To clarify, StableBit CloudDrive does create a cache, and we do need it for certain cases. Like yours.

When it's set to "None" (0 bytes), it means that it actively tries to flush the cache (aka upload it to the Cloud Provider and then delete it locally).  

 

However, depending on your upload speed, that may not be possible in real time... especially if the connection errors out (common for Amazon Cloud Drive) or if it's being throttled (most providers do this, but especially OneDrive).

In this case, we really have only two options: provide an IO error and terminate the transfer... or use a local cache so that we can upload the data as needed without adversely affecting "user space".

 

We opted for the second option, and that's why you're seeing what you are.

 

 

As for sucking up space, I'll flag Alex about this. Specifically, to see if we can add an absolute max size that will... well, cause errors when it's completely full, so that it doesn't take up all of your space.

 

 

 

 

But the problem here is the amount of data being added at one time. It has to be put somewhere. And somewhere more permanent than "in memory". 


  • 0

I hear what you're saying, Chris, but I don't fully agree.

I do understand that you need some "working space" to store a few blocks, ready for the thread(s) doing the uploading.

 

However, from a high-level view it appears a background process is creating all these blocks, getting them ready for upload, and just doing its thing without communicating with the uploading thread(s).  It appears this process has no concept of how far ahead of the upload it is, and just continues until it has created all the blocks needed for the duplicate section or it runs out of space (whichever comes first).

 

In reality it only needs to stay X blocks ahead of the upload threads.  Whether you have 5 blocks ready or 5,000 makes no difference, as you can only upload so fast; anything more than that is "wasted".  This shouldn't be hard to calculate since you know, at any time, the max upload threads available.
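Just to make the idea concrete, here's a rough sketch of what I mean: a bounded producer/consumer where the block builder stops once it's MAX_AHEAD blocks ahead of the uploaders. Purely illustrative, obviously not CloudDrive's actual code.

import queue
import threading
import time

MAX_AHEAD = 5          # never stage more than 5 blocks ahead of the uploads
UPLOAD_THREADS = 2

staged = queue.Queue(maxsize=MAX_AHEAD)

def block_builder(total_blocks):
    for i in range(total_blocks):
        block = "block-%04d" % i          # stand-in for reading/encoding real data
        staged.put(block)                 # blocks while MAX_AHEAD items are waiting
    for _ in range(UPLOAD_THREADS):
        staged.put(None)                  # sentinel: tell each uploader to stop

def uploader(name):
    while True:
        block = staged.get()
        if block is None:
            break
        time.sleep(0.1)                   # stand-in for the (slow) provider upload
        print(name, "uploaded", block)

threads = [threading.Thread(target=uploader, args=("up-%d" % n,))
           for n in range(UPLOAD_THREADS)]
for t in threads:
    t.start()
block_builder(20)
for t in threads:
    t.join()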

 

But with all that said, I just want to clarify that I'm not against a reasonable working storage area.  But it doesn't need to be 4TB, which will take a few days at best to upload.  In no way, shape or form should it need to be that far ahead. :)

 

Carlo

 

PS I know this is BETA and fully expect issues, so no complaints in that regard.  Just hoping to give feedback along the way to help you guys deliver the best product possible, as it has so much promise, especially combined with DrivePool.


  • 0

I have similar problems.

 

1. I created a disk on the local disk with cache size: None, but I still have to choose a cache dir.

The data chunks reside in the "StableBit CloudDrive Data ([...])" folder on one HDD, and the cache in "CloudPart.[...]".

There are no files on the new disk.

The data folder is 10GB (the size of the disk), which I understand must be preallocated, BUT the cache dir is also 10GB. Why is there a cache at all for a local disk?

 

2. Dropbox 2GB disk: the cache dir is 1GB according to the settings, but the cache is 2GB on the HDD.


  • 0

I also had a similar (or the same) issue.

 

I attempted to copy a 1.3 GB file out of an encrypted TrueCrypt container onto a 10TB Amazon Cloud Drive. The cache was set to 1GB on my primary SSD partition. StableBit CloudDrive rapidly burned through every last byte of free space on the SSD (53 GB) and then caused an IO deadlock just like the one described in the other thread (couldn't open Task Manager, system unresponsive, hard reboot required). After a hard reboot the system came back up and the 53 GB of space StableBit CloudDrive was using became free once again.


  • 0

I don't see that a "max size used by the cache" option was added.  I'm running 1.0.0.442 and have I/O Throttled errors because (supposedly) my drive D: is out of space.  The error says:

 

"I/O Throttled

Cloud drive (I:\) - Dropbox is having trouble writing data to the local disk (D:\).

 

This error has occurred 53,567 times.

 

Your local disk (D:\) has run out of disk space.

 

Continuing to get this error can affect the data integrity of your cloud drive."

 

Drive D: is the second partition of the SSD boot drive and is 178GB in size.  My cloud drive is 800 GB on Dropbox with a 5.00 GB cache.

 

I recently added another duplicate to a folder so that a copy would be created on the cloud drive.  Currently duplication is at 87.5%, but going very slowly - apparently throttled by my upload speed to Dropbox.  CloudDrive says there is still 14.7 GB to upload, and if I look at the properties of the CloudPart folder on D: it corresponds: size on disk 14.7 GB (and size 800 GB).  I've got a couple of other folders on the D: drive, and all totaled, size on disk is currently 54.6 GB (and size 839 GB).

 

So why does Windows Explorer say 3.87 GB free of 178 GB?  Why is DrivePool basically stopped at 87.5% because it (apparently) also thinks drive D: is full?  This situation is also affecting the responsiveness of the server and the general transfer speed of the shared folders.


  • 0

Right now, there is only the "cache size". That's what it tries to stay at, and there is no "max" size, except for the size of the drive.

This is something we do plan on addressing in the future. But for now, the upload cache (new writes) will exceed the cache size until you hit about 5 GB free on the cache drive.  From the sounds of it, that's specifically the issue you're encountering.
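Roughly speaking, the behaviour looks something like this sketch (illustrative only, not the actual code):

import shutil

MIN_FREE_BYTES = 5 * 1024**3          # the ~5 GB floor mentioned above

def can_accept_new_write(cache_drive, write_size):
    # New writes may grow the cache past its configured size, but they stop
    # once they would push the cache drive below the free-space floor;
    # uploads then free space until the cache shrinks back to its setting.
    free = shutil.disk_usage(cache_drive).free
    return (free - write_size) > MIN_FREE_BYTES

# Example: would a 1 GB write be accepted on D: right now?
print(can_accept_new_write("D:\\", 1 * 1024**3))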

 

You may want to check out this thread:

http://community.covecube.com/index.php?/topic/1610-how-the-stablebit-clouddrive-cache-works/

 

Alex (The Developer) talks about how the cache works specifically, including the limitations and issues you may run into. 


  • 0

You know, it's funny. I have CloudDrive working fine. I have a 10TB "drive", and I have it auto-adding everything. I've successfully backed up 189GB of data. I would suggest doing small bursts of data; not everyone has incredibly fast internet. I did have a problem with ACD "ping" spamming my connection: my firewall was blocking part of the communication.

 

I'll test it once my pool gets a bit larger. 


  • 0

The weird part was that DrivePool did not recognize that most of the data had already been cleared out of the cache drive (and uploaded to the cloud).  I finally restarted my server.  It took at least an hour to reboot.  Then CloudDrive was mad - it said I did an unsafe shutdown and had to re-synchronize.  I also had to ask DrivePool to re-check duplication a second time to make it happy, but now it's acting as it should, with no out-of-space errors.

 

The one thing I find odd is the time DrivePool takes to make an additional copy for the cloud.  There's a basically empty SSD drive waiting to cache the data, yet it takes hours to make a copy and hand it off for the cloud drive.  It sounds like it should be fast based on the link above discussing how the cache works.  Any thoughts on why it is slow?

 

In case it matters here is what I am doing:

1. 3 local hard drives (4TB, 4TB, 1.5TB)

2. 1 cloud drive (800GB)

3. Using folder duplication.

4. Most folders have 2 copies on local drives and some only 1.

5. Pictures folder had 3 copies - all on local drives and completed before I added the cloud drive.

6. Once I added the cloud drive I went into DrivePool / Pool Options / Balancing / File Placement and deselected the cloud drive for all folders except the Pictures folder.

7. Then started changing folders under Pictures to have 4 copies - the cloud drive was the only option to make the 4th copy.

8.  I did not do the entire folder (~275GB) at once - a few folders at a time.

 

Can you tell I'm worried about losing any photos? ;-)

