I think those are fine concerns. One thing that Alex and Christopher have said before is that 1) Covecube isn't in any danger of shutting down any time soon and 2) if it did, they would release a tool to convert the chunks on your cloud storage back to native files. So as long as you could still retrieve the individual chunks from your storage, you'd be able to convert them. But, ultimately, there aren't any guarantees in life. It's just a risk we take by relying on cloud storage solutions.
Christopher, I'm actually fairly confident that there is a problem here.
I have been troubleshooting this, also in connection with the other thread I posted in, and whenever writes go to the CloudDrive cache it is slow as hell. I recently picked up some new Samsung 960 Pro NVMe drives and thought it would be fun to test them out, so I put two of them together in RAID 0 and used them strictly as the CloudDrive cache. That should eliminate any kind of performance issue, as those drives can basically write in the GB/s range. But I'm still seeing subpar performance: writes to the CloudDrive cache stay below 10 MB/s while Windows reports the drive as fully maxed out.
I'm not sure if it has something to do with the settings or if it is a software problem. All I can tell you is that the slowdowns are really odd, to the point that faster drives provide no faster throughput on the cache drive: the two 960 Pro drives gave me only ~1 MB/s more than the single Kingston SSD I was using before.
It has come to the point where I have simply uninstalled CloudDrive and now use rclone to move my data to Google instead, as that is about 15 times faster.
In this post I'm going to talk about the new large storage chunk support in StableBit CloudDrive 220.127.116.111 BETA, why that's important, and how StableBit CloudDrive manages provider I/O overall.
Want to Download it?
Currently, StableBit CloudDrive 18.104.22.1681 BETA is an internal development build and like any of our internal builds, if you'd like, you can download it (or most likely a newer build) here:
The I/O Manager
Before I start talking about chunks and what the current change actually means, let's talk a bit about how StableBit CloudDrive handles provider I/O. Well, first let's define what provider I/O actually is. Provider I/O is the combination of all of the read and write requests (or download and upload requests) that are serviced by your provider of choice. For example, if your cloud drive is storing data in Amazon S3, provider I/O consists of all of the download and upload requests from and to Amazon S3.
Now it's important to differentiate provider I/O from cloud drive I/O, because the two are not really the same thing. All I/O to and from the drive itself is performed directly in the kernel by our cloud drive driver (cloudfs_disk.sys). But as a result of some cloud drive I/O, provider I/O can be generated. For example, this happens when there is an incoming read request to the drive for some data that is not stored in the local cache. In this case, the kernel driver coordinates with the StableBit CloudDrive system service in order to generate provider I/O and to complete the incoming read request in a timely manner.
All provider I/O is serviced by the I/O Manager, which lives in the StableBit CloudDrive system service.
Particularly, the I/O Manager is responsible for:
As an optimization, coalescing incoming provider read and write I/O requests into larger requests.
Parallelizing all provider read and write I/O requests using multiple threads.
Retrying failed provider I/O operations.
Error handling and error reporting logic.
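To make those responsibilities a bit more concrete, here is a minimal sketch of coalescing, parallelizing and retrying read requests. It is only an illustration under stated assumptions, not how StableBit CloudDrive is actually implemented: the names coalesce, with_retries and service_reads are hypothetical, and download stands in for whatever call a provider uses to fetch a byte range.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]  # (offset, length) in bytes


def coalesce(requests: List[Range]) -> List[Range]:
    """Merge overlapping or directly adjacent byte ranges into larger ones."""
    merged: List[Range] = []
    for offset, length in sorted(requests):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            last_offset, last_length = merged[-1]
            end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, end - last_offset)
        else:
            merged.append((offset, length))
    return merged


def with_retries(operation: Callable[[], bytes], attempts: int = 3) -> bytes:
    """Run operation, retrying on IOError; re-raise after the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except IOError:
            if attempt == attempts:
                raise  # out of retries: report the error to the caller
    raise RuntimeError("unreachable")


def service_reads(
    requests: List[Range],
    download: Callable[[int, int], bytes],  # hypothetical provider call
    threads: int = 4,
) -> Dict[Range, bytes]:
    """Coalesce the requests, then download the merged ranges in parallel."""
    ranges = coalesce(requests)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(with_retries, lambda r=r: download(*r)) for r in ranges
        ]
        return {r: f.result() for r, f in zip(ranges, futures)}
```

In this sketch three small reads at offsets 0, 4 and 16 turn into two provider requests, because the first two are adjacent and get merged.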
Now that I've described a little bit about the I/O manager in StableBit CloudDrive, let's talk chunks. StableBit CloudDrive doesn't inherently work with any types of chunks. They are simply the format in which data is stored by your provider of choice. They are an implementation that exists completely outside of the I/O manager, and provide some convenient functions that all chunked providers can use.
How do Chunks Work?
When a chunked cloud provider is first initialized, it is asked about its capabilities, such as whether it can perform partial reads, whether the remote server is performing proper multi-threaded transactional synchronization, etc... In other words, the chunking system needs to know how advanced the provider is, and based on those capabilities it will construct a custom chunk I/O processing pipeline for that particular provider.
The chunk I/O pipeline provides automatic services for the provider such as:
Whole and partial caching of chunks for performance reasons.
Performing checksum insertion on write, and checksum verification on read.
Read or write (or both) transactional locking for cloud providers that require it (for example, never try to read chunk 458 when chunk 458 is being written to).
Translation of I/O that would end up being a partial chunk read / write request into a whole chunk read / write request for providers that require this. This is actually very complicated.
If a partial chunk needs to be read, and the provider doesn't support partial reads, the whole chunk is read (and possibly cached) and only the part needed is returned.
If a partial chunk needs to be written, and the provider doesn't support partial writes, then the whole chunk is downloaded (or retrieved from the cache), only the part that needs to be written to is updated, and the whole chunk is written back. If, while this is happening, another partial write request comes in for the same chunk (in parallel, on a different thread), and we're still in the process of reading that whole chunk, then we coalesce the [whole read -> partial write -> whole write] into [whole read -> multiple partial writes -> whole write]. This is purely done as an optimization and is also very complicated.
And in the future the chunk I/O processing pipeline can be extended to support other services as the need arises.
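The partial read and partial write translation described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the real pipeline: WholeChunkStore and ChunkPipeline are hypothetical names, the store only accepts whole chunks, and a simple per-chunk lock stands in for the transactional locking. Passing several patches to one write call models the coalesced [whole read -> multiple partial writes -> whole write] case.

```python
import threading
from collections import defaultdict
from typing import Dict, Tuple


class WholeChunkStore:
    """Toy provider that can only read and write whole chunks."""

    def __init__(self, chunk_size: int) -> None:
        self.chunk_size = chunk_size
        self.chunks: Dict[int, bytes] = {}
        self.uploads = 0  # counts whole-chunk uploads

    def read_chunk(self, index: int) -> bytes:
        # A chunk that was never written reads back as zeros.
        return self.chunks.get(index, bytes(self.chunk_size))

    def write_chunk(self, index: int, data: bytes) -> None:
        if len(data) != self.chunk_size:
            raise ValueError("this provider only accepts whole chunks")
        self.chunks[index] = data
        self.uploads += 1


class ChunkPipeline:
    """Turns partial reads and writes into whole-chunk operations."""

    def __init__(self, store: WholeChunkStore) -> None:
        self.store = store
        self._locks: Dict[int, threading.Lock] = defaultdict(threading.Lock)
        self._guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, index: int) -> threading.Lock:
        with self._guard:
            return self._locks[index]

    def read(self, index: int, offset: int, length: int) -> bytes:
        # Never read a chunk while it is being written.
        with self._lock_for(index):
            whole = self.store.read_chunk(index)
        return whole[offset:offset + length]

    def write(self, index: int, *patches: Tuple[int, bytes]) -> None:
        """Apply one or more (offset, data) patches with a single upload."""
        with self._lock_for(index):
            buf = bytearray(self.store.read_chunk(index))   # whole read
            for offset, data in patches:                    # partial writes
                if offset < 0 or offset + len(data) > len(buf):
                    raise ValueError("patch does not fit inside the chunk")
                buf[offset:offset + len(data)] = data
            self.store.write_chunk(index, bytes(buf))       # whole write
```

The real pipeline merges writes that arrive on different threads while the whole-chunk read is still in flight; the sketch skips that scheduling and only shows the resulting read, patch, write sequence.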
Large Chunk Support
Speaking of extending the chunk I/O pipeline, that's exactly what happened recently with the addition of large chunk support (> 1 MB) for most cloud providers.
Previously, most cloud providers were limited to a maximum chunk size of 1 MB. This limit was in place because:
Cloud drive read I/O requests, which can't be satisfied by the local cache, would generate provider read I/O that needed to be satisfied fairly quickly. For providers that didn't support partial reads, this meant that the entire chunk needed to be downloaded at all times, no matter how much data was being read.
Additionally, if checksumming was enabled (which would be typical), then by necessity, only whole chunks could be read and written.
This had some disadvantages, mostly for users with fast broadband connections:
Writing a lot of data to the provider would generate a lot of upload requests very quickly (one request per 1 MB uploaded). This wasn't optimal because each request would add some overhead.
Generating a lot of upload requests very quickly was also an issue for some cloud providers that were limiting their users based on the number of requests per second, rather than the total bandwidth used. Using smaller chunks with a fast broadband connection and a lot of threads would generate a lot of requests per second.
Now, with large chunk support (up to 100 MB per chunk in most cases), we don't have those disadvantages.
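To put rough numbers on the request overhead, here is a small back-of-the-envelope calculation. The 10 GB upload and the 50 MB/s upload speed are made-up example figures, and the function names are hypothetical; the only inputs taken from this post are the 1 MB and 100 MB chunk sizes.

```python
MB = 1024 * 1024


def upload_requests(total_bytes: int, chunk_bytes: int) -> int:
    """Number of whole-chunk upload requests needed (ceiling division)."""
    return -(-total_bytes // chunk_bytes)


def requests_per_second(upload_bytes_per_second: int, chunk_bytes: int) -> float:
    """Request rate when the connection is saturated with chunk uploads."""
    return upload_bytes_per_second / chunk_bytes


ten_gb = 10 * 1024 * MB
print(upload_requests(ten_gb, 1 * MB))         # 10240 requests with 1 MB chunks
print(upload_requests(ten_gb, 100 * MB))       # 103 requests with 100 MB chunks
print(requests_per_second(50 * MB, 1 * MB))    # 50.0 requests per second
print(requests_per_second(50 * MB, 100 * MB))  # 0.5 requests per second
```

At the same upload speed, going from 1 MB to 100 MB chunks cuts the request rate by a factor of 100, which is what matters for providers that throttle on requests per second.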
What was changed to allow for large chunks?
In order to support the new large chunks a provider has to support partial reads. That's because it's still very necessary to ensure that all cloud drive read I/O is serviced quickly.
Support for a new block based checksumming algorithm was introduced into the chunk I/O pipeline. With this new algorithm it's no longer necessary to read or write whole chunks in order to get checksumming support. This was crucial because it is very important to verify that your data in the cloud is not corrupted, and turning off checksumming wasn't a very good option.
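The general idea behind block based checksumming can be sketched like this. To be clear, this is not the actual algorithm used by StableBit CloudDrive, which isn't described here; the block size, the use of CRC32, and the names block_checksums and verified_partial_read are all assumptions made for illustration. The point is only that a partial read needs to fetch and verify the blocks it overlaps, not the whole chunk.

```python
import zlib
from typing import Callable, List

BLOCK_SIZE = 4  # hypothetical and tiny, so the example is easy to follow


def block_checksums(chunk: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """One CRC32 per fixed-size block of the chunk."""
    return [
        zlib.crc32(chunk[i:i + block_size])
        for i in range(0, len(chunk), block_size)
    ]


def verified_partial_read(
    fetch: Callable[[int, int], bytes],  # hypothetical: returns bytes [start, end)
    checksums: List[int],
    offset: int,
    length: int,
    block_size: int = BLOCK_SIZE,
) -> bytes:
    """Fetch and verify only the blocks that overlap the requested range."""
    if length <= 0:
        return b""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    start = first * block_size
    data = fetch(start, (last + 1) * block_size)
    for i, block_index in enumerate(range(first, last + 1)):
        block = data[i * block_size:(i + 1) * block_size]
        if zlib.crc32(block) != checksums[block_index]:
            raise IOError(f"checksum mismatch in block {block_index}")
    return data[offset - start:offset - start + length]
```

With a 16 byte chunk and 4 byte blocks, reading 4 bytes at offset 5 only fetches bytes 4 through 11 (blocks 1 and 2), and corruption in any other block goes unnoticed by that particular read.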
Are there any disadvantages to large chunks?
If you're concerned about using the least possible amount of bandwidth (as opposed to using fewer provider calls per second), it may be advantageous to use smaller chunks. If you know for sure that you will be storing relatively small files (1-2 MB per file or less) and you will only be updating a few files at a time, there may be less overhead when you use smaller chunks.
For providers that don't support partial writes (most cloud providers), larger chunks are more optimal if most of your files are > 1 MB, or if you will be updating a lot of smaller files all at the same time.
As far as checksumming, the new algorithm completely supersedes the old one and is enabled by default on all new cloud drives. It really has no disadvantages over the old algorithm. Older cloud drives will continue to use the old checksumming algorithm, which really only has the disadvantage of not supporting large chunks.
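The bandwidth versus request count trade-off above can be illustrated with a rough calculation for a provider that doesn't support partial writes. This is a simplified model with assumptions of its own: upload_cost is a hypothetical helper, all pending updates are assumed to be coalesced into one upload per touched chunk, and the download side and the local cache are ignored.

```python
from typing import Iterable, Tuple

MB = 1024 * 1024


def upload_cost(updates: Iterable[Tuple[int, int]], chunk_size: int) -> Tuple[int, int]:
    """Return (upload requests, bytes uploaded) for a batch of (offset, length) writes.

    Every chunk touched by at least one write is uploaded whole, exactly once.
    """
    touched = set()
    for offset, length in updates:
        if length <= 0:
            continue
        first = offset // chunk_size
        last = (offset + length - 1) // chunk_size
        touched.update(range(first, last + 1))
    return len(touched), len(touched) * chunk_size


# One 1.5 MB file updated on its own:
print(upload_cost([(0, 1536 * 1024)], 1 * MB))    # (2, 2097152): 2 requests, 2 MB
print(upload_cost([(0, 1536 * 1024)], 100 * MB))  # (1, 104857600): 1 request, 100 MB

# Fifty adjacent 1 MB files updated at the same time:
fifty = [(i * MB, 1 * MB) for i in range(50)]
print(upload_cost(fifty, 1 * MB))                 # (50, 52428800): 50 requests, 50 MB
print(upload_cost(fifty, 100 * MB))               # (1, 104857600): 1 request, 100 MB
```

Under these assumptions, small chunks win clearly on bandwidth for the single small file, while for the batch of fifty files the large chunk costs twice the bandwidth but only one request instead of fifty.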
Which providers currently support large chunks?
I don't want to post a list here because it would get obsolete as we add more providers.
When you create a new cloud drive, look under Advanced Settings. If you see a storage chunk size > 1 MB available, then that provider supports large chunks.
Going Chunk-less in the Future?
I should mention that StableBit CloudDrive doesn't actually require chunks. If it makes sense for a particular provider, that provider really doesn't have to store its data in chunks. For example, it's entirely possible that StableBit CloudDrive will have a provider that stores its data in a VHDX file. Sure, why not. For that matter, StableBit CloudDrive providers don't even need to store their data in files at all. I can imagine writing providers that store their data on an email server or an NNTP server (a bit of a stretch, yes, but theoretically possible).
In fact, the only thing that StableBit CloudDrive really needs is some system that can save data and later retrieve it (in a timely manner). In that sense, StableBit CloudDrive can be a general purpose drive virtualization solution.
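As a purely illustrative sketch of how small that requirement is, the following shows a "save and later retrieve" contract with a trivial in-memory backend. The class and method names are hypothetical and are not the actual StableBit CloudDrive provider interface.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StorageProvider(ABC):
    """Hypothetical minimal contract: save data, retrieve it later."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def retrieve(self, key: str) -> bytes:
        ...


class InMemoryProvider(StorageProvider):
    """Stand-in backend; a file, VHDX, email or NNTP backend would fit the same shape."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        self._items[key] = bytes(data)

    def retrieve(self, key: str) -> bytes:
        return self._items[key]  # raises KeyError if nothing was saved under key
```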
Hello guys, I've found myself in a pretty tough situation. When I first heard about StableBit CloudDrive and unlimited GDrive, I wanted to set up my "Unlimited Plex" as fast as possible. This resulted in creating a cloud drive with half-assed settings, like a 10 MB chunk size and so on. Now that I know this is not optimal and it's too late to change the chunk size, I need to create another CloudDrive with better settings and transfer everything to it.
Downloading it and uploading it again using my local PC will take forever and I want to avoid that. I was thinking about using Google Cloud Compute, so my PC won't have to be turned on and, more importantly, I won't have to upload 6.5 TB again, which on my current connection would take roughly 2 weeks. That's assuming it goes smoothly without any problems.
Is there a guide on how to do this? I don't mind if it's a paid option, as long as it's reasonably priced. Any help?
I've had this issue for as long as I've been using CloudDrive.
When I navigate the drive, most of the time Explorer will hang (not responding) when I click into a folder.
When it doesn't hang, sometimes it will take ~2 minutes to load the folder.
The folders don't have many files in them. I have the pinning options enabled.
I thought it might be a bug because the drive was created with an early version of CD, so I created a new drive to see if the issue was gone but it's still happening.
Is there anything I can do? It's really annoying having to restart Explorer almost every time I want to browse the drive.
So CloudDrive creates a real filesystem on a real (though not physical) drive structure. That means that NTFS on your CloudDrive will behave just like NTFS on a physical hard drive. So, just like a physical drive, when you delete data, NTFS simply marks that data as deleted and the drive space as available for future use. The data remains on the drive structure until overwritten by something else. So, to directly answer your questions:
1) Sure. It will "go away" once it is overwritten by new data. If some sort of information security is important to you (beyond that provided by end to end drive encryption) you'd want to use one of the many tools available to overwrite hard drive data with zeros or random binary.
2) Yes. It can. Just like any physical drive, you can use recovery tools to recover "deleted" or "lost" data off of your mounted CloudDrive. I think, on the balance, this is a huge plus for CloudDrive as a storage solution.
3) You've already reclaimed the space. At least as far as the operating system and filesystem are concerned. Windows will freely write to any drive space that NTFS has marked as available.
What's probably confusing you a little is the difference from a physical drive. On a physical drive, all of the sectors and data space are available from the day you purchase it, by virtue of the fact that they are stored on a literal, physical platter. CloudDrive, on the other hand, only uploads a block once something has written to it for the first time. This is the default behavior for all online storage providers for fairly obvious reasons. You wouldn't want to have to upload, say, an entire 256TB drive structure to Google Drive BEFORE you could start using it.
Nevertheless, when you created your CloudDrive the software DID actually create the full filesystem and make it accessible to your OS. So your OS will treat it as if all of that space already exists--even if it only exists conceptually until CloudDrive uploads the data.
If you used a local disk space provider to create a drive, btw, you would see that it creates all of the blocks at drive creation--since local storage doesn't have the same concerns as online providers.
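If it helps, here is a toy model of the two behaviors described above: blocks only exist at the provider once they have been written, and deleting just marks space as free without touching the stored data. It's a deliberate simplification with hypothetical names (LazyDrive and its methods), not a model of how NTFS or CloudDrive are actually implemented.

```python
from typing import Dict, Set


class LazyDrive:
    """Toy drive: fixed advertised size, blocks uploaded only on first write."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks       # what the OS sees from day one
        self.uploaded: Dict[int, bytes] = {}   # blocks that exist at the provider
        self.in_use: Set[int] = set()          # blocks the filesystem has allocated

    def write(self, block: int, data: bytes) -> None:
        if not 0 <= block < self.total_blocks:
            raise IndexError("block is outside the drive")
        self.uploaded[block] = data            # first write creates the block
        self.in_use.add(block)

    def delete(self, block: int) -> None:
        # Only the allocation is released; the stored data is left alone.
        self.in_use.discard(block)

    def raw_read(self, block: int) -> bytes:
        # What a recovery tool sees; never-written blocks read as empty.
        return self.uploaded.get(block, b"")


drive = LazyDrive(total_blocks=1000)
drive.write(7, b"secret")
drive.delete(7)
print(drive.raw_read(7))    # b'secret' (still recoverable after the delete)
drive.write(7, b"new")
print(drive.raw_read(7))    # b'new' (old data is gone once overwritten)
print(len(drive.uploaded))  # 1 (only one of the 1000 blocks was ever uploaded)
```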