Large Chunks and the I/O Manager

Alex · December 2, 2015

In this post I'm going to talk about the new large storage chunk support in StableBit CloudDrive 1.0.0.421 BETA, why that's important, and how StableBit CloudDrive manages provider I/O overall.

Want to Download it?

Currently, StableBit CloudDrive 1.0.0.421 BETA is an internal development build and like any of our internal builds, if you'd like, you can download it (or most likely a newer build) here:

http://wiki.covecube.com/Downloads

The I/O Manager

Before I start talking about chunks and what the current change actually means, let's talk a bit about how StableBit CloudDrive handles provider I/O. Well, first let's define what provider I/O actually is. Provider I/O is the combination of all of the read and write requests (or download and upload requests) that are serviced by your provider of choice. For example, if your cloud drive is storing data in Amazon S3, provider I/O consists of all of the download and upload requests from and to Amazon S3.

Now it's important to differentiate provider I/O from cloud drive I/O, because the two are not the same thing. All I/O to and from the drive itself is performed directly in the kernel by our cloud drive driver (cloudfs_disk.sys). But as a result of some cloud drive I/O, provider I/O can be generated. For example, this happens when there is an incoming read request to the drive for some data that is not stored in the local cache. In this case, the kernel driver coordinates with the StableBit CloudDrive system service in order to generate provider I/O and to complete the incoming read request in a timely manner.

All provider I/O is serviced by the I/O Manager, which lives in the StableBit CloudDrive system service.

In particular, the I/O Manager is responsible for:

- As an optimization, coalescing incoming provider read and write I/O requests into larger requests.
- Parallelizing all provider read and write I/O requests using multiple threads.
- Retrying failed provider I/O operations.
- Error handling and error reporting logic.
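
To make two of these responsibilities concrete, here is a minimal Python sketch of request coalescing and retrying. It is an illustration only, not the actual I/O Manager code: the function names, the (offset, length) request shape, and the retry count are all assumptions made for the example.

```python
def coalesce(requests):
    """Merge (offset, length) requests that touch or overlap into larger ones."""
    merged = []
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # This request starts inside (or right at the end of) the previous
            # range, so extend that range instead of issuing a new request.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


def with_retries(operation, attempts=3):
    """Call operation() until it succeeds or the attempts are used up."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except IOError as error:
            # Remember the failure and try again.
            last_error = error
    raise last_error


# Four small requests collapse into two larger ones.
print(coalesce([(0, 4), (4, 4), (16, 8), (6, 4)]))
```

A real implementation would also run the coalesced requests on multiple threads and report errors to the user, which this sketch leaves out.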

Chunks

Now that I've described a little bit about the I/O Manager in StableBit CloudDrive, let's talk chunks. StableBit CloudDrive doesn't inherently work with chunks of any kind. They are simply the format in which data is stored by your provider of choice. They are an implementation that exists completely outside of the I/O Manager, and they provide some convenient functions that all chunked providers can use.

How do Chunks Work?

When a chunked cloud provider is first initialized, it is asked about its capabilities, such as whether it can perform partial reads, whether the remote server is performing proper multi-threaded transactional synchronization, and so on. In other words, the chunking system needs to know how advanced the provider is, and based on those capabilities it will construct a custom chunk I/O processing pipeline for that particular provider.

The chunk I/O pipeline provides automatic services for the provider such as:

- Whole and partial caching of chunks for performance reasons.
- Performing checksum insertion on write, and checksum verification on read.
- Read or write (or both) transactional locking for cloud providers that require it (for example, never try to read chunk 458 when chunk 458 is being written to).
- Translation of I/O that would end up being a partial chunk read / write request into a whole chunk read / write request for providers that require this. This is actually very complicated.
  - If a partial chunk needs to be read, and the provider doesn't support partial reads, the whole chunk is read (and possibly cached) and only the part needed is returned.
  - If a partial chunk needs to be written, and the provider doesn't support partial writes, then the whole chunk is downloaded (or retrieved from the cache), only the part that needs to be written to is updated, and the whole chunk is written back.
    - If while this is happening another partial write request comes in for the same chunk (in parallel, on a different thread), and we're still in the process of reading that whole chunk, then coalesce the [whole read -> partial write -> whole write] into [whole read -> multiple partial writes -> whole write]. This is purely done as an optimization and is also very complicated.
And in the future the chunk I/O processing pipeline can be extended to support other services as the need arises.
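
The partial-to-whole translation described above can be sketched in a few lines of Python. This is a simplified, single-threaded illustration under stated assumptions: `WholeChunkProvider`, `partial_read`, and `partial_write` are hypothetical names, the provider is an in-memory stand-in, and the real pipeline additionally handles caching, locking, and coalescing across threads.

```python
class WholeChunkProvider:
    """Hypothetical provider that can only read and write whole chunks."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.chunks = {}
        self.reads = 0
        self.writes = 0

    def read_chunk(self, index):
        self.reads += 1
        # A chunk that was never written reads back as zeros.
        return self.chunks.get(index, bytes(self.chunk_size))

    def write_chunk(self, index, data):
        assert len(data) == self.chunk_size, "only whole chunks can be written"
        self.writes += 1
        self.chunks[index] = bytes(data)


def partial_read(provider, index, offset, length):
    """Read the whole chunk and return only the part that was asked for."""
    whole = provider.read_chunk(index)
    return whole[offset:offset + length]


def partial_write(provider, index, patches):
    """Apply one or more (offset, data) patches using a single whole read
    and a single whole write: whole read -> partial write(s) -> whole write."""
    whole = bytearray(provider.read_chunk(index))
    for offset, data in patches:
        whole[offset:offset + len(data)] = data
    provider.write_chunk(index, whole)


provider = WholeChunkProvider(chunk_size=8)
partial_write(provider, 0, [(2, b"ab"), (6, b"z")])
print(partial_read(provider, 0, 0, 8), provider.writes)
```

Passing several patches to one `partial_write` call mimics the coalesced case, where multiple partial writes share one whole-chunk read and one whole-chunk write.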

Large Chunk Support

Speaking of extending the chunk I/O pipeline, that's exactly what happened recently with the addition of large chunk support (> 1 MB) for most cloud providers.

Previously, most cloud providers were limited to a maximum chunk size of 1 MB. This limit was in place because:

- Cloud drive read I/O requests, which can't be satisfied by the local cache, would generate provider read I/O that needed to be satisfied fairly quickly. For providers that didn't support partial reads, this meant that the entire chunk needed to be downloaded at all times, no matter how much data was being read.
- Additionally, if checksumming was enabled (which would be typical), then by necessity, only whole chunks could be read and written.

This had some disadvantages, mostly for users with fast broadband connections:

- Writing a lot of data to the provider would generate a lot of upload requests very quickly (one request per 1 MB uploaded). This wasn't optimal because each request would add some overhead.
- Generating a lot of upload requests very quickly was also an issue for some cloud providers that were limiting their users based on the number of requests per second, rather than the total bandwidth used. Using smaller chunks with a fast broadband connection and a lot of threads would generate a lot of requests per second.

Now, with large chunk support (up to 100 MB per chunk in most cases), we don't have those disadvantages.
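
A quick back-of-the-envelope calculation shows the difference in request counts. The 10 GB upload and the 50 MB/s upload speed below are made-up figures for illustration, and the sketch assumes one upload request per chunk, with 1 MB taken as 1024 × 1024 bytes.

```python
MB = 1024 * 1024


def upload_requests(total_bytes, chunk_bytes):
    """Number of whole-chunk upload requests needed (rounded up)."""
    return -(-total_bytes // chunk_bytes)


def requests_per_second(upload_bytes_per_second, chunk_bytes):
    """Upload requests per second needed to keep the connection saturated."""
    return upload_bytes_per_second / chunk_bytes


total = 10 * 1024 * MB  # a hypothetical 10 GB upload
for chunk_mb in (1, 100):
    print(chunk_mb, "MB chunks:",
          upload_requests(total, chunk_mb * MB), "requests,",
          requests_per_second(50 * MB, chunk_mb * MB), "requests/s")
```

Under these assumptions, the same upload drops from 10240 requests with 1 MB chunks to 103 requests with 100 MB chunks.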

What was changed to allow for large chunks?

- In order to support the new large chunks a provider has to support partial reads. That's because it's still very necessary to ensure that all cloud drive read I/O is serviced quickly.
- Support for a new block based checksumming algorithm was introduced into the chunk I/O pipeline. With this new algorithm it's no longer necessary to read or write whole chunks in order to get checksumming support. This was crucial because it is very important to verify that your data in the cloud is not corrupted, and turning off checksumming wasn't a very good option.
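
The details of the block based checksumming algorithm aren't spelled out here, so the following Python sketch only illustrates the general idea: if every fixed-size block inside a chunk has its own checksum, a partial read only needs to fetch and verify the blocks that cover the requested range. The use of CRC32, the tiny block size, the separate checksum list, and all of the names are assumptions for the example, not the actual on-provider format.

```python
import zlib


def block_checksums(chunk, block_size):
    """One CRC32 per block_size-sized block of the chunk."""
    return [zlib.crc32(chunk[i:i + block_size])
            for i in range(0, len(chunk), block_size)]


def verified_read(fetch, checksums, offset, length, block_size):
    """Return `length` bytes at `offset`, fetching and verifying only the
    blocks that cover that range. `fetch(start, end)` stands in for a
    provider partial read and may return fewer bytes at the end of a chunk."""
    if length <= 0:
        return b""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    start = first * block_size
    data = fetch(start, (last + 1) * block_size)
    for block in range(first, last + 1):
        rel = (block - first) * block_size
        if zlib.crc32(data[rel:rel + block_size]) != checksums[block]:
            raise ValueError(f"checksum mismatch in block {block}")
    return data[offset - start:offset - start + length]


chunk = bytes(range(10))
sums = block_checksums(chunk, block_size=4)
print(verified_read(lambda s, e: chunk[s:e], sums, 5, 4, 4))
```

With a whole-chunk checksum, the same read would have required downloading the entire chunk before anything could be verified.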

Are there any disadvantages to large chunks?

- If you're concerned about using the least possible amount of bandwidth (as opposed to using fewer provider calls per second), it may be advantageous to use smaller chunks. If you know for sure that you will be storing relatively small files (1-2 MB per file or less) and you will only be updating a few files at a time, there may be less overhead when you use smaller chunks.
- For providers that don't support partial writes (most cloud providers), larger chunks work better if most of your files are > 1 MB, or if you will be updating a lot of smaller files all at the same time.
- As far as checksumming, the new algorithm completely supersedes the old one and is enabled by default on all new cloud drives. It really has no disadvantages over the old algorithm. Older cloud drives will continue to use the old checksumming algorithm, which really only has the disadvantage of not supporting large chunks.
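
To get a feel for the bandwidth trade-off, here is a rough worst-case estimate in Python for a provider without partial writes. It assumes every chunk touched by a write is downloaded in full (unless it is already cached) and uploaded in full, and it ignores any other optimizations; the function name and the sample sizes are hypothetical.

```python
MB = 1024 * 1024


def worst_case_transfer(write_offset, write_length, chunk_bytes, cached=False):
    """Return (download_bytes, upload_bytes) for one write on a provider
    that can only write whole chunks."""
    first = write_offset // chunk_bytes
    last = (write_offset + write_length - 1) // chunk_bytes
    touched = last - first + 1
    # Touched chunks are read first, unless the cache already holds them.
    download = 0 if cached else touched * chunk_bytes
    upload = touched * chunk_bytes
    return download, upload


# Updating a single 2 MB file with small and with large chunks.
for chunk_mb in (1, 100):
    down, up = worst_case_transfer(0, 2 * MB, chunk_mb * MB)
    print(chunk_mb, "MB chunks:", down // MB, "MB down,", up // MB, "MB up")
```

In this estimate, the 2 MB update moves 2 MB each way with 1 MB chunks but up to 100 MB each way with 100 MB chunks, which is why small, isolated updates can favor smaller chunks.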

Which providers currently support large chunks?

- I don't want to post a list here because it would become obsolete as we add more providers.
- When you create a new cloud drive, look under Advanced Settings. If you see a storage chunk size > 1 MB available, then that provider supports large chunks.

Going Chunk-less in the Future?

I should mention that StableBit CloudDrive doesn't actually require chunks. If it at all makes sense for a particular provider, it really doesn't have to store its data in chunks. For example, it's entirely possible that StableBit CloudDrive will have a provider that stores its data in a VHDX file. Sure, why not. For that matter, StableBit CloudDrive providers don't even need to store their data in files at all. I can imagine writing providers that can store their data on an email server or a NNTP server (a bit of a stretch, yes, but theoretically possible).

In fact, the only thing that StableBit CloudDrive really needs is some system that can save data and later retrieve it (in a timely manner). In that sense, StableBit CloudDrive can be a general purpose drive virtualization solution.

Edited December 2, 2015 by Alex
The potential disadvantages of large chunks should be made clear vs. disadvantages of the large chunk checksumming algorithm

Choruptian · May 19, 2016

I apologize for cluttering this thread, but as a newcomer it would be nice to have just a very short explanation, with pros and cons, of the two options currently featured in the Advanced Settings: "Storage chunk size" and "Chunk cache size". The same goes for everything else, because I can't seem to find very much information about the other options either, although I've come to understand most of them.

danjames9222 · April 30, 2017

I would also like to know what this means.

What happens if say we set Storage chunk size to 20MB and Chunk Cache Size to 100MB?

What is the effect?

Christopher (Drashna) · May 1, 2017

The "Storage Chunk Size" is the size that we use on the provider. For the most part, changing this doesn't really affect things one way or another, because we generally perform partial reads and don't need the whole chunk.

However, we can grab the entire chunk at once, when it makes sense.

The big thing is that this does affect the number of files that are stored on a provider (some have issues).

That said, if you set the "minimum download size", it does affect the largest size that this can be set to.

The "chunk cache size" is the size that we store on the cache. Since this affects the IO throughput for the cache drive, this can affect performance on the drive.

Jonibhoni · April 9

On 12/2/2015 at 3:27 PM, Alex said:

If a partial chunk needs to be written, and the provider doesn't support partial writes, then the whole chunk is downloaded (or retrieved from the cache), only the part that needs to be written to is updated, and the whole chunk is written back.

Wouldn't it be more efficient to just load the parts of the chunk that were not modified by the user, instead of the whole chunk? One could on average save half the downloaded volume, if I'm correct. Egress is expensive with cloud providers. 😅

Think: I modify the first 50 MB of a 100 MB chunk. Why is the whole 100 MB chunk downloaded just to overwrite (= throw away) the first 50 MB after downloading?
