Covecube Inc.
andy012345

Prefetch block size too small

Question

Is there any way we can customize this?

 

With Google Drive for Work, pre-fetch constantly fetches 1 MB blocks per request, causing a large number of API hits and limiting throughput to the drive.

 

On a 1 Gbps line, pre-fetch can achieve about 60-70 Mbps if nothing else is happening at the time. If the drive is uploading/verifying/etc., it's much slower. This will vary with latency, of course, since small blocks make the per-request time disproportionately large compared to the download time.

 

I'm currently testing out 4k content streaming and it's just not possible with the throughput offered at the moment.

 

Could an option be added to let it be anything up to the block size? This would let us scale prefetched data on fast lines much better (a 400 MB prefetch on 20 threads at a 20 MB block size would be crazy fast).
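For a rough sense of why small request blocks cap throughput: each request pays a round-trip of latency before any data flows, so on a fast line most of the time is spent waiting rather than downloading. Here is a minimal back-of-the-envelope model (all numbers and names are illustrative, not CloudDrive internals):

```python
def effective_mbps(block_mb, line_mbps, rtt_s, threads=1):
    """Rough throughput model: each request of `block_mb` megabytes
    pays one round-trip (`rtt_s` seconds) before transferring at the
    line rate. Returns aggregate Mbps across `threads` parallel
    requests, capped at the line rate."""
    transfer_s = (block_mb * 8) / line_mbps        # seconds to move one block
    per_thread = (block_mb * 8) / (rtt_s + transfer_s)
    return min(line_mbps, per_thread * threads)

# 1 MB blocks on a 1 Gbps line with 100 ms latency: in the same
# ballpark as the 60-70 Mbps figure reported above.
print(round(effective_mbps(1, 1000, 0.1)))    # -> 74
# 20 MB blocks on the same line, single thread: far closer to line rate.
print(round(effective_mbps(20, 1000, 0.1)))   # -> 615
```

With 20 threads of 20 MB blocks the model saturates the line entirely, which matches the intuition in the post.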

22 answers to this question

The block size should be okay. 

 

But yes, you can customize this: click "Disk Options -> Performance -> I/O Performance". It should be there, under the advanced settings.

 

The Prefetch trigger is how much data must be read before prefetching starts. The block size is how much it will download, and the prefetch window is how long it will keep the data (the larger the block size, the longer you want this).
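To make the interaction of those three settings concrete, here is a hypothetical sketch (names and logic are illustrative, not the actual driver code): once the trigger amount has been read sequentially, the prefetcher schedules the forward amount of read-ahead as block-sized requests.

```python
def plan_prefetch(sequential_bytes, trigger, forward, block):
    """Hypothetical sketch: return the (offset, size) requests a
    prefetcher would issue. Nothing happens until `trigger` bytes have
    been read sequentially; after that, `forward` bytes of read-ahead
    are split into `block`-sized requests."""
    if sequential_bytes < trigger:
        return []
    requests, offset = [], 0
    while offset < forward:
        size = min(block, forward - offset)
        requests.append((offset, size))
        offset += size
    return requests

MB = 1024 * 1024
# Below the trigger: no prefetch at all.
print(plan_prefetch(0, 1 * MB, 8 * MB, 1 * MB))            # -> []
# Above the trigger: 8 MB of read-ahead as eight 1 MB requests.
print(len(plan_prefetch(2 * MB, 1 * MB, 8 * MB, 1 * MB)))  # -> 8
```

The window setting would then govern how long each fetched block stays cached before being discarded.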


The block size should be okay. 

 

But yes, you can customize this: click "Disk Options -> Performance -> I/O Performance". It should be there, under the advanced settings.

 

The Prefetch trigger is how much data must be read before prefetching starts. The block size is how much it will download, and the prefetch window is how long it will keep the data (the larger the block size, the longer you want this).

 

All I see are trigger, forward, and time window; there is no option to change the block size, which is always 1 MB.

 

http://i.imgur.com/wEDWKw1.png


Ah, the partial read (block) size.  

 

The Prefetching option has a "read ahead" size, that determines how much to grab. 

 

However, the drive is broken up into discrete units (chunks), whose size is specified when the drive is created. Unfortunately, the 1 MB chunk size cannot be changed once the drive is created; the only way to change it is to create a new drive.


Hi Christopher,

 

The block size is 20 MB. My problem is that prefetch only makes 1 MB partial requests, which is slow and frustrating. If prefetch ever spawns enough connections to handle a 4K video, it gets rate limited by Google.

 

Thanks


A logical optimization for the pre-fetcher would be to combine consecutive reads into a single download thread. For instance, in this screenshot:

 

https://i.imgur.com/3nO6yHC.png

 

The 8 highlighted requests should be combined into a single 8 MB download rather than 8x1 MB downloads, since they are all consecutive reads of the same chunk. Perhaps the pre-fetcher should wait a second or so to see how far ahead it wants to read, and then fire off a single consolidated download request.
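The suggested consolidation amounts to merging adjacent byte ranges before dispatching downloads. A sketch of that idea (hypothetical, not the actual prefetcher):

```python
def coalesce(requests, max_merged):
    """Merge adjacent (offset, size) reads into larger downloads,
    capping each merged request at `max_merged` bytes."""
    merged = []
    for off, size in sorted(requests):
        if merged:
            prev_off, prev_size = merged[-1]
            # Extend the previous request when this read starts exactly
            # where it ended and the cap is not exceeded.
            if prev_off + prev_size == off and prev_size + size <= max_merged:
                merged[-1] = (prev_off, prev_size + size)
                continue
        merged.append((off, size))
    return merged

MB = 1024 * 1024
reads = [(i * MB, MB) for i in range(8)]   # eight consecutive 1 MB reads
print(coalesce(reads, 10 * MB))            # -> [(0, 8388608)]  one 8 MB download
print(len(coalesce(reads, 2 * MB)))        # -> 4  (capped at 2 MB each)
```

The small delay suggested above would simply let more requests accumulate before `coalesce` runs.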


Hi Christopher,

 

The block size is 20 MB. My problem is that prefetch only makes 1 MB partial requests, which is slow and frustrating. If prefetch ever spawns enough connections to handle a 4K video, it gets rate limited by Google.

 

Thanks

 

 

A logical optimization for the pre-fetcher would be to combine consecutive reads into a single download thread. For instance, in this screenshot:

 

https://i.imgur.com/3nO6yHC.png

 

The 8 highlighted requests should be combined into a single 8 MB download rather than 8x1 MB downloads, since they are all consecutive reads of the same chunk. Perhaps the pre-fetcher should wait a second or so to see how far ahead it wants to read, and then fire off a single consolidated download request.

 

 

In theory, the software should be doing that, as of the later builds.  

 

If you're not seeing that, then grab logs and let me know.

 

http://wiki.covecube.com/StableBit_CloudDrive_Drive_Tracing


Have uploaded logs to that Dropbox link - wasn't able to reference this thread when I uploaded, but hopefully it ends up where it needs to go.

 

I watched the pre-fetcher get maybe ~50 MB of data in consecutive 1 MB reads - ideally that should have been consolidated into around five 10 MB reads. The drive was created with default settings (10 MB chunk size).


Have uploaded logs to that Dropbox link - wasn't able to reference this thread when I uploaded, but hopefully it ends up where it needs to go.

 

I watched the pre-fetcher get maybe ~50 MB of data in consecutive 1 MB reads - ideally that should have been consolidated into around five 10 MB reads. The drive was created with default settings (10 MB chunk size).

 

Did you also change the minimum download size? The default is 1 MB.


Did you also change the minimum download size? The default is 1 MB.

 

No, it's still default. Increasing it would just make smaller reads more inefficient. If the pre-fetcher consolidated consecutive reads into a single download thread, we could have the best of both worlds. This is possibly more of an issue with a low download thread count - for instance, with a 2-thread limit only 2x1 MB can be downloaded at once, whereas optimally it would be 2x10 MB.

 

It looks like it's already supposed to do something similar, though it might either be bugged or need tweaking. From the changelog:

.598
* [D] The kernel prefetcher is now able to perform special aligned (long range) prefetch requests. These can be as large as 100 MB per request, and 
      multiple requests are permitted. These special long range requests are only allowed if the entire prefetch range can perform prefetching. 
      If the long range prefetch check fails, then a standard more granular prefetch takes place (using 512 KB blocks). The more granular prefetch 
      is now optimized to combine multiple smaller requests into one or more larger requests, if possible, in order to reduce overhead.
      - Aligned long range requests are possible under these circumstances:
        - If legacy chunk verification is used, alignment is set to the chunk size.
        - If block based chunk verification is used, alignment is set to 1 MB.
        - If minimum read size is specified, then alignment is set to the minimum read size or to the above, whichever is greater (use this with 
          caution as it can reduce the effectiveness of multi-threading while downloading, depending on your settings).
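Restating the alignment rules from that changelog entry as code (a paraphrase for clarity, not Covecube's implementation):

```python
MB = 1024 * 1024

def long_range_alignment(legacy_verification, chunk_size, min_read_size=0):
    """Paraphrase of the .598 changelog rules: legacy chunk
    verification aligns to the chunk size, block-based verification
    aligns to 1 MB, and a configured minimum read size raises the
    alignment to whichever is greater."""
    base = chunk_size if legacy_verification else 1 * MB
    return max(base, min_read_size)

print(long_range_alignment(True, 10 * MB) // MB)          # -> 10 (legacy: chunk size)
print(long_range_alignment(False, 10 * MB) // MB)         # -> 1  (block-based: 1 MB)
print(long_range_alignment(False, 10 * MB, 4 * MB) // MB) # -> 4  (min read size wins)
```

As the changelog warns, a large minimum read size raises the alignment and can reduce the effectiveness of multi-threaded downloading.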


Yup, it *should* be doing so. 

 

If it's not, grab logs, and let us know. 

 

Submitted logs via the Dropbox link of it not working as expected yesterday. The OP also linked logs in post #8; hopefully it's an easy thing to fix/tweak.


Submitted logs via the Dropbox link of it not working as expected yesterday. The OP also linked logs in post #8; hopefully it's an easy thing to fix/tweak.

 

 

By chance, is the cluster size 64 KB on the drive (ReFS formatted, or larger than a 16 TB drive)?

 

 

Also, what version is being used specifically? 


By chance, is the cluster size 64 KB on the drive (ReFS formatted, or larger than a 16 TB drive)?

 

 

Also, what version is being used specifically? 

 

I captured those logs on a 2 GB NTFS drive (Google Drive, though the issue isn't provider-specific - it happens on Dropbox and Amazon drives too) created with default settings (10 MB chunks, 4 KB sector size, 20 MB cache chunk, no minimum download size, NTFS, Windows default cluster size) to demonstrate the issue in the latest beta (.722). The cluster size was set to 'Windows Default', so 4 KB IIRC(?).

The drive was empty apart from a single 800 MB video file I was using to demonstrate the issue. After copying the file across, I cleared the cache and restarted the machine, then simply played the video with drive tracing enabled. Pre-fetch settings were default. The 'technical details' window showed lots of individual 1 MB pre-fetch reads that weren't consolidated into single download threads, even though they were happening simultaneously on consecutive parts of the same chunk.


It isn't just prefetch that's a problem; simple reads are too. Reads are limited to 1 MB - for example, if I turn pre-fetch off, my 1 Gbps line can pull from CloudDrive at ~1.5 Mbps.

 

The prefetch was 2 MB in 463, which suggests something changed it to 1 MB. But 2 MB is still ridiculously low for fast lines.

 

I appreciate that people likely didn't pick up on this in the past because the bandwidth of a 1080p stream isn't that high, but once you go past 1080p, CloudDrive just can't provide the throughput anymore. I also think this is the reason a lot of people report getting rate limited by Google.


I'm OK with standard reads being 1 MB. The pre-fetcher should kick in whenever anything larger is done anyway - copying large files, playing videos, etc. If the pre-fetcher worked as intended it would potentially be much faster and more efficient - i.e., a single 10 MB read instead of 10x1 MB reads. And yeah, it would lower API calls on the provider too, so you'd be much less likely to be throttled.


I'm running into this with prefetch and reads as well. I can upload very fast (500-600 Mbit), but only get 40-60 Mbit reads with this 1 MB prefetch chunk behaviour. I can't seem to find the setting to change this in the advanced config file, while creating a drive, or in I/O performance. So what is the solution to changing prefetch to 20 MB chunks? Thank you for your time.


Any updates on this? Optimizing this could have a significant impact on speeds. If we're going to pre-fetch 50 MB, it'll be much more efficient to do so with 5x10 MB downloads rather than 50x1 MB ones - that's a potential 90% saving in API requests on drives with default settings.


I've found that manually increasing IoManager_DefaultMaximumReadAggregation via the config file and then restarting the service helps work around this. I'm not entirely sure what it does, as there's only minimal documentation in the changelog, so I don't know the full consequences of altering it.

 

By default it's set to 1048576. My observation is that by increasing it to a value over 10 MB, the pre-fetcher downloads up to the drive's chunk size (10 MB) per pre-fetch download request, rather than splitting it into lots of smaller 1 MB requests. Whether or not this is a good thing depends on the use case. As a 10 MB chunk downloads in less than a second for me (after the delay/throttling response), splitting the 10 MB request into separate threads isn't really necessary, and I'd rather use fewer API calls.
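Under the assumption that reads merge up to the smaller of the aggregation cap and the chunk size, the API-request count for a contiguous prefetch works out as follows (illustrative arithmetic only; the exact semantics of IoManager_DefaultMaximumReadAggregation are undocumented):

```python
import math

def request_count(prefetch_bytes, aggregation_cap, chunk_size):
    """Requests needed for a contiguous prefetch, assuming each merged
    request is capped at min(aggregation_cap, chunk_size) bytes."""
    per_request = min(aggregation_cap, chunk_size)
    return math.ceil(prefetch_bytes / per_request)

MB = 1024 * 1024
# Default 1 MiB cap: a 50 MB prefetch costs 50 API requests.
print(request_count(50 * MB, 1 * MB, 10 * MB))   # -> 50
# Cap raised above the 10 MB chunk size: the same prefetch costs 5.
print(request_count(50 * MB, 16 * MB, 10 * MB))  # -> 5
```

That 50-to-5 reduction matches the ~90% API-request saving estimated earlier in the thread.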

