Covecube Inc.
andy012345

Prefetch block size too small

Question

Is there any way we can customize this?

 

With Google Drive for Work, pre-fetch constantly fetches 1 MB blocks per request, causing a large number of API hits and limiting throughput to the drive.

 

On a 1 Gbps line, pre-fetch can achieve about 60-70 Mbps if nothing else is happening at the time. If the drive is uploading/verifying/etc., it's much slower. This will vary with latency, of course, since small blocks make the per-request time disproportionately large compared to the download time.

 

I'm currently testing out 4k content streaming and it's just not possible with the throughput offered at the moment.

 

Could an option be added to let it be anything up to the block size? This would let us scale prefetched data on fast lines much better (a 400 MB prefetch on 20 threads at a 20 MB block size would be crazy fast).
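For a rough sense of why small request blocks cap throughput: each request pays a round-trip of latency before any data flows, so on a fast line most of the time is spent waiting rather than downloading. Here is a minimal back-of-the-envelope model (all numbers and names are illustrative, not CloudDrive internals):

```python
def effective_mbps(block_mb, line_mbps, rtt_s, threads=1):
    """Rough throughput model: each request of `block_mb` megabytes
    pays one round-trip (`rtt_s` seconds) before transferring at the
    line rate. Returns aggregate Mbps across `threads` parallel
    requests, capped at the line rate."""
    transfer_s = (block_mb * 8) / line_mbps        # seconds to move one block
    per_thread = (block_mb * 8) / (rtt_s + transfer_s)
    return min(line_mbps, per_thread * threads)

# 1 MB blocks on a 1 Gbps line with 100 ms latency: in the same
# ballpark as the 60-70 Mbps figure reported above.
print(round(effective_mbps(1, 1000, 0.1)))    # -> 74
# 20 MB blocks on the same line, single thread: far closer to line rate.
print(round(effective_mbps(20, 1000, 0.1)))   # -> 615
```

With 20 threads of 20 MB blocks the model saturates the line entirely, which matches the intuition in the post.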

22 answers to this question

The block size should be okay. 

 

But yes, you can customize this: click "Disk Options -> Performance -> I/O Performance". It should be there, under the advanced settings.

 

The Prefetch trigger is how much data must be read before prefetching starts. The block size is how much it will download, and the prefetch window is how long it will keep the data (the larger the block size, the longer you want this).
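To make the interaction of those three settings concrete, here is a hypothetical sketch (names and logic are illustrative, not the actual driver code): once the trigger amount has been read sequentially, the prefetcher schedules the forward amount of read-ahead as block-sized requests.

```python
def plan_prefetch(sequential_bytes, trigger, forward, block):
    """Hypothetical sketch: return the (offset, size) requests a
    prefetcher would issue. Nothing happens until `trigger` bytes have
    been read sequentially; after that, `forward` bytes of read-ahead
    are split into `block`-sized requests."""
    if sequential_bytes < trigger:
        return []
    requests, offset = [], 0
    while offset < forward:
        size = min(block, forward - offset)
        requests.append((offset, size))
        offset += size
    return requests

MB = 1024 * 1024
# Below the trigger: no prefetch at all.
print(plan_prefetch(0, 1 * MB, 8 * MB, 1 * MB))            # -> []
# Above the trigger: 8 MB of read-ahead as eight 1 MB requests.
print(len(plan_prefetch(2 * MB, 1 * MB, 8 * MB, 1 * MB)))  # -> 8
```

The window setting would then govern how long each fetched block stays cached before being discarded.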


The block size should be okay. 

 

But yes, you can customize this: click "Disk Options -> Performance -> I/O Performance". It should be there, under the advanced settings.

 

The Prefetch trigger is how much data must be read before prefetching starts. The block size is how much it will download, and the prefetch window is how long it will keep the data (the larger the block size, the longer you want this).

 

All I see are trigger, forward, and time window; there is no option to change the block size, which is always 1 MB.

 

http://i.imgur.com/wEDWKw1.png


Ah, the partial read (block) size.  

 

The Prefetching option has a "read ahead" size, that determines how much to grab. 

 

However, the drive is broken up into discrete units (chunks), whose size is specified when the drive is created. Unfortunately, the 1 MB chunk size cannot be changed once the drive is created; the only way to change it is to create a new drive.


Hi Christopher,

 

The block size is 20 MB. My problem is that prefetch only makes 1 MB partial requests, which is slow and frustrating. If prefetch ever spawns enough connections to handle a 4K video, it gets rate limited by Google.

 

Thanks


A logical optimization for the pre-fetcher would be to combine consecutive reads into a single download thread. For instance, in this screenshot:

 

https://i.imgur.com/3nO6yHC.png

 

The 8 highlighted requests should be combined into a single 8 MB download rather than 8x1 MB downloads, since they are all consecutive reads of the same chunk. Perhaps the pre-fetcher should wait a second or so to see how far ahead it wants to read, and then fire off a single consolidated download request.
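The suggested consolidation amounts to merging adjacent byte ranges before dispatching downloads. A sketch of that idea (hypothetical, not the actual prefetcher):

```python
def coalesce(requests, max_merged):
    """Merge adjacent (offset, size) reads into larger downloads,
    capping each merged request at `max_merged` bytes."""
    merged = []
    for off, size in sorted(requests):
        if merged:
            prev_off, prev_size = merged[-1]
            # Extend the previous request when this read starts exactly
            # where it ended and the cap is not exceeded.
            if prev_off + prev_size == off and prev_size + size <= max_merged:
                merged[-1] = (prev_off, prev_size + size)
                continue
        merged.append((off, size))
    return merged

MB = 1024 * 1024
reads = [(i * MB, MB) for i in range(8)]   # eight consecutive 1 MB reads
print(coalesce(reads, 10 * MB))            # -> [(0, 8388608)]  one 8 MB download
print(len(coalesce(reads, 2 * MB)))        # -> 4  (capped at 2 MB each)
```

The small delay suggested above would simply let more requests accumulate before `coalesce` runs.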


Hi Christopher,

 

The block size is 20 MB. My problem is that prefetch only makes 1 MB partial requests, which is slow and frustrating. If prefetch ever spawns enough connections to handle a 4K video, it gets rate limited by Google.

 

Thanks

 

 

A logical optimization for the pre-fetcher would be to combine consecutive reads into a single download thread. For instance, in this screenshot:

 

https://i.imgur.com/3nO6yHC.png

 

The 8 highlighted requests should be combined into a single 8 MB download rather than 8x1 MB downloads, since they are all consecutive reads of the same chunk. Perhaps the pre-fetcher should wait a second or so to see how far ahead it wants to read, and then fire off a single consolidated download request.

 

 

In theory, the software should be doing that, as of the later builds.  

 

If you're not seeing that, then grab logs and let me know.

 

http://wiki.covecube.com/StableBit_CloudDrive_Drive_Tracing


Have uploaded logs to that Dropbox link - wasn't able to reference this thread when I uploaded, but hopefully it ends up where it needs to go.

 

I watched the pre-fetcher get maybe ~50 MB of data in consecutive 1 MB reads - ideally that should have been consolidated into around five 10 MB reads. The drive was created with default settings (10 MB chunk size).


Have uploaded logs to that Dropbox link - wasn't able to reference this thread when I uploaded, but hopefully it ends up where it needs to go.

 

I watched the pre-fetcher get maybe ~50 MB of data in consecutive 1 MB reads - ideally that should have been consolidated into around five 10 MB reads. The drive was created with default settings (10 MB chunk size).

 

Did you also change the minimum download size? The default is 1 MB.


Did you also change the minimum download size? The default is 1 MB.

 

No, it's still default. Increasing it would just make smaller reads more inefficient. If the pre-fetcher consolidated consecutive reads into a single download thread, we could have the best of both worlds. This is possibly more of an issue with a low download thread count - for instance, with a 2-thread limit only 2x1 MB can be downloaded at once, whereas optimally it would be 2x10 MB.

 

It looks like it's already supposed to do something similar, though it might either be bugged or need tweaking. From the changelog:

.598
* [D] The kernel prefetcher is now able to perform special aligned (long range) prefetch requests. These can be as large as 100 MB per request, and 
      multiple requests are permitted. These special long range requests are only allowed if the entire prefetch range can perform prefetching. 
      If the long range prefetch check fails, then a standard more granular prefetch takes place (using 512 KB blocks). The more granular prefetch 
      is now optimized to combine multiple smaller requests into one or more larger requests, if possible, in order to reduce overhead.
      - Aligned long range requests are possible under these circumstances:
        - If legacy chunk verification is used, alignment is set to the chunk size.
        - If block based chunk verification is used, alignment is set to 1 MB.
        - If minimum read size is specified, then alignment is set to the minimum read size or to the above, whichever is greater (use this with 
          caution as it can reduce the effectiveness of multi-threading while downloading, depending on your settings).
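Restating the alignment rules from that changelog entry as code (a paraphrase for clarity, not Covecube's implementation):

```python
MB = 1024 * 1024

def long_range_alignment(legacy_verification, chunk_size, min_read_size=0):
    """Paraphrase of the .598 changelog rules: legacy chunk
    verification aligns to the chunk size, block-based verification
    aligns to 1 MB, and a configured minimum read size raises the
    alignment to whichever is greater."""
    base = chunk_size if legacy_verification else 1 * MB
    return max(base, min_read_size)

print(long_range_alignment(True, 10 * MB) // MB)          # -> 10 (legacy: chunk size)
print(long_range_alignment(False, 10 * MB) // MB)         # -> 1  (block-based: 1 MB)
print(long_range_alignment(False, 10 * MB, 4 * MB) // MB) # -> 4  (min read size wins)
```

As the changelog warns, a large minimum read size raises the alignment and can reduce the effectiveness of multi-threaded downloading.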


Yup, it *should* be doing so. 

 

If it's not, grab logs, and let us know. 

 

Submitted logs via the Dropbox link of it not working as expected yesterday. The OP also linked logs in post #8; hopefully it's an easy thing to fix/tweak.


Submitted logs via the Dropbox link of it not working as expected yesterday. The OP also linked logs in post #8; hopefully it's an easy thing to fix/tweak.

 

 

By chance, is the cluster size 64 KB on the drive (ReFS formatted, or larger than a 16 TB drive)?

 

 

Also, what version is being used specifically? 


By chance, is the cluster size 64 KB on the drive (ReFS formatted, or larger than a 16 TB drive)?

 

 

Also, what version is being used specifically? 

 

I captured those logs on a 2 GB NTFS drive (Google Drive, though the issue isn't provider-specific - it happens on Dropbox and Amazon drives too) created with default settings (10 MB chunks, 4 KB sector size, 20 MB cache chunk, no minimum download size, NTFS, Windows default cluster size) to demonstrate the issue in the latest beta (.722). The cluster size was set to 'Windows Default', so 4 KB IIRC(?).

The drive was empty apart from a single 800 MB video file I was using to demonstrate the issue. After copying the file across, I cleared the cache and restarted the machine, then simply played the video with drive tracing enabled. Pre-fetch settings were default. The 'technical details' window showed lots of individual 1 MB pre-fetch reads that weren't consolidated into single download threads, even though they were happening simultaneously on consecutive parts of the same chunk.


It isn't just prefetch that's a problem; simple reads are too. Reads are limited to 1 MB - for example, if I turn pre-fetch off, my 1 Gbps line can pull from CloudDrive at ~1.5 Mbps.

 

The prefetch was 2 MB in 463, which suggests something changed it to 1 MB. But 2 MB is still ridiculously low for fast lines.

 

I appreciate that people likely didn't pick up on this in the past because the bandwidth of a 1080p stream isn't that high, but once you go past 1080p, CloudDrive just can't provide the throughput anymore. I also think this is the reason a lot of people report getting rate limited by Google.


I'm OK with standard reads being 1 MB. The pre-fetcher should kick in whenever anything larger is done anyway - copying large files, playing videos, etc. If the pre-fetcher worked as intended it would potentially be much faster and more efficient - i.e., a single 10 MB read instead of 10x1 MB reads. And yeah, it would lower API calls on the provider too, so you'd be much less likely to be throttled.


I'm running into this with prefetch and reads as well. I can upload very fast (500-600 Mbit), but only get 40-60 Mbit reads with this 1 MB prefetch chunk behaviour. I can't seem to find the setting to change this in the advanced config file, while creating a drive, or in I/O performance. So what is the solution to changing prefetch to 20 MB chunks? Thank you for your time.


Any updates on this? Optimizing this could have a significant impact on speeds. If we're going to pre-fetch 50 MB, it'll be much more efficient to do so with 5x10 MB downloads rather than 50x1 MB ones - that's a potential 90% saving in API requests on drives with default settings.


I've found that manually increasing IoManager_DefaultMaximumReadAggregation via the config file and then restarting the service helps work around this. I'm not entirely sure what it does, as there's only minimal documentation in the changelog, so I don't know the full consequences of altering it.

 

By default it's set to 1048576. My observation is that by increasing it to a value over 10 MB, the pre-fetcher downloads up to the drive's chunk size (10 MB) per pre-fetch download request, rather than splitting it into lots of smaller 1 MB requests. Whether or not this is a good thing depends on the use case. As a 10 MB chunk downloads in less than a second for me (after the delay/throttling response), splitting the 10 MB request into separate threads isn't really necessary, and I'd rather use fewer API calls.
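Under the assumption that reads merge up to the smaller of the aggregation cap and the chunk size, the API-request count for a contiguous prefetch works out as follows (illustrative arithmetic only; the exact semantics of IoManager_DefaultMaximumReadAggregation are undocumented):

```python
import math

def request_count(prefetch_bytes, aggregation_cap, chunk_size):
    """Requests needed for a contiguous prefetch, assuming each merged
    request is capped at min(aggregation_cap, chunk_size) bytes."""
    per_request = min(aggregation_cap, chunk_size)
    return math.ceil(prefetch_bytes / per_request)

MB = 1024 * 1024
# Default 1 MiB cap: a 50 MB prefetch costs 50 API requests.
print(request_count(50 * MB, 1 * MB, 10 * MB))   # -> 50
# Cap raised above the 10 MB chunk size: the same prefetch costs 5.
print(request_count(50 * MB, 16 * MB, 10 * MB))  # -> 5
```

That 50-to-5 reduction matches the ~90% API-request saving estimated earlier in the thread.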

