First off, I want to say that I'm a trial user of build 854, and I'm deeply impressed by this software; however, there is one major issue standing in the way of my use case.
I have tried four different torrent clients, downloading and uploading directly to and from a CloudDrive. They all work reasonably well for a small number of parallel torrent downloads, with overall download speed decreasing by about 66%. That's reasonable overhead, and it's reasonable to have to limit the maximum number of files being processed simultaneously.
However, when something does go wrong (inevitably, and frequently if too many torrents are downloading), the clients will want to hash-check the files to make sure nothing is corrupted.
This, however, does not work, or at least not well: the hashing rate is an order of magnitude slower than if the file were stored locally. The flaw is so pronounced that it is quicker to simply redownload a torrent from scratch.
rTorrent (Linux-based, but running under Cygwin) performs best, with what seem to be optimizations that allow it to skip large parts of the hashing process.
Still, a ~12 GB file downloaded to 50% will take a day to fully hash-check; furthermore, you can copy the same file locally and hash-check it there before a client checking directly from the CloudDrive gets to 5%.
Copying locally is not a solution, however, as I have torrents whose partial download state exceeds my total local storage capacity.
It shouldn't be this slow. In practice, the entirety of the written torrent pieces will have to be downloaded and each of them checked, but it's quite clear that the prefetcher is not managing to cache the file before the client needs it, or some other problem is in play.
I suspect, from limited insight, that the clients are not hashing the torrent pieces in sequential order (I'm fairly sure of this), and are instead skipping around throughout the file, which may be confusing the prefetcher and/or cache.
Furthermore, and this is just a guess, since torrent pieces usually don't align with the provider chunk size, the drive may frequently download a chunk, serve a read for only one of the two or more pieces contained in that chunk (depending on both piece and chunk size), and then discard it from the cache, because of the potentially considerable wait before the torrent client randomly comes back to check the neighboring pieces stored in that same chunk.
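As a rough sketch of what I mean (the 10 MB chunk size and 1 MiB piece size are just assumed example values, not anything I've confirmed about CloudDrive's internals):

```python
CHUNK = 10 * 1000 * 1000  # assumed provider chunk size, 10 MB
PIECE = 1 << 20           # assumed torrent piece size, 1 MiB

def chunks_for_piece(piece_index):
    """Provider chunks that must be fetched to hash one torrent piece."""
    start = piece_index * PIECE
    end = start + PIECE - 1
    return list(range(start // CHUNK, end // CHUNK + 1))

# Each 10 MB chunk holds ~9.5 pieces, so roughly every tenth piece
# straddles a chunk boundary and needs two chunks fetched; hashed in
# random order, the same chunk may be downloaded up to ~10 times.
for i in range(12):
    print(i, chunks_for_piece(i))
```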
It's also possible that the root of the problem lies in the fact that the clients expect a sparse filesystem, which (and I'm unclear on the details here) lets them know to skip hash-checking of pieces that are zeroed (not yet written). It's possible that CloudDrive doesn't handle this sparse storage, and is actually writing out all of the zeros to the cloud, also requiring the torrent client to check them. I'm further inclined to believe the allocation of zero space is to blame: when copying a half-downloaded file with Explorer, the transfer status doesn't count progress against the size on disk (the data actually downloaded), but rather against the completed file size (the size it would be if fully present).
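One way to test the sparse-allocation theory (just a sketch; the path is hypothetical, and it assumes a Windows/NTFS-style volume where GetCompressedFileSizeW reports allocated rather than logical size):

```python
import ctypes
import os

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.GetCompressedFileSizeW.restype = ctypes.c_ulong

def size_on_disk(path):
    """Physical bytes allocated for a file (sparse/compressed aware)."""
    high = ctypes.c_ulong(0)
    low = kernel32.GetCompressedFileSizeW(path, ctypes.byref(high))
    if low == 0xFFFFFFFF and ctypes.get_last_error() != 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return (high.value << 32) + low

path = r"X:\torrents\half-downloaded.mkv"  # hypothetical file on the CloudDrive
logical = os.path.getsize(path)
physical = size_on_disk(path)
print(f"logical: {logical:,} bytes, allocated on disk: {physical:,} bytes")
# If a half-finished torrent shows physical ~= logical, the zeros were
# really written out, i.e. the file is not sparse on the CloudDrive.
```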
The problem could also have its roots in the fact that a torrent client doesn't download a file's pieces sequentially and may take quite a while to complete any given download, which (and this is purely a guess) causes the pieces to get mixed in with other downloads and scattered across many different non-sequential chunks.
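To get a feel for how much random-order checking could cost, here's a toy simulation (the chunk size, piece size, and plain LRU eviction are all assumptions standing in for CloudDrive's real cache, which I obviously don't have visibility into):

```python
import random
from collections import OrderedDict

CHUNK = 10 * 1000 * 1000   # assumed provider chunk size, 10 MB
PIECE = 1 << 20            # assumed torrent piece size, 1 MiB
FILE_SIZE = 12 * 1000**3   # the ~12 GB file from above
CACHE_CHUNKS = 200         # ~2 GB of cache / 10 MB chunks

def chunk_downloads(piece_order):
    """Chunks fetched to hash pieces in the given order, with an LRU cache."""
    cache = OrderedDict()
    downloads = 0
    for p in piece_order:
        start, end = p * PIECE, p * PIECE + PIECE - 1
        for c in range(start // CHUNK, end // CHUNK + 1):
            if c in cache:
                cache.move_to_end(c)       # cache hit, refresh LRU position
            else:
                downloads += 1             # cache miss, fetch from provider
                cache[c] = True
                if len(cache) > CACHE_CHUNKS:
                    cache.popitem(last=False)
    return downloads

pieces = list(range(FILE_SIZE // PIECE))
print("sequential:", chunk_downloads(pieces), "chunk downloads")
random.shuffle(pieces)
print("random:    ", chunk_downloads(pieces), "chunk downloads")
```

With these assumed numbers, the sequential pass fetches each chunk exactly once, while the random order re-fetches chunks roughly an order of magnitude more often, which would line up with the slowdown I'm seeing.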
All things considered, the prefetcher manages to get cache hits in the range of 60-85%, with about 2 GB utilized at any given time, configured with a 10 MB trigger, 20 MB read-ahead, and 300-second expiration.