Everything posted by Firerouge

  1. Can this be re-evaluated? With Workspace drives now being limited by Google, Team Drives are the only Google option with no size-based quota. Large chunk sizes could help mitigate the 400k file limit: with 100 MB chunks, 40 TB Team Drives may be possible.
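     As a rough back-of-the-envelope check (a sketch in Python; the 400k item cap and 100 MB chunk size are the figures from this post, not confirmed limits):

         # Approximate Team Drive capacity if every chunk file is a full 100 MB.
         file_limit = 400_000          # rough Google Shared Drive item limit
         chunk_size_mb = 100           # proposed CloudDrive chunk size
         capacity_tb = file_limit * chunk_size_mb / 1_000_000
         print(f"~{capacity_tb:.0f} TB")   # -> ~40 TB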
  2. With the new hierarchical chunk organization, shouldn't this now be technically possible?
  3. Whoa, that's way slower than I expected; you're seeing only about 5.5 TB migrated per day! What system specs or resource consumption are you seeing? Does it seem bottlenecked by anything other than Google's concurrency limit?
  4. It's certainly possible, but cloud hosting can be prohibitively expensive if you intend to get a system capable of hardware transcoding (a computational must for more than one or two high-resolution streams) along with the bandwidth capacity. Furthermore, you'll probably want to look at providers with locality to your client streaming locations. It's also important, since you mention using a (sketchy) seedbox host, that you don't attempt to download torrents directly into your cloud drive; you will almost certainly fragment the filesystem and nullify the capabilities of the prefetcher. But fundamentally the cloud drive migration is as simple as unmounting from one location and remounting in another.
  5. The new beta looks to have major improvements to the migration process; make sure you're on it before reporting any additional bugs or doing something crazy like deleting the local cache.
     .1307
     * Added detailed logging to the Google Drive migration process that is enabled by default.
     * Redesigned the Google Drive migration process to be quicker in most cases:
       - For drives that have not run into the 500,000 files per folder limit, the upgrade will be nearly instantaneous.
       - Is able to resume from where the old migration left off.
     * [Issue #28410] Output a descriptive Warning to the log when a storage provider's data organization upgrade fails.
  6. I'm still patiently holding off on the conversion. It sounds like it works, but I'm waiting to get a better idea of the time it takes by drive data size. I've noticed that, without any changed settings, these past few days I've gotten a couple of yellow I/O error warnings about the user upload rate limit being exceeded (which otherwise haven't been a problem), and I've noticed Google Drive-side upload throttling at a lower than normal concurrency, only 4 workers at 70 Mbit. I'm guessing some of the rate limit errors people may be seeing while converting are transient from Google Drive being under high load.
  7. I'm guessing this latest beta changelog is referencing the solution to this:
     .1305
     * Added a progress indicator when performing drive upgrades.
     * [Issue #28394] Implemented a migration process for Google Drive cloud drives to hierarchical chunk organization:
       - Large drives with > 490,000 chunks will be automatically migrated.
       - Can be disabled by setting GoogleDrive_UpgradeChunkOrganizationForLargeDrives to false.
       - Any drive can be migrated by setting GoogleDrive_ForceUpgradeChunkOrganization to true.
       - The number of concurrent requests to use when migrating can be set with GoogleDrive_ConcurrentRequestCount (defaults to 10).
       - Migration can be interrupted (e.g. system shutdown) and will resume from where it left off on the next mount.
       - Once a drive is migrated (or in progress), an older version of StableBit CloudDrive cannot be used to access it.
     * [Issue #28394] All new Google Drive cloud drives will use hierarchical chunk organization with a limit of no more than 100,000 children per folder.
     Some questions: seeing as the limit appears to be around 500,000, is there an option to set the new hierarchical chunk organization folder limit to something higher than 100,000? Has anyone performed the migration yet, and approximately how long does it take to convert a 500,000-chunk drive to the new format? Seeing as there are concurrency limit options, does the process also entail a large amount of upload or download bandwidth? After migrating, is there any performance difference compared to the prior non-hierarchical chunk organization? Edit: if the chunk limit is 500,000, and chunks are 20 MB, shouldn't this be occurring on all drives over 10 TB in size? Note, I haven't actually experienced this issue and I have a few large drives under my own API key, so it may be a very slow rollout or an A/B test.
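     To make the Edit's arithmetic explicit (a quick sketch; the 20 MB chunk size and ~500,000-chunk threshold are the numbers quoted above):

         # At what drive size does a flat chunk folder hit ~500,000 chunks?
         chunk_mb = 20
         chunk_limit = 500_000
         print(chunk_mb * chunk_limit / 1_000_000, "TB")   # 10.0 -> drives over ~10 TB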
  8. It's worth mentioning that in low disk space scenarios, the drive will also stop writing entirely. With about 3 GB of space left on the cache-hosting disk (with expandable cache set to minimal), it will entirely disable upload I/O. This is independent of upload size, so for example, with 3 GB of space left on the cache drive, you'll still be unable to upload a 700 MB file. Upload I/O is also significantly slowed in the range of only 4-6 GB of free space on the cache-hosting drive. This is worth noting, as it can lead to scenarios where you're trying to move files off the cache-hosting drive into the cloud drive but are unable to make more room for the cache.
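     A minimal sketch of the behaviour described above, assuming the thresholds are roughly the observed 3 GB hard stop and 4-6 GB slowdown (these are observations from this post, not documented constants, and the cache path is hypothetical):

         import shutil

         STOP_GB = 3   # below this much free space, uploads appear to stop entirely
         SLOW_GB = 6   # between STOP_GB and this, upload I/O is heavily throttled

         def upload_state(cache_path="D:\\CloudDriveCache"):
             free_gb = shutil.disk_usage(cache_path).free / 1024**3
             if free_gb <= STOP_GB:
                 return "blocked"      # independent of how small the pending upload is
             if free_gb <= SLOW_GB:
                 return "throttled"
             return "normal"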
  9. I actually think I know what I was observing. It would seem that if the cache-hosting drive nears (or perhaps hits) its full capacity, the entirety of the cache appears to get wiped. This is probably intended behavior, so I've simply set the cache to a smaller size, which seems to more or less resolve the issue.
  10. Is there a way to set the cache cleanup/expiration time to be higher, or infinite? Essentially, I have a large expandable cache set, but over time the cache shrinks as it automatically removes data, presumably if it's not accessed soon or often enough. I'd like the cache to remain at its maximum level until I see fit to clear it myself, or until there are no longer any files in the drive to cache. Is this possible? Perhaps with one of the other cache modes besides expandable?
  11. Yes, it's probably only a partial log of it, but I submitted the archive to your dropbox submission link (labelled with this account name).
  12. I caught it happening on 861. As you can see, a 2+ minute read operation on one chunk... I've attempted to trace this (I didn't get the start, but it should have recorded this read). Upon completion of the connection, it jumped back up to its normal one to two dozen parallel write operations (most in a variant of the SharedWait state). I'll hopefully be switching to a faster-transit VPS shortly, in an effort to rule out network misconfiguration as the cause. I realize also that this is in part a limitation of the program utilizing the CloudDrive, as it seems to wait until all (or most) of the burst of operations complete before starting the next wave, so even a relatively slow 20-second read can have blocking implications for additional writes. However, a fast fix for the worst offenders (multi-minute connections) would be quite beneficial.
  13. My, you're right, it is. Thank you. I wonder why the update checker never offered it to me, though.
  14. I too have noticed this is a common user oversight in the current design. If I can make a suggestion, I think the Windows 7 screen resolution slider (sadly now gone) is a decent case study of how this can be cleanly implemented, by listing only the extremes and common middle options. Obviously a slider has limits on fine granularity, so for users not inclined to max out sliders, the box should still be typeable. I suspect a majority of users would fall into one of these common drive sizes: 1, 10, 50, 100, 256, or 512 GB, or 1, 10, 100, or 256 TB, probably most heavily dictated by the available storage options from each provider.
  15. I just now noticed that setting in the wiki as well; it isn't listed in the default config. I'm going to experiment with some variations of that setting as a solution. As for recording it, I've only ever noticed it twice, and that was just the luck of glancing at the technical log at the right time and noticing that it had dropped to one single read request, which, upon a detailed look, showed the slow speed and the 1 min+ connection timer. I'll try to create logs, but I might not have much luck pinning it down. Minor side point while I have your attention: a brand new Windows VPS running almost exclusively 854 + rtorrent + rclone occasionally has unexpected reboots during peak disk I/O. The problem seems to be described in issue 27416, ostensibly fixed a month ago, but in a seemingly unreleased version 858. Can we expect a new RC soon? The issue tracker seems to imply you're already past version 859 internally.
  16. That's true, but so will a flexible cache, which queues up writes on top of the existing cache, and if the cache drive itself gets within 6 GB of being full it'll throttle. Whereas the fixed cache will shrink the existing cache until it's all a write queue before throttling. My cache is 15 GB smaller than the 60 GB of free SSD space it is on, so a flexible cache would only get about 9 GB of queued writes before throttling, whereas a fixed cache can dedicate all 45 GB of the cache to writing (at the loss of all other cached torrent data) before throttling. Better still, since that initial preallocation write queue has replaced a portion of the cache (whereas flexible doesn't necessarily retain any of the recent write queue in cache after uploading), downloads are usually immediately faster, as they'll modify more zeroed chunks straight from the local cache.
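      The headroom numbers above, worked out explicitly (a sketch using this post's own figures; the 6 GB throttle point is the behaviour described here, not a documented constant):

          ssd_free_gb = 60    # free space on the SSD hosting the cache
          cache_gb    = 45    # configured cache size (15 GB smaller than the free space)
          throttle_gb = 6     # throttling starts when the cache disk gets this close to full

          flexible_headroom = ssd_free_gb - cache_gb - throttle_gb   # ~9 GB of queued writes
          fixed_headroom    = cache_gb                               # whole cache can become write queue
          print(flexible_headroom, fixed_headroom)                   # 9 45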
  17. I should add that the fixed cache type is another setting that directly benefits torrenting. From the Covecube blog: "Overall, the fixed cache is optimized for accessing recently written data over the most frequently accessed data." A new torrent is likely to have the majority of seeding requests, so fixed is the best cache if you're continually downloading new torrents. Plus, I prefer the predictable size of the drive cache when performing a large file preallocation.
  18. When a drive is first created, the last advanced setting, cluster size, dictates the maximum drive size. Any size over 10 TB you're required to type into the cloud drive size setting box; if you want the maximum size, simply type 256TB.
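      The reason cluster size caps drive size is the NTFS limit of roughly 2^32 clusters per volume, so the maximum is about cluster_size * 2^32 (a quick sketch; the 256 TB figure corresponds to 64 KB clusters):

          MAX_CLUSTERS = 2**32
          for cluster_kb in (4, 16, 64):
              max_tb = cluster_kb * 1024 * MAX_CLUSTERS / 1024**4
              print(f"{cluster_kb} KB clusters -> ~{max_tb:.0f} TB")
          # 4 KB -> ~16 TB, 16 KB -> ~64 TB, 64 KB -> ~256 TB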
  19. Almost every read operation finishes so quickly that it's nearly impossible to even see the connection speeds for them in the log. Occasionally, maybe one read per 100 GB, I'll get an incredibly slow read operation, sometimes taking over a minute to download the 20 MB chunk (the longest I've seen was 1:50), with speeds around 200-500 kb/s. These slow reads tend to block other operations for the program I'm using, which is pretty bad. To try to circumvent this, I edited the IoManager_ReadAbort line in advanced settings down from 1:55 to 0:30. However, this setting doesn't work as expected: instead of aborting the read and retrying, if a connection exceeds this timeframe it actually disconnects the drive (unmounts it) and presents the retry and reauthorize options in the CloudDrive UI. Retry will always reconnect it right away, but this doesn't solve the errant slow connection issue. I believe IoManager_ReadAbort would be better suited if it actually just reattempted the read connection on a timeout, instead of assuming a full provider failure. With that in mind, I propose that if IoManager_ReadAbort is triggered it should use the IoManager_ReadRetries variable to attempt a specified number of reconnects. Alternatively, a new flag, IoManager_MinSustainedReadSpeed (defined in kb/s), could be implemented to specifically retry connections with very slow read speeds, which would likely detect and rectify these connections quicker than waiting for a timeout period before retrying.
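      A sketch of the proposed retry behaviour (IoManager_MinSustainedReadSpeed is the hypothetical setting suggested above, and read_chunk is a stand-in for whatever actually fetches a chunk; none of this is CloudDrive's real code):

          import time

          def read_with_retries(read_chunk, chunk_id, retries=3,
                                read_abort_s=30, min_speed_kbps=500):
              for attempt in range(retries):
                  start = time.monotonic()
                  data = read_chunk(chunk_id, timeout=read_abort_s)  # abort the read, don't unmount
                  elapsed = max(time.monotonic() - start, 1e-6)
                  if data is not None and len(data) / 1024 / elapsed >= min_speed_kbps:
                      return data                                    # fast enough, accept it
                  # timed out or too slow: drop the connection and try again
              raise IOError(f"chunk {chunk_id} still slow after {retries} attempts")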
  20. And I agree, it wasn't a controlled test by any means; other tools were using the drive at the time I tried to defrag it, and I haven't given it a second attempt. I needed to create a new disk anyway, and preallocation eliminates the fragmentation problem. Similar point with minimum download: my initial drive configuration had a 1 MB minimum, my new one uses 5, which should hopefully perform better (fewer API requests as well). Hopefully the final builds better guide users in setting these, or ideally configure them more dynamically by need. Speaking of which, any other tips from the advanced config? LocalIo_ReleaseHandlesDelayMS looks particularly interesting.
  21. Don't disable it for hashing; watching the technical log shows that (at least rtorrent) hashes files by requesting them 1 MB at a time, and only requests the next megabyte after the previous read finishes. Furthermore, each 1 MB request shows a download speed, implying each megabyte of the CloudDrive chunk is being downloaded independently. Hashing rates skyrocket with the prefetch settings I've used versus no prefetcher. One thing I'm certain of is that the prefetcher currently only queries subsequent chunk numbers. This is obvious from the technical logs as well. It has some clever logic for existing cached blocks, but it does not find the next chunk number for a file, simply the next chunk in the CloudDrive. In my experience, the prefetcher will never prefetch the correct file if the file does not use sequentially numbered chunks. Though likely only Alex could give us the definitive answer on how it works at the moment. I actually tried Windows disk defrag, but for me it caused the drive to disconnect on version 854 during the analyze step.
  22. My config setting is different for seeding (longer time window, more data fetched than one block if using a sequential drive); what I gave was my hashing config. Since we both have 1 MB triggers, we both should cache after the client loads the first megabyte to give to the peer. You are correct that a longer wait time (while having more false positives) will allow for prefetching blocks to slower peer connections, but that impact seems minimal; particularly on scrambled drives, the minimum download size should result in caching slightly more than you need with each read, and if connection speeds are really slow, CloudDrive probably isn't the bottleneck. The limit to a single CloudDrive chunk is because, if the files are nonsequential, the next chunk will contain a totally different and useless file. More is better only if the data was preallocated, or written to the CloudDrive after a full local download first. Moral of the story: always preallocate. There are still significant additional improvements that could be made to the cache and prefetcher to better support this type of usage (as mentioned in my earlier post).
  23. I recently changed quite a few settings that have greatly improved performance; rtorrent downloads about twice as fast now. Overall, two settings are crucial:
      - Having the torrent client preallocate files (so that chunks are sequential). This solves many problems, specifically the prefetcher not fetching useful chunks.
      - Optimal prefetch settings. The breakdown of this is:
        - 1 MB prefetch trigger = the size the torrent client attempts to hash at a time.
        - 20 MB read ahead = the provider block size the filesystem was set up with (you might want this 1 MB lower, as it actually flows into the next chunk, or possibly the exact torrent client chunk size of the file you're hashing). This can (and should) be higher if you know a torrent has its data stored in sequential chunks in the CloudDrive, but if that is not the case, the additional prefetched data will not be useful.
        - 3 second trigger = roughly the longest any read request should take. You want this low enough that apps trickle-accessing data don't have enough time to read the trigger amount and cause a useless prefetch, but high enough that the hash check has time to read the 1 MB. 1 second works as well for me.
      The remainder of my settings are the established optimal Plex settings, the same as yours in all other ways except a 5 MB minimum download and different thread counts. The solution really shouldn't depend on users reconfiguring the cache for whatever scenario they're in, though. This should be an easy change to the cache: if you see a chunk being read 1 MB at a time, maybe you should just automatically cache the full chunks following those initial 1 MB reads. Even better if the prefetcher were file-boundary and chunk-placement aware, and could pull a file's next chunk even if it wasn't the sequentially next one.
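      A minimal sketch of the trigger / read-ahead idea those settings describe (names and structure are illustrative, not CloudDrive's actual prefetcher):

          import time

          TRIGGER_BYTES = 1 * 1024**2    # 1 MB, matching the client's hash read size
          READ_AHEAD    = 20 * 1024**2   # 20 MB, one provider chunk
          WINDOW_S      = 3              # the trigger amount must be read within this time

          class Prefetcher:
              def __init__(self):
                  self.run_start = None    # when the current sequential run began
                  self.run_bytes = 0       # bytes read sequentially so far
                  self.next_offset = None  # offset we expect the next read to start at

              def on_read(self, offset, length):
                  now = time.monotonic()
                  if (offset != self.next_offset or self.run_start is None
                          or now - self.run_start > WINDOW_S):
                      self.run_start, self.run_bytes = now, 0   # not sequential (or too slow): restart
                  self.run_bytes += length
                  self.next_offset = offset + length
                  if self.run_bytes >= TRIGGER_BYTES:
                      return (self.next_offset, READ_AHEAD)     # ask the cache to prefetch ahead
                  return None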
  24. You've hit exactly upon my goal: long-term seeding without long-term storage costs. I'm trying to perform that feat from within the same instance that downloads, which is important as the drive can only be mounted on any one PC at a time. This instance is a high-bandwidth, unmetered, but tightly SSD-capacity-capped VPS; both of these are important as you'll see later. The crux of the problem I'm trying to get addressed here is that performing hash checks on torrents sent straight to the drive from a torrent client (an important distinction from files placed on a CloudDrive which were first downloaded entirely) does not work. The fact that hashing straight from the CloudDrive takes about a day per 20 GB with 4 cores means no amount of infrastructure optimization or upgrades will make it scale, particularly since most clients also restrict the hashing process to one torrent at a time. This actually has one important implication for your simple solution. This is the ideal pipeline, since the quantity of torrents is limited only by bandwidth emptying the upload queue:
      add_torrent -> (download -> upload_queue)(upload -> clouddrive) -> seed
      The only time data is written to local disk is when placing it in the CloudDrive upload queue. This allows for torrents larger than the entirety of local storage capacity. While this is fine and dandy, hashing may be required, either from fastresume corruption or unscheduled shutdowns, and this simply can't be done on an online copy. So a new process must be performed every time hashing is required:
      torrent_needing_checking -> (download -> local) -> client_hash -> (upload -> clouddrive or delete & symlink) -> seed
      This requires the entirety of the torrent to be stored on local disk until the torrent completely finishes checking. That means that parallelizing the first half of the process is storage limited, torrents larger than local capacity can't be fixed, and you are practically restricted to one or two files being checked in parallel. This works alright, as I've said before (see the individual client problems in my earlier post), but if anything goes wrong during a download, or afterwards, the process must be performed entirely anew for every faulty torrent, which is tedious to do manually, harder to do programmatically, and not compatible with my requirement of working with files larger than local storage. Thanks to StableBit's seamless nature in Explorer, provided by the cache and upload queuing, files many times greater than local storage capacity should be simple to store and use directly from the cloud, but the hashing problem makes any direct torrenting (especially of large torrents) impractical for anything beyond very light home usage. This style of usage (to my knowledge) can only be done with StableBit or by fusing an rclone mount with a caching system, which is yet to be heavily documented and is still under development for Windows.
  25. I'd say it's far from a great drive to torrent to, but in a pinch it works. To recap the state of the Windows torrent clients:
      - All hash impossibly slowly (never let a torrent client close with a partial download, ever).
      - rtorrent hashes a tiny bit quicker, but has download speeds under half of what should be sustainable (probably Cygwin overhead).
      - qBittorrent will slowly accumulate more and more disk overloads, before locking up at 100% overload after a few hours.
      - Vuze will download, pause while it flushes to disk, and write an unusually large amount of extra data; it's overall probably the slowest client, though it seems to share the same minor hashing optimization as rtorrent.
      - Transmission, uTorrent, and Deluge all behave fairly similarly to the above; I haven't yet done detailed performance testing, as they have a bad habit of crashing and needing to re-hash even completed downloads.
      - None of them properly utilize the upload queue, as it never exceeds a small number of queued operations.
      I've become quite certain that the clients are causing severe fragmentation, which harms prefetching performance. For this reason, I believe the single most important setting any of these clients should have enabled is preallocate storage. This creates a considerable write queue, but decreases out-of-sequence chunks and improves initial downloading speeds (the file is cached already). Use with caution, as starting multiple torrents simultaneously can be too much for the cache to handle.