Covecube Inc.
chiamarc

Help me understand DrivePool and CloudDrive interaction

Question

Hi Folks (especially Chris),

 

I'm especially frustrated right now because of a dumb mistake on my part and a likely misunderstanding of the intricacies of how, when, and why DP balances and duplicates to a cloud drive.  My setup is a local pool balanced across 5 hard drives, with several folders duplicated x2 comprising ~4.1TB.  The local DrivePool is part of a master pool that also contains a cloud drive.  The cloud drive is only allowed to contain duplicated data, and it currently stores about 1TB of duplicates from the local pool.  I only have ~5Mbps upload bandwidth, and I just spent the month of October duplicating to this cloud drive.

Yesterday I wanted to remove my local pool from the master pool because I was experiencing slow access to photos for some reason, and I was also going to explore a different strategy of just backing up to the cloud drive instead (which allows for versioning).  Well, I accidentally removed my cloud drive from the pool.  At the time, CD still had about 125G to upload, so I assume that was in the write cache, because DP was showing optimal balance and duplication.  When the drive was removed, of course, those writes were no longer necessary and were removed from CloudDrive's queue.

OK, I didn't panic, but I wanted to make sure that the time I just spent using my last courtesy month of bandwidth over 1TB was not wasted.  So I added the cloud drive back into the master pool, expecting DP to do a scan and reissue the write requests needed to duplicate the as-yet-unduplicated 125G.  But lo and behold, after balancing/duplication was complete in DP, I looked at the CD queue and saw 536G left "to upload"!  All I can say at this point is WTF?  There was very little intervening time between when I removed the cloud drive and re-added it, and almost nothing changed in the duplicated directories.

 

Can someone please explain or at least theorize?  I own DrivePool but I've been testing CloudDrive for a while now for this very reason.  I needed to assess its performance and functionality and so far it's been a very mixed bag, partly because it's relatively inscrutable.

 

Thanks,

Marc

 


7 answers to this question


Unfortunately, this *can* happen.

 

StableBit CloudDrive isn't writing to the provider the same files that you're writing to the disk.

 

Specifically, StableBit CloudDrive works like a normal disk: you write files to the drive, and the drive itself writes blocks of data.  These blocks are what get uploaded to, and downloaded from, the provider.

 

NTFS (well, every file system) will use different blocks, based on a number of factors.  So, likely, when the files were re-copied onto the drive, the data was written to different blocks.

This is actually what causes disk fragmentation on normal drives (and, technically, on CloudDrive drives as well).
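To illustrate the block-to-chunk relationship described above, here is a minimal sketch (the block and chunk sizes are assumptions for illustration, not CloudDrive's actual internals): a block's position on the virtual disk determines which provider chunk contains it, so the same file data written to different blocks dirties different chunks.

```python
# Hypothetical sketch, NOT CloudDrive's actual code.  Sizes assumed:
# a 4 KB file-system cluster and a fixed-size provider chunk.
BLOCK_SIZE = 4096               # typical NTFS cluster size (assumed)
CHUNK_SIZE = 100 * 1024 * 1024  # assumed 100 MB provider chunk

def chunk_for_block(block_number: int) -> int:
    """A block's byte offset on the virtual disk determines which
    provider chunk holds it."""
    byte_offset = block_number * BLOCK_SIZE
    return byte_offset // CHUNK_SIZE

# The same data, placed at two different block addresses by the file
# system, lands in two different provider chunks:
print(chunk_for_block(10))      # → 0
print(chunk_for_block(26_000))  # → 1 (same data, different location)
```

So even "writing back the same files" can dirty a different set of chunks than the ones that were already queued, which is why the upload queue doesn't simply pick up where it left off.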


Unfortunately, after the fact, there isn't really anything that can be done without risking the data.


But that doesn't make sense to me.  How can you check what needs to be uploaded if there can always be a discrepancy between NTFS blocks and those that comprise the chunks uploaded to the provider?  If *nothing* (or only a little) changed on the local pool, how can there be more than 4 times the data that was still waiting to be uploaded previously?

If DrivePool is writing files to the cloud drive, which get turned into NTFS blocks in the local cache, then aggregated into 100MB chunks(?) and uploaded to the provider, how can the total size of those chunks be much more than the size of the files written plus metadata?  Even if NTFS is writing to different blocks, unless we're dealing with sparse files, wouldn't CD just chunk the blocks that belong to the files waiting to be written?  What's in the other 425G worth of chunks?

Oh, I'm starting to get the picture: it's data from previously written files or whatever, because you can't change a chunk on the provider piecemeal.  So if I originally upload a chunk C that contains the blocks of 1,000 files and then change one of those blocks, CD has to upload another 100MB chunk C' to replace C?  The problem of fragmentation is highly exacerbated here because the chunks are so large, and the chunks are large for, among other reasons, transmission efficiency.  This must really stick in Alex's craw!
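The write-amplification arithmetic being described can be sketched like this (all sizes are assumptions for illustration; whether CloudDrive re-uploads whole chunks in a given case depends on its partial-write support, mentioned later in the thread):

```python
# Illustrative arithmetic only (assumed sizes): if a chunk must be
# re-uploaded whole, the cost is per dirty chunk, not per dirty block.
BLOCK_SIZE = 4 * 1024            # 4 KB NTFS cluster (assumed)
CHUNK_SIZE = 100 * 1024 * 1024   # 100 MB provider chunk (assumed)

def reupload_bytes(chunks_touched: int) -> int:
    """Bytes uploaded when each touched chunk is re-sent in full."""
    return chunks_touched * CHUNK_SIZE

# 1,000 scattered 4 KB modifications, each landing in a different chunk:
changed = 1000 * BLOCK_SIZE       # ~4 MB of actual new data
uploaded = reupload_bytes(1000)   # ~100 GB queued for upload
print(uploaded // changed)        # amplification factor: 25600
```

Under these assumptions, a few megabytes of scattered small changes can queue on the order of 100 GB, which is the shape of the surprise described in this post.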


After a recent BSOD, something like this just happened again.  The "to upload" figure was down to around 82G, and now it's back up to 175G+.  There has got to be a way to prevent this from happening... :angry:


Well, for the crash, we can't really trust the local information about what has or hasn't been uploaded.  So we re-upload everything in the cache.

 

As for "little to nothing" changing: do you have thumbnail generation disabled?  Is anything indexing the drive?  Modifying access times, or even modifying the data?  And there's all the metadata (which you don't see).

Additionally, Windows enables "shadow copies" on drives by default.  This is versioning information, and it can use up to 10% of the disk by default.  If you're adding or modifying data a lot, this can bloat the usage, a lot.
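For a sense of scale on the shadow-copy point, the default ~10% cap mentioned above works out like this (the helper below is just illustrative arithmetic, not an API):

```python
# Illustrative only: estimate the maximum space VSS shadow copies may
# consume under the default ~10%-of-volume cap mentioned above.
def max_shadow_storage(volume_bytes: int, cap: float = 0.10) -> int:
    """Upper bound on shadow-copy storage for a volume of this size."""
    return int(volume_bytes * cap)

one_tib = 1024**4  # a 1 TiB cloud drive
print(max_shadow_storage(one_tib) // 1024**3)  # → 102 GiB of hidden churn
```

On Windows, `vssadmin list shadowstorage` shows the actual allocation per volume, and the cap can be lowered if that churn is unwanted.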

 

As for transmission efficiency, we actually use partial reads and writes where applicable. 

Quote

Well, for the crash, we can't really trust the local information about what has or hasn't been uploaded.  So we re-upload everything in the cache.

I would think this is transactional.  Doesn't the cloud service API confirm that something has been committed?  Additionally, why wouldn't it be possible to keep a journal (in the cloud) of "blocks" written to the cloud?  If there's a crash, just verify the journal.
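The journaling idea being proposed can be sketched in a few lines (this is a hypothetical design sketch, NOT how CloudDrive works; the file name and states are made up): record a chunk as "pending" before upload and "committed" only after the provider confirms, then after a crash re-upload only the chunks still pending.

```python
# Hypothetical write-ahead journal for chunk uploads -- a sketch of
# the poster's suggestion, not CloudDrive's implementation.
import json
import os

JOURNAL = "upload_journal.json"  # illustrative file name

def load_journal() -> dict:
    if os.path.exists(JOURNAL):
        with open(JOURNAL) as f:
            return json.load(f)
    return {}

def mark(chunk_id: str, state: str) -> None:
    """Durably record a chunk's state before/after its upload."""
    journal = load_journal()
    journal[chunk_id] = state
    tmp = JOURNAL + ".tmp"
    with open(tmp, "w") as f:
        json.dump(journal, f)
        f.flush()
        os.fsync(f.fileno())   # force to disk before renaming
    os.replace(tmp, JOURNAL)   # atomic replace, so no torn journal

def pending_after_crash() -> list:
    """Chunks recorded as started but never provider-confirmed."""
    return [c for c, s in load_journal().items() if s == "pending"]

mark("chunk-0001", "pending")
mark("chunk-0001", "committed")  # provider confirmed the upload
mark("chunk-0002", "pending")    # crash happens before confirmation
print(pending_after_crash())     # → ['chunk-0002']
```

The catch, as the next reply argues, is that every upload would then cost extra synchronous disk writes, and the journal itself has to survive the same crash it is protecting against.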

Given what you know about Windows, can you give me a series of steps (like the aforementioned disabling of thumbnails) that will minimize my cache re-upload after a crash?  For whatever reason I've experienced many crashes in the last 10 months, and frankly, with my limited bandwidth, I just can't afford to keep doing this.  I really like CloudDrive, and I don't think mine is an odd use case (backing up to it).


Because we couldn't trust it locally, as the data may not have been properly flushed to disk yet.

It's made more complicated by the fact that data has to be flushed to the CloudDrive disk, and then that has to be flushed to the cache drive.  And *then* it gets uploaded.  An upload could also have been "in progress", which means it wouldn't be tracked properly and would need to be re-uploaded anyway.
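A toy model of the pipeline just described (the stage names are my own labels, not CloudDrive's): data passes through several stages on its way to the provider, and after a crash only a provider-confirmed chunk is known-safe, so everything earlier in the pipeline has to be treated as suspect.

```python
# Simplified, assumed model of the flush pipeline described above.
STAGES = ["in_memory", "on_clouddrive_disk", "in_cache",
          "uploading", "confirmed"]

def must_reupload(stage: str) -> bool:
    """After a crash, only a provider-confirmed chunk can be skipped;
    anything mid-pipeline (including an in-progress upload) cannot be
    trusted and goes back in the queue."""
    return stage != "confirmed"

print([s for s in STAGES if must_reupload(s)])
```

This is why the cache gets re-uploaded wholesale: local state can't distinguish "flushed and uploaded" from "flushed but interrupted" without the kind of durable tracking discussed above.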

As for the journaling, that would be a relatively large database, and maintaining it would (likely) carry a significant performance penalty.


OK, this may be the last and final straw.  My cache was completely consistent for 2-3 days, with nothing to upload.  After what I thought was a normal shutdown, and with no "normal" writes to the cloud drive (only reading of metadata by a backup program which, e.g., does not modify the archive bit), the CD service determined that something had caused a provider resynchronization.  I find myself once again uploading close to 50 GB.  As I've stated before, I have limited total monthly bandwidth, so this is a serious issue for me.

In my previous post I asked how I can mitigate the amount of data that needs to be resynchronized/uploaded after a crash or what have you.  If this cannot be improved, then I have no choice but to seek a solution other than CloudDrive.  I will not bad-mouth the product, because I think you and Alex are doing an amazing job in general.  I just find it hard to believe that there aren't more people with my problem (I've definitely seen some on the forums with the resync-on-crash problem) who would at least support the notion that resync should be redesigned.

I still don't understand the mapping between sets of NTFS blocks and CD "blocks" that makes some form of local journaling/tracking impractical.  I mean please, I'm a computer scientist; give us an example of what you're talking about in your 12/10/17 post.

