Posts posted by modplan

  1. Some 5-year-old 8-core AMD. I've never seen CloudDrive use more than 1-2% CPU while actively uploading/downloading. 20 Mbps upload, 300 Mbps download; maxing out either doesn't matter.

     

    Just to note I AM using encryption as well. Just not pushing/pulling quite as many chunks as you are I guess.

  2. but somehow they have to make sure that files are being prioritised equally so no reads are being blocked because prefetching was fully used by another file. There must be a solution for this.

     

     Right. I'm not saying I disagree with you; just based on how the current architecture has been described, I'm not sure it is possible. A solution that might help, if all of your files are written sequentially, is that block ranges are prioritized equally. For example, if you are trying to read chunks 1012-1085 and chunks 8753-9125 at the same time, those would be considered separate "files" and prioritized equally. Seems like a logic headache from a code perspective though, and if your drive has random writes, or a file's chunks get updated outside of the file's main "chunk range", this algorithm would fall apart quickly.
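
     A minimal Python sketch of that grouping idea (purely hypothetical scheduling logic, not anything CloudDrive is confirmed to do; all the names are made up):

         from itertools import groupby

         def group_into_ranges(chunks):
             """Group sorted chunk indices into contiguous ranges, e.g. 1012-1085 and 8753-9125."""
             chunks = sorted(set(chunks))
             ranges = []
             for _, run in groupby(enumerate(chunks), key=lambda p: p[1] - p[0]):
                 run = [c for _, c in run]
                 ranges.append((run[0], run[-1]))
             return ranges

         def round_robin_schedule(pending_chunks):
             """Issue one chunk per range in turn, so one big prefetch can't starve another file's reads."""
             queues = [list(range(lo, hi + 1)) for lo, hi in group_into_ranges(pending_chunks)]
             order = []
             while any(queues):
                 for q in queues:
                     if q:
                         order.append(q.pop(0))
             return order

         # Two concurrent reads get alternating download slots:
         print(round_robin_schedule(list(range(1012, 1086)) + list(range(8753, 9126)))[:6])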

  3. Any way to pause writing (uploading) and then resume it later via script, preferably batch? I would like to pause uploads programmatically while other scheduled tasks are running that need the upload bandwidth, and then resume them later.

  4. There is definitely something going on. I am still getting the [ApiGoogleDrive:39] Google Drive returned error (userRateLimitExceeded): User Rate Limit Exceeded, and also getting "server is being throttled" errors.

     

     It's been 2 weeks since I've seen a post from Christopher in this thread; hoping they are just busy trying to figure out the issue.

     

     Google is limiting you for making too many API calls per second, not CloudDrive, and there is no way for them to get around this. You could raise your chunk size so that you are making fewer API calls per second. Google does not seem to care how much bandwidth you are pushing/pulling, just how many API calls you make. With larger chunks you will obviously make fewer API calls to upload/download the same amount of data.
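
     Rough math on that, assuming roughly one API call per chunk transferred (ignoring retries and partial reads):

         def api_calls_per_second(throughput_mbps, chunk_size_mb):
             """Approximate API call rate for a given line rate and chunk size."""
             throughput_mb_per_s = throughput_mbps / 8  # megabits -> megabytes
             return throughput_mb_per_s / chunk_size_mb

         # 300 Mbps of transfer with 1 MB chunks vs 20 MB chunks:
         print(api_calls_per_second(300, 1))   # ~37.5 calls/sec
         print(api_calls_per_second(300, 20))  # ~1.9 calls/sec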

  5.  

    There does seem to be an issue with the minimum chunk size.  

     

    1. I started a file transfer from stablebit clouddrive to a local disk on my server with 20MB chunks and no minimum chunk size.  Stablebit was using all 15 threads that I assigned and said that the file was downloading at roughly 60Mb/s.  I monitored the actual speed of the file transfer and it was downloading at an average of 6 MB/s which is about right.

     

     2. I then started the exact same test, but this time I used a 20MB minimum chunk size with the 20MB chunks. Stablebit was using all 15 threads and was reporting that it was downloading the file at 400Mb/s. I monitored the actual speed of the file transfer and it was downloading at the same 6 MB/s as before I enabled the minimum chunk size, even though Stablebit thought it was downloading faster.

     

    You can easily test this yourself by doing the same steps that I took above.

     

     Note: I monitored the transfer speed with both Ultracopier and Windows' built-in file copy. The transfers were to a very fast SSD.

     

     

    Seeing the same thing. Turned it off for now until it is fixed.
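
     If the gap comes from each request pulling a full minimum-size chunk even when only a slice of it is actually consumed by the copy (my guess, not confirmed), the over-reporting factor works out roughly like this:

         def reported_vs_useful(minimum_download_mb, useful_mb_per_request):
             """Amplification factor if the UI counts every downloaded byte but only a
             slice of each minimum-size download actually reaches the file copy."""
             return minimum_download_mb / useful_mb_per_request

         # ~2.4 MB of useful data per 20 MB download would make ~48 Mb/s of real
         # throughput (6 MB/s) show up as roughly 400 Mb/s in the UI:
         print(reported_vs_useful(20, 2.4))  # ~8.3x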

  6. Well, I'm in the last 2 hours of uploading my dataset, all has been good so far. 

     

    And bam, Alex's change saved over 900 chunks from being deleted!

     

     [screenshot: 21oygb4.jpg]

     

     I saw this occurring when it was at the ~500 mark and flipped on drive tracing. I'm uploading those logs now. It's still trying to delete chunks, up to 1,300 in the time it took me to write this post, but my chunks are safe; nothing has been deleted from the cloud!

     

    Edit: it looks like it is continuously trying to delete these chunks, retrying and failing over and over and over. Will it ever break out of this (possibly) infinite loop and move on to uploading the final 20GB?

     

    Edit2: I paused uploading, and then resumed it, and same as in the past, this caused CloudDrive to stop trying to delete chunks, and I'm back to uploading now.

     

    Edit3: I'm in the process of verifying every single file via md5 hash with ExactFile, it'll take a while, but I'll post results when I have them.

  7. While I would love this, since CloudDrive seems to be largely built around 1MB+ chunks, I really do not think dedupe would be very effective if this is the level at which it would have to be done. But maybe CloudDrive's architecture allows for some sort of sub-chunk dedupe?

     

     Most dedupe on enterprise arrays is done at the 4K-8K block size level; once you go too far past that, it becomes less and less likely that blocks will match exactly, and dedupe loses its effectiveness.
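
     A toy illustration of why block size matters so much here (plain fixed-block hashing, nothing to do with CloudDrive's chunk format):

         import hashlib, random

         def dedupe_ratio(data: bytes, block_size: int) -> float:
             """Fraction of fixed-size blocks that are exact duplicates of an earlier block."""
             seen, dupes, total = set(), 0, 0
             for off in range(0, len(data), block_size):
                 digest = hashlib.sha256(data[off:off + block_size]).digest()
                 total += 1
                 if digest in seen:
                     dupes += 1
                 else:
                     seen.add(digest)
             return dupes / total if total else 0.0

         # 4 MB of data built from a pool of 16 distinct 4K blocks in random order:
         random.seed(0)
         pool = [random.randbytes(4096) for _ in range(16)]
         data = b"".join(random.choice(pool) for _ in range(1024))

         print(dedupe_ratio(data, 4 * 1024))      # ~0.98 - almost every 4K block repeats
         print(dedupe_ratio(data, 1024 * 1024))   # 0.0   - 1 MB blocks almost never match exactly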

  8. When you are writing data to the drive, the cache will expand as needed. If you write more than 50GB to the drive faster than you can upload, the cache will continually expand as much as possible until the drive the cache is on is almost full. As data is uploaded, it will be deleted from the cache.
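
     Rough arithmetic on how far that can balloon, assuming steady write and upload rates and the 50GB cache mentioned above:

         def cache_size_gb(write_rate_mbps, upload_rate_mbps, hours, base_cache_gb=50):
             """Approximate on-disk cache size after writing faster than you can upload."""
             backlog_gb = (write_rate_mbps - upload_rate_mbps) / 8 * 3600 * hours / 1024
             return base_cache_gb + max(backlog_gb, 0)

         # Writing at 300 Mbps while uploading at 20 Mbps for 2 hours:
         print(cache_size_gb(300, 20, 2))  # ~296 GB needed (or until the cache drive is almost full)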

  9. I am planning on using ExactFile to do the comparison. It will create an MD5 hash for every file in a directory, which can then be run against and compared to the files in a different directory (a rough script equivalent of the same idea is sketched at the end of this post).

     

    I see this in the latest changelog:

    .536
    * Fixed crash on service start.
    * Never allow NULL chunks to be uploaded to providers that are encrypting their data and that do not support partial writes.

     

    I'll upgrade to .536, create a new drive, and get to testing again!
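
     For anyone without ExactFile, a rough Python equivalent of that workflow (not ExactFile itself, just the same idea; the example paths are placeholders):

         import hashlib
         from pathlib import Path

         def md5_tree(root):
             """Map each file's path (relative to root) to its MD5 hash."""
             hashes = {}
             for path in Path(root).rglob("*"):
                 if path.is_file():
                     h = hashlib.md5()
                     with open(path, "rb") as f:
                         for block in iter(lambda: f.read(1024 * 1024), b""):
                             h.update(block)
                     hashes[path.relative_to(root).as_posix()] = h.hexdigest()
             return hashes

         def compare_trees(source_root, copy_root):
             """Report files whose copies are missing or whose hashes don't match."""
             source, copy = md5_tree(source_root), md5_tree(copy_root)
             for rel, digest in source.items():
                 if rel not in copy:
                     print("MISSING:", rel)
                 elif copy[rel] != digest:
                     print("CORRUPT:", rel)

         # compare_trees(r"D:\archive", r"X:\archive")  # local source vs the copy on the CloudDrive volume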

  10. That said, checksums of the actual files are really important here, to (again) verify that the data has actually been affected, and in part how. I don't mean the upload verification here (though that's never a bad idea to have enabled), but a checksum of the affected data on the drive itself.

     

     

     Additionally, chunks can be deleted if the data in them is zeroed out, as there is no reason to store a chunk with absolutely no data. It's more efficient to send a delete request for the chunk than it is to upload a chunk of all zeroes.

     

     

    That said, for testing this, it may be a good idea to set the upload threads to "0" while you're not at the computer, so it effectively pauses the upload. This way, you can keep an eye on it, and make sure that you're able to get the logging for the issue when it actually happens. 

     

    And Alex and I really, really want to resolve this as soon as possible, not only because this is an integrity issue, but because we don't feel comfortable about releasing a "Stable" version until we've either fixed this issue or confirmed that what you're seeing is expected behavior at this point. 

     

    To address each of those in order.

     

     1) Checksum verification is indeed on, and the chunks pass. In "Technical Details", when I try to open one of the corrupt files, I see that it is indeed trying to download the deleted chunks, but the progress sits at 0% and is then discarded (rather quickly) from the details window. I think we discussed this a page or two back and you indicated Alex said this is normal for a deleted chunk. It appears, to me, that CloudDrive EXPECTS these chunks not to exist; it knows it deleted them.

     

     2) I really think, based on what we have discussed and the only ways chunks can get deleted, that the service or driver thinks these chunks are being zeroed when they really aren't (my understanding of that zero-detection decision is sketched at the end of this post). I'm not sure what could cause this. EDIT: I very much doubt it, but wanted to make sure: if a file is in use/inaccessible, could that cause some type of collision and trigger this?

     

     3) Turning off upload threads would certainly help me keep a better eye on this to collect logs; however, this is a home server that I only connect to via RDP. I could monitor it better while I was home in the evenings, but it would likely take weeks of turning upload threads on and off to get to the point where this is reproduced and I catch it. Also, I believe my 2nd log upload caught it in the act (by luck); was not much useful information gained from that? Has additional logging been put in place since then to improve supportability?

     

    4) Me too and I'm at your disposal to help! I really want to get this archive data safely into the cloud ASAP :)
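
     For context, here is my understanding of the zero-chunk rule Christopher described, as a sketch (not CloudDrive's actual code; the provider object and its methods are placeholders): a chunk that is entirely zeroes gets a delete request instead of an upload, so the bug would amount to this check somehow seeing zeroes where there aren't any.

         def flush_chunk(chunk_id: int, data: bytes, provider) -> None:
             """Upload a dirty chunk, or delete it from the provider if it is all zeroes."""
             if data.count(0) == len(data):       # the entire chunk is zero bytes
                 provider.delete_chunk(chunk_id)  # cheaper than uploading a zero-filled chunk
             else:
                 provider.upload_chunk(chunk_id, data)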

  11. Do you have any sync services set up on your cloud drive, such as cloudhq.net, that could be deleting these chunks? I have 40TB+ uploaded to Google Drive using Stablebit and haven't had any chunks randomly deleted like this.

     

     Nope, nothing. But the scary part is, unless you log in to your Google Drive account in your browser and scan the entire history of the "Stablebit CloudDrive" folder looking for deletes that should not have taken place (like the ones in my screenshot above), I have no idea how you would know this is going on; it is completely silent. All the files appear to still be there on your drive, but some random chunks are missing, so some of them are silently corrupt.

     

    Edit: And you can see in my screenshot the chunks were deleted by "StableBit C..." not some other application.

  12. You're sure that the application didn't just make new files on the drive to replace old ones?

     

     Not doing any kind of application work. I'm just copying files over from one drive to CloudDrive, and randomly during the copy process (usually several hundred GBs in) CloudDrive just starts deleting chunks from the cloud on its own. Any files that had data stored in any of the deleted chunks become instantly corrupt, but still show up in Explorer as normal.

  13. Yeah, it would have wrapped, if it's been doing stuff afterwards. 

     

    And assuming 20MB chunk sizes, that's 14GBs of data, basically. 

     

    And since I'm assuming that you're not deleting contents from the CloudDrive disk... :(

     

    Not deleting anything, and the FS still thinks the files are there, their backing chunks in the cloud are just gone :(

  14. Unfortunately it happened again :( And it was overnight (about 20 hours ago), so I am assuming the disk traces have wrapped?

     

     Also, it is worth noting that both times I was copying FROM the Google Drive Sync Folder TO the CloudDrive. I doubt this matters, but it's another data point. Again, the range deleted was a range that was copied over and uploaded a day or so before. I'm going to run ExactFile on what I've copied so far to compare the MD5s and see exactly what has been corrupted.

     

     

     [screenshot: dnhij5.png]
