Posts posted by modplan

  1. FYI I'm impatient  :D so I am retrying with a new disk on .533 with the same dataset. Even if I do not see the problem again, I wouldn't be 100% confident the problem is solved, but I'm willing to spend a few days uploading to find out.

     

    My question is, should I turn on drive tracing for the entire process? My C: drive is a small SSD; will the logs get massive and use up a lot of space?

  2. Thanks for the response, Chris. Not sure I understand; my question likely wasn't clear, haha. Do we think that in the latest builds this issue is: Not Resolved? Possibly Resolved? Definitely Not Resolved?

     

    If Alex hasn't been able to reproduce....maybe we have no clue and none of the above?

     

    EDIT: Whoops, never mind. I think you're saying that Alex was on the road to reproducing this issue via testing, hit some bugs that could have caused it, fixed some of those bugs, but had to restart the test and hasn't been able to repro it since? So we still aren't quite sure where we stand on it? Anything I can do to test/help?

  3. He's still working on it, sorry. 

     

    However, he has found a few low-level bugs because of this and some of the integrity testing we've (he's) been doing (and some of this may be related to what you're seeing). 

     

    Thanks, Chris. So I assume then that, while my logs uncovered some other bugs for Alex, the fixes in and up to .518 probably haven't solved this particular issue yet?

  4. Has Alex had any luck root-causing this yet? Anything I can do or any tests I can run? CloudDrive is currently sitting idle for me until this is resolved, since I don't want to dedicate days of uploading again only to have the drive become worthless. 

     

    Willing to dedicate some time doing whatever I can to sort this out if Alex could use anything from me. 

  5. Hi guys,

     

    I just caught up with the changelist, and I noticed that "unencrypted" drives are now actually encrypted with a public static key. Are there plans to publish this key? Has it been published anywhere? I think it's important to have access to it for data recovery purposes.

     

    Thanks!

     

    Expanding on that, do we have a data recovery procedure at all yet? Even if we know the key, that wouldn't help me stitch all those chunks back together and recover my files. 

     

    I really like what Arq Backup does. They have an open-source program on GitHub that does nothing but download and decrypt chunks, allowing you to recover your files. Handy if the developer suddenly disappears, the app stops working years down the road once it's no longer supported, or any number of other scenarios. 

     

    The data format is even fully documented. https://sreitshamer.github.io/arq_restore/

     

    Hopefully Alex has something similar on the roadmap (I think I may have seen it mentioned that he does). 
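
    Just to illustrate the kind of tool I mean, here is a rough sketch in Python. Every detail below (the chunk size, the naming, how decryption is done) is an assumption on my part, not CloudDrive's actual on-provider format; only Alex could document that:

        # Purely illustrative: assumes chunks are fixed-size pieces of the raw volume,
        # named with a trailing sequential ID, and that a decrypt() for the published
        # static key is available. The real chunk layout and crypto may differ.
        import os

        def rebuild_raw_image(chunk_dir, out_path, decrypt=lambda b: b):
            """Stitch downloaded (and decrypted) chunks back into one raw disk image."""
            names = sorted(os.listdir(chunk_dir), key=lambda n: int(n.split("-")[-1]))
            with open(out_path, "wb") as image:
                for name in names:
                    with open(os.path.join(chunk_dir, name), "rb") as f:
                        image.write(decrypt(f.read()))
            # The resulting image could then be attached as a VHD/loopback volume and
            # the NTFS filesystem read with normal tools.

    Something along those lines, plus a documented format, is essentially what arq_restore provides, and it would go a long way for peace of mind.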

  6. Chris,

     

    Saw Alex's update on the issue analysis. I can't add an update there, so please let him know there might have been a power outage several days to a week before the deletes started happening. I know we had a power outage during a storm last week. What I can't remember is whether or not I had created this drive and started uploading yet when the outage happened.

     

    If the power outage did happen after the drive was created, it would have been after all data was copied to the drive and the drive was marked RO. I remember monitoring the copy operation, and the outage was either before that (and before the drive was created) or afterwards during the upload cycle. Sorry that I can't pinpoint it exactly. 

     

    Not sure if that matters, but again, just trying to provide more info. 

  7. Just for further info. The drive is now completely uploaded. "To Upload" = 0B

    I was hoping that at the end of the upload cycle CD might re-upload the chunks it previously deleted, but that does not appear to be the case.

     

    To Recap:

    - ~775GB was copied to the cloud drive and we started uploading

    - This specific cloud drive was then marked as read only with diskpart and has been that way the entire time since

    - Some previously-uploaded chunks were deleted by clouddrive on 2 separate occasions during the upload cycle

    - Each time I noticed deletions happening, upload was paused for a while; on resume, normal uploads started and there were no further deletions

    - We are now fully uploaded (several days later) and "To Upload" = 0B

     

    All files still show up in Windows Explorer (pinned metadata, I assume), but any file that had a chunk in the deleted range is now, of course, corrupt. 

    No significant errors were thrown by the CD GUI during this time.

     

    I'm hopeful Alex can repro this; please let me know if I can provide more info.

  8. Modplan:

     

    As for the deletion, I just wanted to make sure you knew the cases in which we do delete files from the provider, and the *only* cases in which it should. 

     

    As for the memory test, this is to make sure that the "WriteMap" and other in-memory objects aren't getting corrupted by bad memory. And since NTFS uses memory extensively for caching, it's a good idea in general. 

    As for the diskpart stuff, this was for the CloudDrive disk, correct? 

    If so, well, I don't think this would cause that issue, but I've let Alex know, just in case. 

     

     

    And I'm sorry to hear that it's happened again, but I'm glad to hear you were able to enable logging when this happened. That should definitely help identify why this was happening. And I've flagged the logs for Alex already. 

    And hopefully, this is an easy-to-find issue.  

     

    Hey Chris, yes, I just wanted to be as thorough as possible in my response to aid in getting to the bottom of this. 

     

    Yes, the read-only flag was set on the CloudDrive drive itself, via diskpart. I agree, from my limited understanding, that this shouldn't cause an issue, but I just wanted to provide all the info I could think of. 

     

    Hopefully the logs tell the full story. 

  9. Possibly relevant

     

    AFTER all data had been copied to this drive and was in the To Upload "cache", the drive was marked as read-only with diskpart:

    att vol set readonly
    

    I do this semi-often with physical drives that are meant for archive purposes ONLY, in order to prevent any kind of corruption, accidental deletion, etc. They are only marked as RW when data needs to be written, then back to RO they go.

     

    This is the first time I have done this on a CloudDrive drive. I do not think setting this NTFS flag should have any interaction with CloudDrive or cause any issues, but I wanted to point this out, since it is the only anomaly I can currently find that is different than my previous testing and uploads.

  10. Thanks, Chris. Just to point some things out one by one:

     

    To let you know, there are a few circumstances in which chunks can be deleted from a provider:  

    • Destroying a drive
    • The chunk contains only 0's (provider-specific; Google Drive is one of them).
    • (Google Drive only) When the MIME type error is generated, the chunk is deleted and re-uploaded

    These are the only times that it should ever delete a chunk on the pool. Period. 

     

    • The drive was not destroyed; we were just uploading along. 
    • As for only zeroes, I guess that could be a possibility, but only if CloudDrive was actively overwriting these previously uploaded chunks with all 0's, and I don't see why it would do so.
    • No MIME errors were generated in the GUI (I'm very familiar with this error; I believe I was the first to report it), and the chunks were never re-uploaded. They're gone.

     

     

    However, as for what you're describing about the "0%" parts, I've talked with Alex already, and this is normal, at least for chunks with no data. 

     

    The chunks had data; they had been uploaded over 24 hours prior. Then I saw tons of writes at 0%, and then Google Drive reported the chunks as "permanently deleted". I verified that the exact chunk IDs I saw at 0% were the exact ones shown as deleted in the Google Drive GUI. Unless you are commenting on the prefetch threads stuck at 0%? If so, those chunks not only have no data now, they are completely gone from Google Drive.

     

     

     

    That said, this is very very odd, and you're the only one to report something like this. And Alex has been doing extensive testing for at least the last few days.

     

    I agree; I must have uploaded 10-15TB to various drives on almost every beta version while testing this product over the past 3-4 months. This is the first time I've seen this. And to see it happen on its own, in the middle of an upload cycle, with no action or intervention on my part, is scary.

     

     

    Also, could you check the Event Viewer on the system? Check the "Windows Logs", and the "System" section. Check for any disk, NTFS or controller related errors. Or just export the entire log (unfiltered) and upload it using the same link you used for the logs. That way, we can take a look at it.

     

     

    Also, I'd highly recommend running a memory test (extended), if you haven't already done so recently.  

     

    I don't see anything out of the ordinary in 'System' or 'Application' around the time this started. A memtest is likely worth doing, but I don't think a bad DIMM would cause CloudDrive to suddenly go on a deletion spree. As we can see in the Google Drive screenshot above, Google lists the application that initiated the action, in this case a permanent delete, and we can see "Stablebit Clo..."

     

     

     

    And it may be worth turning on Upload Verification, in this case. That downloads and checks the chunks after upload. That may help prevent issues from occurring. 

     

    Upload verification is on and has been on since this drive's creation. In fact, after the 0% writes that deleted the chunks finished, read-verify threads were spawned, which were also stuck at 0%, effectively verifying the delete. It really seems to me that CloudDrive thought it should delete these chunks, and even verified doing so, but it caused corruption.

     

    -------------------

     

    I'm not terribly concerned about this specific data; it was a backup. All I will lose if I have to blow this drive away and start over is a few days of upload time. However, I want to provide as much info as possible to hopefully root cause this and prevent it from ever happening again, to me or anyone else. Right now I have files that show up in Windows Explorer and look normal (due to pinned metadata, I assume) but that are completely corrupt. If I had not been watching "Technical Details" when this occurred and started investigating, I would assume that all my data is perfectly safe in the cloud right now. It isn't.

     

    -------------------

     

    I resumed the upload cycle late last night, several hours after my post. Since I've pretty much written off this drive and all of its data, I figured I would let it resume uploading and see if it somehow magically re-uploaded all of the chunks it deleted. If it does, we still have a problem, since for about 24 hours those chunks will have existed neither in the cloud nor in the cache, so the files are corrupt, but I would be glad if all of my data did eventually become whole.

     

    So far, no more deletions have occurred; all night we were chugging along uploading. Currently we are uploading in the chunk ID range of 57,XXX, a far cry from the ~28,800 to ~31,300 range that was deleted. But maybe we will circle back after all new data has been uploaded and re-upload these chunks? We'll see, in about 20 hours.

  11. Google Drive

    .486

    win2k8r2

     

     

    I've paused the upload threads and it has stopped, but about 2,500 chunks were just "permanently deleted".

     

    I copied about 775GB onto a brand new 10TB drive. Over 440GB of that was successfully uploaded, and it was chugging along like normal. Nothing changed. Watching Technical Details, I noticed that suddenly the upload threads were popping up at 0% and never progressing; then new ones would pop up at 0%, and this continued over and over. No errors were thrown by the GUI, and there was nothing abnormal in the logs.

     

    I logged in to the Google Drive web GUI and I see tons of this:

    [Screenshot: the Google Drive activity log showing the chunks being permanently deleted, with "StableBit Clo..." listed as the initiating application]

     

    Via trial and error, I found some video files that had chunks in this range (~28,800 to ~31,300). When I try to play these files, tons of prefetch threads spawn in this range and stay at 0%, and the "Prefetched" count jumps up to 500MB instantly. The file never plays in VLC. If I try to open an image file that has chunks in this range, Windows image viewer tells me it is corrupt. 

     

    Any ideas how this could possibly happen? Pretty disappointed right now, this is some serious corruption.

     

     

    EDIT: I've uploaded what logs I have, but I do not see anything abnormal in them. I did NOT have drive tracing enabled and am obviously scared to enable it and resume uploads for fear of more chunks being deleted from the cloud!

  12. Well, Alex posted a reply. There isn't a "fix" yet, but it's on his mind. 

     

    That said, it may be that the specific chunk in question was being repeatedly updated, causing it to be uploaded again and again. 

     

    Hey Chris,

     

    I've seen this issue as well, normally with lower chunks (often chunk #55 or #64 in my case). It appears to me that CloudDrive logs the updates to that chunk and uploads each single update individually, instead of coalescing all of the updates together and uploading a single chunk. Note that I see this after setting "upload threads" to 0 and then setting them back to the normal value after my copy/change/write is done, so the chunk is definitely not being continuously updated during the upload. I would assume all updates to the chunk would then be coalesced into a single upload, instead of the chunk being uploaded over and over, once for each previous update. Just wanted to throw in my $.02: I see this when updating the archive attribute on files with backup software. That attribute exists for each file in the MFT, which is stored on these blocks that get continuously uploaded (64, 55, whatever).
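
    To make the behavior I'd expect concrete, here is a minimal sketch of coalescing in Python. This is purely pseudocode for the idea; I have no insight into how the actual driver tracks writes, so the names here are mine:

        # Writes only mark a chunk ID dirty; the uploader drains the dirty set,
        # so a chunk touched 100 times while upload threads are at 0 still gets
        # uploaded exactly once when they are turned back on.
        import threading

        class ChunkUploader:
            def __init__(self, read_chunk, upload_chunk):
                self._dirty = set()                # chunk IDs with un-uploaded changes
                self._lock = threading.Lock()
                self._read_chunk = read_chunk      # chunk_id -> current bytes
                self._upload_chunk = upload_chunk  # (chunk_id, bytes) -> None

            def on_write(self, chunk_id):
                with self._lock:
                    self._dirty.add(chunk_id)      # repeated writes collapse into one entry

            def drain(self):
                with self._lock:
                    pending, self._dirty = self._dirty, set()
                for chunk_id in sorted(pending):
                    self._upload_chunk(chunk_id, self._read_chunk(chunk_id))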

     

    Maybe that helps shed some light? My upload is fast, so it is easy to ignore for me, but it does waste time.

     

     

    EDIT: Take a CloudDrive drive with many files and change some NTFS attributes on some/all of the files with upload threads set to 0. Turn the upload threads back on, and I bet you will see a single chunk, or several chunks, upload over and over again. A good way to repro this.

     

    EDIT2: I work in SAN dev for an enterprise SAN vendor. I have no insight into how Alex's Windows driver works, but feel free to PM me for more info; I have a faint hunch as to what is going on here from a SCSI perspective, and I'm more than willing to help root cause this and get to the bottom of it (along with any other issues I can help with). But I doubt you guys need my help if you can repro it  :D

  13. Thanks so much for the info.

     

    For curiosity's sake,

    Since I can only have the drive attached to 1 CloudDrive installation at a time,

    can I share this drive via Windows sharing or SMB so other computers have access to it, without any issues? I assume yes, but I want to make sure :).

    Yes

  14. I just got a checksum mismatch on one of my disks. chkdsk shows no errors. What is my course of action here? A few files I checked seem fine; is there some silent corruption going on somewhere? I got this error when changing the label of the drive... however, I think I may have seen it before on this drive a while back. Upload verification has been turned on and off multiple times on this drive as we have gone through all the betas, while I have been tweaking my upload/download threads to maximize performance and to see if I could tolerate upload verification being turned on (it is currently ON).

     

    Google Drive

    .470

    win2k8r2

  15.  

    Thanks thnz, looks like you are right. 

     

    Which leads to a broader conversation: is this the right way to handle drives with large caches? Could a "chunk tracking database" that marks locally cached chunks as fully uploaded be used to prevent the wholesale re-upload of the cache, so that only the chunks not marked as previously successfully uploaded get re-uploaded? If someone with somewhat limited bandwidth sets a 500GB cache on a large drive and suffers a power outage, but 498GB of that cache was previously perfectly uploaded, this wholesale re-upload would take weeks.

     

    Edit: If a "chunk tracking database" is not viable, maybe download the chunks and compare the checksums to the local chunks? Most people have MUCH faster download, than upload. So only the needed chunks would be re-uploaded.

  16. What OS are you using?

    What version of StableBit CloudDrive are you using?

     

    What is the size of the drive?

    What is the size of the cache (you've indicated that it's 100GB, but I'm not entirely sure about this)? 

     

     

    To clarify, did you restore any files or the system from the backup, and then continue backing up the system? 

    If so, that *may* have been the cause of the issue here, specifically.

     

    Otherwise, could you let me know exactly what happened?

     

    Sorry, info I know I should have provided. 

     

    2k8 R2

    .470 (but I have seen this several times while using CloudDrive, basically any time there is an unexpected reboot; the version does not matter)

    Drive is 10TB 

    Cache is currently 100GB, I've been playing with different sizes.

     

    No files have been restored. In fact, this seems to have nothing to do with running a backup; in my experience, any local cache data is moved to "To Upload" after "Drive Recovery" following an unexpected reboot. This is just the only drive I currently have a cache on. I have seen this on other drives with standard files, back when I had a cache on them.

     

     

    Basically:

    - Create Drive with a cache

    - Add data to the drive, ensure that the cache has some data in it

    - Ensure "To Upload" = 0B

    - Pull the power plug/force BSOD

    - On Boot, drive will go through lengthy "Drive Recovery" procedure

    - After "Drive Recovery", ALL local data from the cache is added to "To Upload" and re-upload begins, ALL this data already exists in the cloud

    - If you have a large cache, this takes forever

     

    Hope that makes sense?

  17. Hi,

     

    Are there any plans to support EMC Atmos storage?

     

    I could certainly be wrong, but I think this is more of a consumer-focused product. Does Atmos have an S3-compatible API like NetApp StorageGRID does? I could see CloudDrive maybe implementing a generic S3 API driver, like some other cloud-facing apps have done.
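
    For what it's worth, the appeal of an S3-compatible target is that one driver covers many backends. A quick illustration with a standard S3 client; the endpoint, bucket, and credentials below are placeholders, and whether a given Atmos or StorageGRID deployment exposes such a gateway depends on how it's set up:

        # Point a standard S3 client at a non-AWS, S3-compatible endpoint.
        import boto3

        s3 = boto3.client(
            "s3",
            endpoint_url="https://s3.example-private-cloud.local",  # hypothetical gateway
            aws_access_key_id="ACCESS_KEY",
            aws_secret_access_key="SECRET_KEY",
        )

        # The same calls work against AWS S3 or any compatible store.
        s3.put_object(Bucket="clouddrive-chunks", Key="chunk-00001", Body=b"...")
        data = s3.get_object(Bucket="clouddrive-chunks", Key="chunk-00001")["Body"].read()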

  18. Sorry if this has been covered; a quick search did not find what I was looking for. 

     

    I have a CloudDrive that I send windows server backups to nightly. The full backup size is about 75 GB, but the nightly incremental change is only 6-7GB, easily uploadable in my backup window. 

     

    I have set the cache on this drive to 100GB to ensure the majority of the "full backup" data is stored in the cache, so that when Windows Server Backup is comparing blocks to determine the incremental changes, CloudDrive does not have to (slowly) download ~75GB of data for comparison every single night. 

     

    This works very well.

     

    The problem comes when there is a power outage, crash, BSOD, etc. Even though the CloudDrive is fully current and "To Upload" is 0, when I bring the server back up, after we go through drive recovery (which takes 5-8 hours for this 100GB), CloudDrive then moves ALL 100GB of the cache into "To Upload" and starts re-uploading all of that data.

     

    Why? I can't think of how this is necessary. In case a little data was written to the cache at the last minute before the unexpected reboot? If so, there is certainly a better way of handling this than a 100GB re-upload, some sort of new-unuploaded-blocks tag/database? What if a drive has a massive cache? A re-upload could take days or weeks!

     

    Thanks for any insight. I've gone through this process a couple of times using CloudDrive, and it has been painful every time. I'd be happy even if we downloaded every single block that is cached, compared it to the local cached block, and then only uploaded the ones that have changed.
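
    To sketch what that "download, compare, re-upload only what changed" pass could look like (Python pseudocode; download_chunk, upload_chunk, and the chunk IDs are placeholders, not real CloudDrive APIs):

        import hashlib

        def resync_after_crash(cached_chunks, download_chunk, upload_chunk):
            """cached_chunks maps chunk ID -> bytes held in the local cache."""
            for chunk_id, local_data in cached_chunks.items():
                remote = download_chunk(chunk_id)  # downloads are usually far faster than uploads
                if remote is None or hashlib.sha256(remote).digest() != hashlib.sha256(local_data).digest():
                    upload_chunk(chunk_id, local_data)  # only missing/changed chunks go back up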
