Google Drive Downtime and RAW again...


A couple of weeks back we had a downtime, and afterwards all my drives were marked as RAW.

Last night it happened again. One of my 2 cloud drives is marked as RAW... I have had upload verification on since the last incident, and somehow it happened again...

25 TB of data is gone. Did anyone see a similar thing last night? It is getting annoying to rebuild the entire cloud drive every few weeks because of downtimes...
I was told upload verification should solve the problem; in fact I just wasted 25 TB of traffic and load on my SSD...

I would like to hear whether anyone else is affected.

Man, I'm starting to get stressed. I haven't been able to detach my drives for over a week, and I'm not getting any response. I have 52 TB in the cloud at the moment, and I don't know how I'm going to get that data to the right machine since I can't detach.

I heard that Google Drive has been having some outages for the past couple of days (a shorter period than I've had issues), and maybe that could explain your situation.

They had cloud outages.

  • 2 weeks later...

Sorry for the absence.

To clarify some things here:

During the Google Drive outages, it looks like their failover is returning stale (out-of-date) data. Since that data has a valid checksum, it is not corrupted in itself. But it is no longer current, and accepting it corrupts the file system.

This is unusual for a number of reasons, and we've only seen this issue with Google Drive. Other providers have handled failover and outages in a way that hasn't caused this issue. So this is unexpected, provider-side behavior.

That said, we have a new beta version out that should handle this situation better and prevent the corruption from happening in the future.


* Improved chunk validation.
    - For indexed providers, if a valid but unexpected block (that was cryptographically signed by us) comes back from a 
      storage provider, do not accept the data and issue a warning stating that the data was corrupted at the storage provider.
    - When this happens, the block will be retried 3 times and if the problem continues your cloud drive will be force unmounted.
    - This prevents accepting data that is stale or out of order.
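
As a rough illustration of the retry behavior in the changelog entry above, here is a minimal Python sketch. It is not CloudDrive's actual code. The `fetch` callable, the use of SHA-256, the `StaleChunkError` name, and reading "retried 3 times" as one initial attempt plus three retries are all assumptions made for the example.

```python
import hashlib

MAX_RETRIES = 3  # from the changelog: the block is retried 3 times


class StaleChunkError(Exception):
    """Raised when the provider keeps returning an unexpected chunk."""


def read_chunk(fetch, chunk_id, expected_digest, max_retries=MAX_RETRIES):
    """Fetch a chunk, rejecting data whose digest is not the expected one.

    `fetch` is a hypothetical callable (chunk_id -> bytes) standing in for
    the download from the storage provider.
    """
    attempts = 1 + max_retries  # assumption: initial attempt plus retries
    for attempt in range(attempts):
        data = fetch(chunk_id)
        if hashlib.sha256(data).hexdigest() == expected_digest:
            return data
        print(f"warning: chunk {chunk_id} was corrupted at the storage "
              f"provider (attempt {attempt + 1}/{attempts})")
    # In this sketch the caller would react by force-unmounting the drive.
    raise StaleChunkError(chunk_id)
```

A caller would catch `StaleChunkError` and unmount rather than hand the unexpected data to the file system.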


This does not fix existing disks, but should prevent this from happening in the future. 

Unfortunately, we're not 100% certain that this is the exact cause, and it won't be possible to test until another outage occurs.

7 hours ago, Christopher (Drashna) said:

This does not fix existing disks, but should prevent this from happening in the future. 

Unfortunately, we're not 100% certain that this is the exact cause, and it won't be possible to test until another outage occurs.

Is the issue that, if Google has rolled back data, the file system chunks are out of sync between what is stored locally and what is stored in the cloud?

Is this solution meant to force a repull of all the file system data from the cloud on a remount?

Is it possible to store all file system data locally, so that if there is a data mismatch due to an outage or provider rollback, the local, most up-to-date version gets priority?

For something like media files or other data that rarely or never changes, a rollback seems inconsequential, so I assume the issues are primarily with file system data.

Just some questions from someone who doesn't know how everything works behind the scenes, thank you!

Honestly, we're not entirely sure what the issue is.  Since it's impossible to reproduce for us, it's a lot of guesswork. 

However, the fact that people have not seen checksum errors when this happens indicates that the data itself isn't corrupted.   So, the most likely reason is that old data is being returned.  This would pass the checksums with flying colors, but the data would be out of date.  If that was file system data, that would lead to a corrupt file system.    

Which ... sounds like exactly what is going on.
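
To make that reasoning concrete, here is a small Python sketch of why a per-chunk integrity check cannot detect staleness on its own. Everything in it is hypothetical: the key, the chunk layout (a 32-byte HMAC-SHA256 tag followed by the payload), and the function names are assumptions for illustration, not the real on-disk format.

```python
import hashlib
import hmac

KEY = b"drive-secret"  # hypothetical per-drive signing key


def seal(payload: bytes) -> bytes:
    """Prefix the payload with a 32-byte HMAC-SHA256 tag."""
    return hmac.new(KEY, payload, hashlib.sha256).digest() + payload


def verify(chunk: bytes) -> bool:
    """Check only that the chunk is internally consistent."""
    tag, payload = chunk[:32], chunk[32:]
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


old_version = seal(b"file table v1")
new_version = seal(b"file table v2")

# Both versions verify, so a provider serving old_version in place of
# new_version would not trigger a checksum error.
stale_passes = verify(old_version)
```

The check answers "was this chunk written by us and left intact?", not "is this the latest version of the chunk?", which is the gap a rollback would slip through.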


So the changed code actually stores the checksum locally (and uploads it to the cloud provider when detaching the drive), so that it can compare what is pulled from the cloud against what we know should be the latest version of that chunk. This prevents stale data from being accepted, and unmounts the drive if it happens too often.
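
The idea of keeping the expected checksums locally can be sketched roughly as follows in Python. This is only an illustration under assumptions: the `ChunkIndex` class, its method names, the JSON export, and the choice of SHA-256 are invented for the example and say nothing about how the product actually stores this data.

```python
import hashlib
import json


class ChunkIndex:
    """Local record of the latest digest written for each chunk."""

    def __init__(self):
        self._latest = {}

    def record_write(self, chunk_id: str, data: bytes) -> None:
        """Remember the digest of the newest data uploaded for a chunk."""
        self._latest[chunk_id] = hashlib.sha256(data).hexdigest()

    def is_current(self, chunk_id: str, data: bytes) -> bool:
        """True only if downloaded data matches the last recorded write."""
        expected = self._latest.get(chunk_id)
        if expected is None:
            return False  # assumption: unknown chunks are not trusted
        return hashlib.sha256(data).hexdigest() == expected

    def export(self) -> str:
        """Serialize the index, e.g. to upload with the drive on detach."""
        return json.dumps(self._latest, sort_keys=True)

    @classmethod
    def load(cls, blob: str) -> "ChunkIndex":
        """Rebuild an index from a previously exported blob."""
        index = cls()
        index._latest = json.loads(blob)
        return index
```

With an index like this, an older copy of a chunk fails `is_current` even though its own checksum is valid, because the comparison is against the last write recorded locally.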
