
Many files in Google Drive, something to be aware of


modplan

Question

Just a PSA post here. I recently passed 1.26 MILLION files in my Google Drive account. Only ~400,000 of those files were from CloudDrive, but I use lots of other apps that write many files to my account, and I thought I would pass this info on.

 

1) The Google Drive for Windows client will crash, even if you only have one folder synced to your desktop. I have only a few docs folders actually synced, but the app insists on downloading an index of every file on your account into memory before writing it to disk. When that index crossed 2.1 GB of RAM (as it did for me at 1.26 million files), the app crashed, because Google Drive for Windows is still, stupidly, a 32-bit application. There is no workaround other than lowering the number of files on your account.

 

2) The Google Drive API documentation warns of API breakdowns once you cross one million files on your account: query sorting can cease to function, for example, and who knows which apps depend on API calls that could start to fail.

 

I've spent the last 10 days running a Python script I wrote to delete unwanted/unneeded files, one by one. Ten days, and I probably have ten days left. I hope to get down to ~600,000 total files by the time I am done.
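For anyone in a similar spot, the cleanup boils down to paging through files().list() and deleting each match. Here is a minimal sketch along those lines using google-api-python-client and Drive API v3; the token.json credentials file and the folder-ID query are placeholders rather than my actual setup, so swap in whatever criteria identify your own unwanted files. Deleting one file per request is slow, which is why it takes days at this scale.

```python
# Minimal sketch: page through matching files and delete them one at a time.
# Assumes google-api-python-client and an OAuth token saved as token.json (placeholder).
from googleapiclient.discovery import build
from google.oauth2.credentials import Credentials

creds = Credentials.from_authorized_user_file(
    "token.json", ["https://www.googleapis.com/auth/drive"])
drive = build("drive", "v3", credentials=creds)

# Placeholder query: everything under one folder ID that isn't already trashed.
query = "'EXAMPLE_FOLDER_ID' in parents and trashed = false"
page_token = None
deleted = 0

while True:
    resp = drive.files().list(q=query,
                              fields="nextPageToken, files(id, name)",
                              pageSize=1000,
                              pageToken=page_token).execute()
    for f in resp.get("files", []):
        drive.files().delete(fileId=f["id"]).execute()  # one file per request
        deleted += 1
    page_token = resp.get("nextPageToken")
    if not page_token:
        break

print(f"Deleted {deleted} files")
```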

 

Hope this helps someone in the future.


11 answers to this question


It's not currently planned, meaning that we need to look into this.

 

One of the reasons that we limit Google Drive to 20 MB chunks is that we were using 1 MB partial reads of the file. Google has issues if you perform more than 20 reads on a file in a short period of time. To prevent the issues we were seeing with the 100 MB chunks, we limited the chunk size.

 

However, since then we've added the option for larger partial read sizes, so we need to re-evaluate this. If we allow larger chunks, we need to increase the partial read sizes as well, and we need to do it in a way that won't adversely affect performance.

 

So it's not as easy as "just enabling it". We need to see how it affects drive performance as well. For people with high bandwidth it may not be an issue at all, but for people with lower bandwidth it could cause serious performance and I/O issues.
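To put rough numbers on that (illustrative arithmetic only, not actual CloudDrive code): with 1 MB partial reads, the number of requests needed to pull down a whole chunk grows with the chunk size, and the roughly-20-reads-per-file threshold is what the 100 MB chunks were tripping over.

```python
# Illustrative arithmetic only -- not CloudDrive code.
# With 1 MB partial reads, downloading a whole chunk takes (chunk size / 1 MB) reads.
# Google has issues past roughly 20 reads on one file in a short window.
PARTIAL_READ_MB = 1
MAX_READS_PER_FILE = 20  # rough threshold described above

for chunk_mb in (20, 100):
    reads = chunk_mb // PARTIAL_READ_MB
    verdict = "fine" if reads <= MAX_READS_PER_FILE else "too many reads"
    print(f"{chunk_mb} MB chunk -> up to {reads} partial reads ({verdict})")
```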



So for larger chunks, just warn about the high bandwidth requirement :-) Not all settings will work for everyone in every application.


I'm curious what will happen if I upload my 100 MB chunk Amazon Cloud StableBit folder to Google Drive and try to connect to it using the Google Drive connection?

 

I imagine I won't be able to? Or bad things will happen?

 

You won't be able to, for the most part. The provider is identified, among other things.

 

So while you *may* be able to get it to work, it may cause issues, at best. 


Do we know for certain that this will adversely affect CloudDrive? (I too am planning on blowing through the million-file limit easily, even with 20 MB chunks.) Reading through the API documentation (v1-v3), the only issue seems to be the note below:

 

orderBy

 

A comma-separated list of sort keys. Valid keys are 'createdDate', 'folder', 'lastViewedByMeDate', 'modifiedByMeDate', 'modifiedDate', 'quotaBytesUsed', 'recency', 'sharedWithMeDate', 'starred', and 'title'. Each key sorts ascending by default, but may be reversed with the 'desc' modifier. Example usage: ?orderBy=folder,modifiedDate desc,title. Please note that there is a current limitation for users with approximately one million files in which the requested sort order is ignored. (string)
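For reference, that note applies to calls like the sketch below (Drive API v2 via google-api-python-client, which is where those sort keys live; token.json is just a placeholder credentials file). On accounts around the one-million-file mark, the requested sort order may simply be ignored, so results come back unsorted rather than failing outright.

```python
# Sketch only: a Drive API v2 files.list call with orderBy, via google-api-python-client.
# Per the documentation note above, accounts with roughly one million or more files
# may have the requested sort order silently ignored.
from googleapiclient.discovery import build
from google.oauth2.credentials import Credentials

# token.json is a placeholder for your own OAuth credentials file.
creds = Credentials.from_authorized_user_file(
    "token.json", ["https://www.googleapis.com/auth/drive.readonly"])
drive = build("drive", "v2", credentials=creds)

resp = drive.files().list(orderBy="modifiedDate desc", maxResults=100).execute()
for item in resp.get("items", []):
    print(item["modifiedDate"], item["title"])
```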

