Reindexing Google Internal Server Errors


30 replies to this topic

#1 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 04 July 2017 - 08:15 AM

So my service crashed last night. I opened a ticket and sent you the logs to take a look at, so we can set that aside. But while it was reindexing it got one of the Internal Server Error responses from Google Drive. Just one. Then it started reindexing the entire drive again starting at chunk 4,300,000 or so. Does it really have to do that? This wouldn't have been a big deal when this drive was small...but this process takes about 8 hours now every time it has to reindex the drive, and it happened at around the halfway mark. Four hours of lost time is frustrating. Does it HAVE to start over? Can it not just retry at the point where it got the error? Am I missing something?

 

Just wanted to see what the thought was on this. 



#2 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 11 July 2017 - 12:47 AM

I believe I did respond to this already.

 

As for re-indexing.... the only time this should happen is if the "chunk ID" database is lost.   If something happened to that file, then it would trigger this to occur. 


Christopher Courtney

aka "Drashna"

Microsoft MVP for Windows Home Server 2009-2012

Lead Moderator for We Got Served

Moderator for Home Server Show

 

This is my server

 

Lots of "Other" data on your pool? Read about what it is here.


#3 teshiburu

teshiburu

    Newbie

  • Members
  • Pip
  • 1 posts

Posted 12 July 2017 - 04:36 PM

@srcrist - this keeps happening to me today, did you manage to resolve it?

 

It's done this 2-3 times for me today!!



#4 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 13 July 2017 - 07:31 PM

Waiting, generally. 

 

Unfortunately, an "Internal Server Error" is exactly what it sounds like. These are "HTTP 500" errors, which means that the issue is occurring entirely on the server side (Google Drive, in this case).

 

So, the only thing our software can do is "wait and retry", which really is the only thing you can do, as well. Unfortunately. 
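A minimal sketch of what "wait and retry" usually looks like for a transient HTTP 500, assuming exponential backoff with jitter. This is not CloudDrive's actual code; `TransientServerError`, `call_with_retry`, and all of the default values are hypothetical names and numbers chosen for illustration.

```python
import random
import time


class TransientServerError(Exception):
    """Hypothetical stand-in for an HTTP 5xx response from the provider."""


def call_with_retry(request, max_attempts=5, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call request() and retry on TransientServerError with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientServerError:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller decide what to do
            # Double the delay on each attempt, cap it, and add jitter so
            # many clients do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

The `sleep` parameter exists only so the waiting can be swapped out in tests.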




#5 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 14 July 2017 - 09:47 PM

 

I believe I did respond to this already.

 

 

Right. We talked in the support ticket.

 

 

 

As for re-indexing.... the only time this should happen is if the "chunk ID" database is lost.   If something happened to that file, then it would trigger this to occur. 

 

In my case it was triggered when the CloudDrive service crashed. But that's fine. I understand that it has to reindex. The very big problem comes when:

 

 

So, the only thing our software can do is "wait and retry", which really is the only thing you can do, as well. Unfortunately. 

 

 

 

The problem is that it does not retry during the indexing process. It starts over. That means you have to go through the entire indexing process without a single error or it simply restarts the process over again. I imagine that this does not *have* to be handled this way, but maybe I'm wrong. I think you suggested, in the support ticket, that this was something for Alex to maybe take a look at.
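To make the request concrete, here is a rough sketch of enumeration that retries only the page that failed and keeps everything already collected, instead of starting over. It assumes a paged listing API where each call returns a batch plus a token for the next batch; `fetch_page`, `enumerate_chunks`, and `TransientServerError` are hypothetical names, not anything taken from CloudDrive.

```python
class TransientServerError(Exception):
    """Hypothetical stand-in for an HTTP 5xx response from the provider."""


def enumerate_chunks(fetch_page, max_retries_per_page=5):
    """Collect every item page by page, retrying only the failed page.

    fetch_page(token) is assumed to return (items, next_token), where
    next_token is None on the last page.
    """
    items = []
    token = None  # None means "start from the first page"
    while True:
        for attempt in range(1, max_retries_per_page + 1):
            try:
                page, next_token = fetch_page(token)
                break  # this page succeeded, move on
            except TransientServerError:
                if attempt == max_retries_per_page:
                    raise  # the same page kept failing, give up
        items.extend(page)
        if next_token is None:
            return items
        token = next_token  # progress so far is kept
```

With this shape, one bad response costs one extra request rather than hours of repeated work.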

 

With very large drives, or drives that simply have a lot of chunks, this is a very real and very frustrating issue, though. When your reindexing takes 8 or 9 hours, you can neither expect to go that entire time without even a minor server error--nor afford to start the process over when it's halfway done. It took 5 DAYS (no exaggeration at all) to get my largest drive remounted when this happened. Meanwhile, my other, smaller drives mounted just fine because they were able to completely reindex within the time in between server errors. Once the drives are mounted, these internal server errors do not cause downtime. CloudDrive simply tries the request again, Google complies (the errors are always temporary), and life moves on.

 

But this problem during the reindexing process has to be fixed. Every hour someone has to go without any sort of server error whatsoever makes the process exponentially less likely to complete. Ever. It shouldn't take 5 days of crossing fingers just to get the drive to remount as it restarts over and over again. There needs to be more error tolerance than that.
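The "exponentially less likely" point can be put in numbers. A small sketch, assuming each request fails independently with the same probability; the error rate and request counts below are made up for illustration and are not measurements from Google Drive.

```python
def chance_of_clean_run(requests, error_rate):
    """Probability that `requests` independent calls all succeed."""
    return (1.0 - error_rate) ** requests


# Illustrative numbers only: one error per 5,000 requests on average.
small_drive = chance_of_clean_run(5_000, 1 / 5_000)   # roughly 0.368
large_drive = chance_of_clean_run(10_000, 1 / 5_000)  # roughly 0.135

# Average number of full attempts needed before one finishes cleanly.
expected_attempts_large = 1.0 / large_drive
```

Under those assumptions, doubling the length of the run squares the chance of finishing cleanly, which matches smaller drives mounting fine while the largest one loops.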

 

 

 

@srcrist - this keeps happening to me today, did you manage to resolve it?

 

 

As Christopher said, I didn't (and couldn't) do anything to resolve it. I did, however, eventually get the drive to remount. It just took a very long and very frustrating amount of time. I just had to wait until I didn't get any errors during the reindexing process.

 

 

 

In any case, this seems like a really critical problem to me right now, and I'm dreading the next time my drive gets uncleanly dismounted. Who knows how long it will take to get it back. There just isn't anything that can be done about the occasional bad server response from Google. So we've got to be able to work through those, rather than completely aborting the process and starting over again.



#6 dorncore

dorncore

    Newbie

  • Members
  • Pip
  • 2 posts

Posted 19 July 2017 - 02:03 PM

Any resolution for this issue? I've had mounting fail three times now because of a single error (no retries). Each attempt takes multiple hours. This is a major design flaw.



#7 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 19 July 2017 - 09:59 PM

Alex is actually reworking some of the code to be more efficient, and this may help.

 

But I'm going to flag/reflag this issue, just in case.

 

https://stablebit.co...eAnalysis/27563




#8 Edrock200

Edrock200

    Advanced Member

  • Members
  • PipPipPip
  • 36 posts

Posted 22 July 2017 - 05:28 PM

Just to add to this, from the other thread, I started manually stopping the service instead of dismounting the drive prior to a PC reboot. Last night, the service shut down cleanly after a few minutes without any intervention. Unfortunately, upon restart it began the chunk ID rescan/rebuild, which took several hours.

 

I've also recently received "Access Denied" when attempting to detach my drive on random occasions. I've been able to get around this by offlining the drive in windows disk manager, then detaching. Interestingly, one time when remounting after doing this, Windows (not cloud drive) mounted the drive as read only. I was able to use the diskpart utility to make it writable again.



#9 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 25 July 2017 - 11:40 AM

Christopher, I don't know if any of the recent beta changes were a part of the more efficient code, but I am stuck (again) in a mounting loop with beta 894. Server went down over 24 hours ago (OVH had some sort of issue) and it's still mounting over and over again due to internal server errors. I'd really rather this not take a week every time it happens. Waiting this long to get access to my data again is honestly rendering CloudDrive an unusable storage option. 



#10 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 25 July 2017 - 12:09 PM

I think that I'm going to drop to < 64TB drives pooled with DrivePool to try to mitigate this issue, but I still need to be able to get this drive mounted to copy the data. 



#11 dorncore

dorncore

    Newbie

  • Members
  • Pip
  • 2 posts

Posted 26 July 2017 - 03:16 AM

My ticket on this issue suggests that they still think the software attempts to retry after failures when indexing. It doesn't. This doesn't bode well for actually getting a fix.

#12 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 26 July 2017 - 04:35 PM

My ticket on this issue suggests that they still think the software attempts to retry after failures when indexing. It doesn't. This doesn't bode well for actually getting a fix.

 

 

I think I agree with you that there seems to be some confusion about what problem we are actually addressing here on the part of the Stablebit team. I tried to clarify in the other thread with some log snippets. Hopefully that will help.



#13 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 27 July 2017 - 05:07 AM

I think that I'm going to drop to < 64TB drives pooled with DrivePool to try to mitigate this issue, but I still need to be able to get this drive mounted to copy the data. 

 

Okay.  Alex is busy working on code, ATM, and trying to get StableBit DrivePool "ready to ship a new version".  So it may be a bit before he can look into this.  But the issue is flagged as important, so it does have higher priority, so he should get to it sooner rather than later. 

 

 

And please do let me know if the smaller size helps.

 

My ticket on this issue suggests that they still think the software attempts to retry after failures when indexing. It doesn't. This doesn't bode well for actually getting a fix.

 

I'm not suggesting, I'm stating:

"CloudFsDisk_MaximumConsecutiveIoFailures".

http://wiki.covecube...vanced_Settings

However, the limit is for a 120 second window. So it may be erroring out too much for your system.

 

You can increase this value, but doing so can cause serious issues, up to and including causing the system to lock up if too many errors are occurring.
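For readers trying to picture a failure limit that applies to a 120 second window, here is one possible reading as a sketch. How CloudDrive actually counts failures for "CloudFsDisk_MaximumConsecutiveIoFailures" is not described in this thread, so the class below, its name, and its reset-on-success behaviour are all assumptions.

```python
import time


class ConsecutiveFailureLimit:
    """Hypothetical limiter: trips after N failures inside a time window."""

    def __init__(self, max_failures, window_seconds=120.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.clock = clock
        self.failure_times = []

    def record_success(self):
        # Assumption: any success ends the run of consecutive failures.
        self.failure_times.clear()

    def record_failure(self):
        """Record a failure; return True once the limit has been reached."""
        now = self.clock()
        self.failure_times.append(now)
        # Forget failures that are older than the window.
        cutoff = now - self.window_seconds
        self.failure_times = [t for t in self.failure_times if t >= cutoff]
        return len(self.failure_times) >= self.max_failures
```

Raising `max_failures` in a design like this tolerates more errors before giving up, at the cost of holding I/O for longer while the errors continue.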

 

 

However, the issue has been flagged for review, and we will look into it, because there does seem to be an issue here, and you're not the only one seeing it. Though, it does seem to be pretty rare.

 

But we'd rather address the issue directly than use stopgap measures (above), because they just cover up the issue rather than fixing it.




#14 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 27 July 2017 - 02:19 PM

Okay.  Alex is busy working on code, ATM, and trying to get StableBit DrivePool "ready to ship a new version".  So it may be a bit before he can look into this.  But the issue is flagged as important, so it does have higher priority, so he should get to it sooner rather than later. 

 

 

And please do let me know if the smaller size helps.

 

 

 

Yeah, no worries. I have the drive mounted again right now. Google has just been exceptionally stable the last few days and I've been able to get it to remount with maybe one or two restarts (so about 24 hours).

 

As far as the lower drive sizes, now that I've realized chkdsk can't be used on anything larger, this is probably best for NTFS drives anyway--assuming I care about the data longer-term. DrivePool is in the process of migrating the data now, but it's going to take weeks.



#15 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 09 September 2017 - 02:44 AM

Well, the latest beta should include better handling for the file/chunk enumeration failures. Eg, it should retry it, and do so gracefully.




#16 srcrist

srcrist

    Advanced Member

  • Members
  • PipPipPip
  • 79 posts

Posted 10 September 2017 - 07:17 AM

That's great, Christopher. Honestly, though, the efficiency changes have also done wonders to simply make sure that it doesn't have to enumerate every time there is an unclean shutdown. So I actually haven't had to work around this issue since those changes several weeks ago. Good news, in any case.



#17 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 10 September 2017 - 08:07 PM

And I know we've made some changes to help make sure that unsafe shutdowns don't happen, unless... they actually happen (e.g., power loss, hard reset, etc.).

 

And I believe that Alex upped the number of files we enumerate at one time (at least on Google Drive), so it should use fewer API calls when indexing.  And that should help reduce the likelihood of issues, as well.
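As a rough illustration of why a larger batch helps, the number of listing calls is just the file count divided by the batch size, rounded up. The chunk count is the approximate figure mentioned at the start of this thread; the two batch sizes are assumptions for the sake of the arithmetic, not the values CloudDrive uses.

```python
def list_calls_needed(total_files, page_size):
    """Number of paged listing calls needed to enumerate total_files."""
    return (total_files + page_size - 1) // page_size  # integer ceiling


chunks = 4_300_000  # approximate chunk count from the first post

calls_small_pages = list_calls_needed(chunks, 100)    # 43000 calls
calls_large_pages = list_calls_needed(chunks, 1_000)  # 4300 calls
```

Fewer calls means fewer chances to hit a server error during a single indexing pass.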

 

 

But I'm glad to hear that all the work has definitely created tangible results! And positive ones, too. 




#18 triadcool

triadcool

    Advanced Member

  • Members
  • PipPipPip
  • 87 posts

Posted 18 September 2017 - 05:42 PM

Christopher,

 

I am still having issues with CloudDrive re-indexing each of my 6 CloudDrives every time I restart my server and this process still takes over a day to complete for all of the drives.  Nothing is happening to the Chunk Database as I can browse to Google Drive and view the file for each drive in the data-ChunkIdStorage folder.  Does the file have to exist somewhere on the computer running CloudDrive for it to not trigger another scan at startup?

 

I am currently running on version .929 but the same thing has happened on every version I have tried since the chunk database was implemented. 



#19 Christopher (Drashna)

Christopher (Drashna)

    Customer and Technical Support

  • Administrators
  • 8,208 posts
  • LocationSan Diego, CA, USA

Posted 18 September 2017 - 08:24 PM

When mounting the drive, it should download the database from the provider. 

This is then stored in "C:\ProgramData\StableBit CloudDrive\Service\Db\ChunkIds", and used there.

 

And if it regenerates the database, it should store it there, as well. 

 

 

Worst case here, stop the service, delete the contents of the folder and restart the service.  This ... will cause it to reindex the drives again, but hopefully, this may be the last time. 
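For the "delete the contents of the folder" step, a cautious sketch that empties a folder but leaves the folder itself in place, with a dry-run mode that only lists what would be removed. The function name `clear_folder` is hypothetical; stopping and restarting the service is not handled here and must be done separately, before and after.

```python
import shutil
from pathlib import Path


def clear_folder(folder, dry_run=True):
    """Remove everything inside `folder`; return the names that were found.

    With dry_run=True nothing is deleted, the names are only reported.
    """
    folder = Path(folder)
    found = []
    for entry in sorted(folder.iterdir()):
        found.append(entry.name)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a subfolder and its contents
        else:
            entry.unlink()        # remove a file or a symlink
    return found


# Example call, using the path given in the post above (service stopped first):
# clear_folder(r"C:\ProgramData\StableBit CloudDrive\Service\Db\ChunkIds")
```

Running it once with the default `dry_run=True` shows what would be deleted before anything is touched.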

 

If not, then run the Troubleshooter:

http://wiki.covecube..._Troubleshooter

 

It may also be a good idea to wait until it's done indexing, enable file system logging, reboot the system, and run the Troubleshooter.

And then, wait until it's done indexing, enable boot logging, reboot, and run the Troubleshooter.




#20 triadcool

triadcool

    Advanced Member

  • Members
  • PipPipPip
  • 87 posts

Posted 18 September 2017 - 11:53 PM

It does the reindexing no matter what computer I am testing StableBit on, so there is definitely something going on here.  CloudDrive will not use the indexing database that is already uploaded to Google.  Surely the others in this thread are still having this issue?





