
I/O deadlock?


dragon2611

Question

Is it possible to get into a situation where I/O seems to deadlock? I have a Server 2012 R2 machine, and I was copying 300 GB or so of data to OneDrive for Business, including lots of small files.

It looks like the copying has stopped, but I also cannot bring up Task Manager, and a shutdown -r -f -t 30 command is hanging. I'm hoping that it's CloudDrive and not one of the disks in the box failing.



  • 0

It's great that you guys have (hopefully!) reproduced it. I see a note in the changelog mentioning deadlock resolution code. Is that for this issue? If so, is it worth testing yet?

Yes, that's what it's referring to.

 

It may help, but it may not. 

I've experienced even more issues with it, so I'm not sure it's "fixed" yet. Unfortunately. 


  • 0

It has BSOD'd twice since .352. The first time I didn't catch the message, as the machine had restarted itself, and I didn't keep the dump. It happened during DrivePool duplication (so, heavy I/O). The second time was a CLOCK_WATCHDOG_TIMEOUT. I'm not sure under what circumstances it happened (I/O etc.), as I had just come back to the machine and it was sitting on a BSOD (I had disabled auto-restart on crash after the first one so I could see what it was). I'm compressing the second dump at the moment and will upload it shortly.


  • 0

Not sure if it's related, but I had an I/O lockup, and after rebooting, OneDrive wanted to re-auth.

Does the authentication expire every so often or after a certain number of requests?

It should be handled by OAuth, IIRC. But no, it shouldn't be re-requesting.

 

 

However, that said, we do think we've finally tackled the Deadlock issue. At least, we're not able to reproduce it anymore (or any of the weird BSODs).  

I've tested it on the VMs that I was able to reproduce it reliably on, and it hasn't occurred.

 

So, if you're interested, it's the 1.0.0.370 build.

32-Bit: http://dl.covecube.com/CloudDriveWindows/beta/download/StableBit.CloudDrive_1.0.0.370_x86_BETA.exe

64-Bit: http://dl.covecube.com/CloudDriveWindows/beta/download/StableBit.CloudDrive_1.0.0.370_x64_BETA.exe


  • 0

That's great news. I'll grab that build later and put it through its paces - will do my best to crash it!

 

Just need to pass the Amazon review process then, and I'll be able to ditch CrashPlan for DrivePoolDuplication+CloudDrive.

Well, I sincerely hope that you don't encounter any issues (but if you do, that they're reproducible).

 

 

And yeah, hopefully we get approval from Amazon soon. Being stuck in "limbo" without an answer really sucks.


  • 0

It's been duplicating all afternoon without any crashes, so it's looking promising.

 

Are there any plans for further integration with DrivePool? For instance, to force duplicates onto the cloud disk? At the moment it needs a fair bit of micromanagement, individually assigning rules to folders as to which disk they should be on. When there are more than two disks in the pool, it's a bit fiddly forcing them onto the cloud disk.


  • 0

Fantastic! (I'm not sure you realize how exciting it is for us to hear that! Seriously!)

 

 

As for the integration and forcing files, we do have file placement rules. These were added to StableBit DrivePool a while ago, in prep for StableBit CloudDrive.

http://stablebit.com/Support/DrivePool/2.X/Manual?Section=File%20Placement

However, "grouping" has been suggested recently, so that we can specify groups of disks to contain a duplicate (so you could select the CloudDrives as one group, and the local disks as another group).  This has been discussed, and we DEFINITELY do want to add this feature in the near future.  

However, the deadlock issue gets priority, for obvious reasons!


  • 0

Still running smoothly after 24 hours. It's looking like this issue could well be resolved.

 

It's thrown a few red 'error downloading from Amazon Cloud Drive' errors (throttling, I guess), although the message mentions possible data stability issues if it keeps occurring. How will DrivePool respond to this kind of thing? If the internet connection goes down, will DrivePool handle read errors gracefully? As it's all duplicated data, I would hope so. I'm assuming reads would be done with the local drive copy, and writes would be cached for later upload.


  • 0

Disk grouping sounds ideal.

 

Previously I found the I/O deadlock seemed to kick in after maybe 5 minutes or so, so having gone several hours without issue is certainly promising. Fingers crossed I don't wake up to a BSOD tomorrow morning! Will keep you guys updated.

Yeah, it does. Though, considering that the balancing engine is pretty complicated already... it may take a while to implement it.

 

I'll try to upgrade within the next couple of days. I had hoped to do it tonight, but I ended up troubleshooting an issue with my internet connection instead.  :(

Well, I'm sorry to hear about the internet connectivity issues. :(

 

Still running smoothly after 24 hours. It's looking like this issue could well be resolved.

 

It's thrown a few red 'error downloading from Amazon Cloud Drive' errors (throttling, I guess), although the message mentions possible data stability issues if it keeps occurring. How will DrivePool respond to this kind of thing? If the internet connection goes down, will DrivePool handle read errors gracefully? As it's all duplicated data, I would hope so. I'm assuming reads would be done with the local drive copy, and writes would be cached for later upload.

And I'm really glad to hear that!

 

And yeah, Amazon is throttling the connection pretty badly, but that shouldn't be an issue after we get approved for production use... whenever that happens....

And any data that fails to upload is retried.  And specifically, for Amazon Cloud Drive, we use "Upload verification", which redownloads the data after it's been uploaded to make sure that the upload process occurred properly. And once that's verified, then we may prune that data from the cache.
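For anyone who wants to picture the retry, verify, and prune flow described above, here is a minimal Python sketch. It is not CloudDrive's actual code: the FlakyProvider class, the dict used as a local cache, the SHA-256 comparison, and the max_attempts limit are all assumptions made purely for illustration.

```python
import hashlib


class FlakyProvider:
    """Hypothetical in-memory stand-in for a cloud provider.

    It damages the first `corrupt_first` uploads so the verification
    step has something to catch.
    """

    def __init__(self, corrupt_first=1):
        self.store = {}
        self.corrupt_remaining = corrupt_first

    def upload(self, key, data):
        if self.corrupt_remaining > 0:
            self.corrupt_remaining -= 1
            # Simulate a damaged upload by flipping the bits of the last byte.
            data = data[:-1] + bytes([data[-1] ^ 0xFF])
        self.store[key] = data

    def download(self, key):
        return self.store[key]


def upload_with_verification(provider, cache, key, max_attempts=3):
    """Upload cache[key], re-download it, and prune the cache only once verified.

    Returns the number of attempts used, or None if every attempt failed
    verification (in which case the cached copy is kept for a later retry).
    """
    data = cache[key]
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_attempts + 1):
        provider.upload(key, data)
        # Upload verification: read the data back and compare digests.
        actual = hashlib.sha256(provider.download(key)).hexdigest()
        if actual == expected:
            del cache[key]  # safe to prune the local copy now
            return attempt
    return None


cache = {"chunk-0001": b"example chunk data"}
provider = FlakyProvider(corrupt_first=1)
attempts = upload_with_verification(provider, cache, "chunk-0001")
print(attempts, "chunk-0001" in cache)
```

The point of the ordering is that the local copy is only deleted after the read-back matches, so a bad upload never leaves the cloud copy as the sole copy of the data.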


  • 0

Well, I'm sorry to hear about the internet connectivity issues. :(

 

Thankfully it was solved pretty quickly after calling the ISP. (Something caused my PPP connection to drop and then somehow get stuck in a state where it had dropped at my end, but the network thought I still had an active PPP session and wouldn't allow me to start a new one.)

 

Worst case, I've got a second connection with another ISP. It's just a fair bit slower, and by the time I was done dealing with that I needed to go to bed to get up early for work.

Anyway I'll install the new build now.


  • 0

Well, I'm glad to hear that they were able to resolve it fairly quickly!

 

And hopefully, no issues with the new build.


  • 0

So far so good, although Windows wanted me to chkdsk the drive, probably from where the previous build had deadlocked.

If you rebooted it or the system marked it as "dirty", then yeah, it would want to run a chkdsk pass on it. And that would fix any issues with the data on it, actually. :)

 

Running .371 here. I concur with the other posters: the deadlock issue seems to be resolved.

Glad to hear it!
