Covecube Inc.

I/O deadlock?


dragon2611

Question

Is it possible to get into a situation where the I/O seems to deadlock? I have a 2012 R2 machine on which I was copying 300GB or so of data to OneDrive for Business, including lots of small files.

It looks like the copying has stopped, but I also cannot bring up Task Manager, and a shutdown -r -f -t 30 command is hanging. Hoping that it's CloudDrive and not one of the disks in the box failing.


  • 0

Was this a clean installation or an upgrade?

And if it was an upgrade, to what version and from which version?

 

Okay. Could you uninstall the current version, and make sure that "C:\Program Files\StableBit\CloudDrive" is empty?

If it's not, manually stop its services in "services.msc", use Task Manager to end the notification app, delete the directory, reboot, and then reinstall.

 

Sorry, Avast thought some of CloudDrive's files were win32:Evo-Gen and decided to quarantine them. Because it's a headless machine, I didn't change the defaults (i.e. act automatically), although it should have popped up an alert upon quarantining anything.

I didn't see that it had done so until I got a notification earlier today that a threat had been blocked, and logged into their management portal.

 

Edit:

Flagged that file as a false positive, temporarily whitelisted it, and reinstalled (with Avast disabled during the install).

 

I can't seem to open the settings to enable encryption on OneDrive. I tick the box (which doesn't change) and the settings button lights up, but clicking it does nothing.


  • 0

That would do it .....

 

 

And we recommend adding "C:\Program Files\StableBit" and "C:\Program Files (x86)\StableBit" (if using Scanner) to the excluded paths in Avast, and other AV solutions, to prevent this sort of issue.

 

 

I'm not that keen on doing that unless I really have to, because if one were to start excluding directories left, right and centre for various apps, it would make it even more trivial to get a virus onto a machine just by putting it into one of those directories.

 

That said if it's a targeted attack Antivirus is pretty useless anyway...

 

Edit:

Any idea about the encryption settings not opening?


  • 0

Very true. The problem here is that Avast frequently flags one or more of our files as malicious (based on a file hash match, and not on any actual scan of the file, which is beyond lazy). In fact, most of the time when this happens, it's un-obfuscated files (i.e. ones that can be read very easily).

 

 

 

As for the encryption issue, uninstall StableBit CloudDrive completely. Make sure there are no files in "C:\Program Files\StableBit\CloudDrive" (if there are, delete them). Then run the MS Fixit program. Reboot, and then reinstall.

The issue is most likely that it has "mismatched" file versions.
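If it helps to check what is left behind after the uninstall, here is a rough Python sketch that lists any remaining files under the install folder along with their modified times. It only reads the folder quoted above; the function name "leftover_files" is made up for illustration and is not part of any StableBit tool, and it deletes nothing.

```python
from datetime import datetime
from pathlib import Path

# Install folder as quoted in the post above.
INSTALL_DIR = Path(r"C:\Program Files\StableBit\CloudDrive")


def leftover_files(root: Path):
    """Return (path, modified time) for every file still under root, oldest first.

    Hypothetical helper: an empty list means the folder is missing or empty.
    """
    if not root.exists():
        return []
    found = [
        (p, datetime.fromtimestamp(p.stat().st_mtime))
        for p in root.rglob("*")
        if p.is_file()
    ]
    return sorted(found, key=lambda item: item[1])


if __name__ == "__main__":
    remaining = leftover_files(INSTALL_DIR)
    if not remaining:
        print("Folder is missing or empty.")
    for path, modified in remaining:
        print(f"{modified:%Y-%m-%d %H:%M}  {path}")
```

Files whose modified times differ widely from the rest may be worth a closer look before reinstalling.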


  • 0
Okay. Could you uninstall the current version, and make sure that "C:\Program Files\StableBit\CloudDrive" is empty?

If it's not, manually stop its services in "services.msc", use Task Manager to end the notification app, delete the directory, reboot, and then reinstall.

 

Done. FWIW none of the file modified times changed after doing this, so I assume it had previously updated successfully. Will try my best to crash it again!

 

*edit* It just locked up again, once more while DrivePool was duplicating to the CloudDrive. I've done a manual dump again and can upload it if it'll help.


  • 0

 

 

Just had the 'deadlock' lockup again. Happened while drivepool was duplicating files to the clouddrive. Will upload a dump shortly.

 

*edit* Uploaded here. Drive is doing recovery right now.

Okay, flagged the dump for Alex. 

 

In the meanwhile, could you do this, as well:

Okay, when you open the UI, what version does it report (at the bottom of the window)? And what version does it report in the Control Panel where you go to uninstall programs?

 

And could you run "fltmc" from a command prompt and post its output?


  • 0

 

It's reporting version 1.0.0.330 BETA in the UI and 1.0.330 in control panel's 'programs and features'.

C:\WINDOWS\system32>fltmc

Filter Name                     Num Instances    Altitude    Frame
------------------------------  -------------  ------------  -----
eamonm                                  8       328700         0
luafv                                   1       135000         0
npsvctrig                               1        46000         0
FileInfo                                8        45000         0
Wof                                     0        40700         0
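For anyone comparing filter lists across machines, here is a rough Python sketch that turns pasted "fltmc" output like the above into records sorted by altitude. The parsing rules are an assumption based only on the column layout shown in this post; rows in any other layout (for example ones with non-numeric columns) are simply skipped. The names "FilterEntry" and "parse_fltmc" are made up for illustration.

```python
from typing import List, NamedTuple


class FilterEntry(NamedTuple):
    name: str
    instances: int
    altitude: float
    frame: int


# Output copied from the post above.
SAMPLE = r"""
C:\WINDOWS\system32>fltmc

Filter Name                     Num Instances    Altitude    Frame
------------------------------  -------------  ------------  -----
eamonm                                  8       328700         0
luafv                                   1       135000         0
npsvctrig                               1        46000         0
FileInfo                                8        45000         0
Wof                                     0        40700         0
"""


def parse_fltmc(text: str) -> List[FilterEntry]:
    """Parse pasted fltmc output; return entries sorted by altitude, highest first."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        # Assumption: a data row is exactly four whitespace-separated columns.
        # The prompt, blank lines, and the six-word header row fail this check.
        if len(parts) != 4:
            continue
        name, instances, altitude, frame = parts
        try:
            entries.append(
                FilterEntry(name, int(instances), float(altitude), int(frame))
            )
        except ValueError:
            # The dashed separator row (and any non-numeric row) lands here.
            continue
    return sorted(entries, key=lambda e: e.altitude, reverse=True)


if __name__ == "__main__":
    for entry in parse_fltmc(SAMPLE):
        print(f"{entry.altitude:>10.0f}  {entry.name} ({entry.instances} instances)")
```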

Shall I upload today's dump too? Also, would logfiles help at all?


  • 0

Okay, I've pinged Alex directly. 

 

As for the re-attaching issue, the latest build should fix that. (1.0.0.332)

Having some licensing issues with the latest release. 30 Day trial not only didn't restart, the upgrade also wiped out the existing time on the trial. See attached screenshot.

HzP7wAr.png


  • 0

That build fixes the encryption window not popping up but, like Vrytired, I'm seeing the trial at 0 days instead of 30.

 

 

There was a change to the licensing code in build 331, which may have caused this.

I've flagged the issue for Alex, and let him know that we need to issue you guys a trial extension (which will be emailed to you).

https://stablebit.com/Admin/IssueAnalysis/18770


  • 0

Should be fixed in the 1.0.0.333 build.


  • 0

After updating to .332, and then .333, I'm getting red 'is not authorized with Amazon Cloud Drive' messages. Clicking 'reauthorize' takes me to the cloudbit site with no access key to copy. 'An+unknown+scope+was+requested' appears in the URL.

Same. It would appear that Amazon has not yet whitelisted this version.


  • 0

Unfortunately, we are aware of the issue.

See below.

 

Version has nothing to do with it fortunately/unfortunately.

The issue is the API/app identification code used by our product. 

 

Specifically, Amazon's documentation is pretty bad about the proper procedure for development and production use.

First, you have to get authorized for development use. This is rather easy, and we can do this for new "apps" if needed. However, a development authorization will keep running into the current issue as the app sees more and more use.

But Amazon's documentation about production use is a single line or two, poorly stated. We missed this (sorry!) until somebody mentioned it (so we're clearly not the only ones having this issue with Amazon...). Additionally, the portal for managing this is having issues on Amazon's end as well, which makes this even more of a pain to get resolved. We've emailed Amazon to get this resolved as soon as possible, as we want this working, too.

 

What does this mean for you? That Amazon Cloud Drive is and will remain horribly throttled until we're properly authorized for production use. When that happens, you should see ... well, even better speeds than you were getting with Amazon (as the development account is apparently throttled).

 

I've not had an IO related lockup yet when uploading to Onedrive business.

 

Quite a few I/O errors being logged during the upload but I'm guessing that's just them being crap 

I'm glad to hear that OneDrive for Business is working well.

 

And yes, there will be errors and throttling, because Microsoft is basically using SharePoint as the backend here... and if you're familiar with SharePoint... well, I'm sorry to hear that.


  • 0

FWIW .336 has fixed the red auth errors.

 

Thanks for explaining the situation with Amazon Cloud Drive. Hopefully Amazon don't take too long in reviewing your application. If anyone else is curious, there is a bit more info here. Is there a possibility that once this is sorted, upload verification will no longer be needed?

 

Back on topic, I had another lockup this morning. There wasn't any specific read/write activity happening on the disk at the time that I was aware of, although it had been throwing those red errors all night.


  • 0

Glad to hear it.

 

And yeah, hopefully Amazon enables the production "settings" for us soon.

As for the upload verification, I don't know.  It's enabled by default because of issues we have seen from it. It's entirely possible that these issues are related to the development status.  We will definitely investigate that, and change the settings as appropriate. 

 

 

 

I've already let Alex know about your continued issues. It may be something unrelated to the cache issue, but not sure.

 

Either way, I would highly recommend running a memory test, in this case. (Just to be sure, as this entire process is very memory and disk intensive).

http://wiki.covecube.com/StableBit_DrivePool_Q537229


  • 0

Windows Memory test thing reports no errors. I've just run memtest86 too just to be certain, and that hasn't found anything wrong either. I've had the lockup on two separate machines, so I'd be surprised if it was a hardware issue.

 

I've only seen it occur during cloud disk I/O. It's possible something was accessing the disk when it crashed this morning too, though as it was throwing constant red auth errors it could have been something else. It went several weeks without occurring when the disk was just sitting there attached but idle. The last 3 times I've tried adding the cloud disk to DrivePool, it's locked up within 5-10 mins of duplication beginning.

 

*edit* I just reproduced it on another machine. Made a new drivepool+cloud disk on it. Copied ~20GB of files to the drivepool disk, then added the clouddisk to the pool. Locked up at about 80% duplication. Was still able to move open windows around just like the lock up on the other machine, but wasn't able to do much else.


  • 0

I've been fiddling around this afternoon and I can now semi-reliably reproduce it. If it doesn't lock up the first time, then just keep doing it until it does. I don't think using DrivePool is relevant, but it's what I've used, and it does induce plenty of I/O.

 

1. Created a new single disk drive pool.

2. Copied ~20GB worth of files across (10K files - mix of large and small)

3. Created a cloud drive. I've had it crashing on Amazon Cloud Drives, File Shares, and Local Disks. I'm not sure if the provider is relevant, but I've used a 1GB cache / 100GB drive for my tests. Enabled full drive encryption (again, not sure if relevant).

4. Added the cloud drive to the drive pool

5. Enabled duplication on everything.

6. Clicked the 'increase priority' button - not sure if this is relevant, but it was clicked for the last 3 crashes.

 

Then cross your fingers and hope it locks up. If not, remove the cloud drive from the pool, and re-add it to restart the duplication process. This has caused a lock up the past 3/5 times I've tried. Give it a go and see if you guys can reproduce it.
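For step 2, a throwaway data set can be generated rather than copied from real files. The following is only a sketch under assumptions: the file sizes, the small-to-large ratio, and the folder layout are arbitrary choices rather than what was used in the tests above, and "make_test_files" is a made-up name. The demo at the bottom writes a tiny set to a temp folder; scale the numbers up and point it at the pool drive to approximate ~20GB across 10K files.

```python
import os
import tempfile
from pathlib import Path


def write_random_file(path: Path, size: int, chunk: int = 1 << 20) -> None:
    """Write `size` bytes of random data in chunks so large files don't sit in RAM."""
    with open(path, "wb") as fh:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            fh.write(os.urandom(n))
            remaining -= n


def make_test_files(target: Path, count: int, small_bytes: int,
                    large_bytes: int, large_every: int = 10) -> int:
    """Create `count` files under `target`; every `large_every`-th one is large.

    Returns the total number of bytes written. All sizes are assumptions.
    """
    target.mkdir(parents=True, exist_ok=True)
    total = 0
    for i in range(count):
        size = large_bytes if i % large_every == 0 else small_bytes
        # Spread files over 16 subfolders so no single directory gets huge.
        folder = target / f"dir{i % 16:02d}"
        folder.mkdir(exist_ok=True)
        write_random_file(folder / f"file{i:05d}.bin", size)
        total += size
    return total


if __name__ == "__main__":
    # Scaled-down demo: 100 files in a temporary folder.
    demo_dir = Path(tempfile.mkdtemp(prefix="lockup-repro-"))
    written = make_test_files(demo_dir, count=100,
                              small_bytes=4 * 1024, large_bytes=1024 * 1024)
    print(f"Wrote {written} bytes under {demo_dir}")
```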

 

*edit* Full drive encryption isn't relevant, nor is local cache size. It still locked up with a local disk/no cache/no encryption on the second attempt at duplicating. Task Manager (I couldn't open it in Windows 8.1 once it's locked up, but can in 10) shows the cloud drive at 100% active time, 0b/s read, 0b/s write once it's locked up.

Edited by thnz

  • 0

Well, I was able to trigger probably the same thing on a VM, so it's definitely not isolated (eg, not just you), so the issue definitely isn't fixed.

 

Though, having reproduced it definitely makes it easier to identify and track down.
