Drivepool freezing Explorer, UI not loading


vnangia

Question

Hello. Apologies if this has been asked before, but I can't find it using search either here or on the FAQ page.

 

I had DrivePool 2.1.1.561_x64 running a 10-disk pool on Windows Server 2012 R2 Essentials. For the past 48 hours, the pool has simply not been responding, as far as I can tell. Explorer shows no information in "My Computer" about the pool size or available space. Attempts to access the pool are met with an immediate Explorer freeze. The DrivePool UI is not loading at all. Disk Management freezes while connecting to the Virtual Disk Service. Neither the processor nor the RAM is under load, nor are any of the processes marked as hung. It's also not possible to stop or restart the service. A quick check through the logs shows nothing untoward, except "Cannot calculate balance ratio, pool is not measured. (Pool mode: PoolModeNormal)".

 

Does anyone have any suggestions on how to fix this? I have tried:

-Restarting the computer, several times.

-Uninstalling and reinstalling DrivePool.

-Starting in Safe Mode.

 

I'm a bit at my wits' end here.

 

Sorry, edited to add: I previously had the SSD Optimizer plugin installed, but no longer as there's no longer a caching drive. The pool hasn't changed at all in setup since the initial configuration, except for the removal of the caching drive. No Windows updates in the last 3-4 weeks.


14 answers to this question

  • 0

You posted a ticket about this already, and I've responded to that.

 

However, is this Explorer that is freezing specifically, or the entire system?

 

Either way, could you do this:

http://wiki.covecube.com/StableBit_DrivePool_Q2159701

And do you have any antivirus loaded on the system?

If you do have antivirus loaded, try disabling/uninstalling it temporarily, and see if that helps.

 

If it's Explorer that is crashing/freezing, could you do this:

http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection

Reboot the system, do that, and reproduce the issue.

 

If it's the entire system freezing up, do this:

http://wiki.covecube.com/StableBit_DrivePool_System_Freeze

 

 

 

Worst case, try using this build, and see if it helps:

http://dl.covecube.com/DrivePoolWindows/beta/download/StableBit.DrivePool_2.2.0.610_x64_BETA.exe


  • 0

I'm curious as to the final culprit for this problem and how the problem was diagnosed.   I had this issue happen last night and it turned out that a drive that I knew was going bad finally became problematic enough to make the entire pool (and ultimately the system) unresponsive as described here.   I was lucky in that I knew to check that drive first and remove it, but I wonder what I would have done if I had not known that....


  • 0

Hi Christopher, yes, same ticket - thanks for putting it in here.

 

rtech73, the TL;DR is there was a curious entry in the system log that read "Reset to device, \Device\RaidPort0, was issued." Googling that got this article, and when I went to check, sure enough that's what happened.

 

More detail: I'm mainly a *nix guy, and have used OS X as my primary OS since uhh ... 10.1? Maybe the DP? A while at any rate. The reason I have a Windows Server is to support the Windows-based HTPC, which in turn exists only because I have to have a cablecard. So a number of things which I guess are normal in the Windows world never occurred to me. The obvious symptom was that the pool was freezing up, so I posted here and then saw that I was eligible to submit a ticket, so I did. Some rounds of troubleshooting later, Christopher mentioned checking the system log (which I forgot Windows had, honestly - "Event Viewer" doesn't immediately translate to "log" to an OS X / Unix guy). And then the rest fell into place - I know that the power went out the day the errors began, but the system is on a UPS, so naturally it switched over and there were no FS errors. What I didn't realize is that it also apparently switched power plans; again, to a Unix person this is weird behavior because a) why should there be power plans; and b) why would a server of all things have power plans - it's a desktop server, not a laptop. The standard power plan has the PCI Express Link State Power Management option set to "off"; the "balanced" plan has it set to "moderate". And when I switched it, it suddenly started functioning again, so I asked Christopher to close the ticket.
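
For anyone who wants to catch this kind of silent plan switch, here is a minimal Python sketch that asks Windows for the active power plan. It assumes the English-language output format of `powercfg /getactivescheme` (shown in the comment). The function names are made up for illustration, and this is not anything DrivePool itself does.

```python
import re
import subprocess

# Assumed output shape of `powercfg /getactivescheme` on English-language Windows:
#   Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
_SCHEME_RE = re.compile(r"GUID:\s*([0-9a-fA-F-]{36})\s*\((.+)\)")


def parse_active_scheme(text):
    """Return (guid, name) parsed from powercfg output; raise if unrecognized."""
    match = _SCHEME_RE.search(text)
    if match is None:
        raise ValueError("unrecognized powercfg output: %r" % text)
    return match.group(1).lower(), match.group(2).strip()


def get_active_scheme():
    """Ask Windows for the active power plan (only works on Windows)."""
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return parse_active_scheme(result.stdout)


def plan_has_changed(expected_name, current_name):
    """True when the active plan is no longer the one you expect."""
    return expected_name.strip().lower() != current_name.strip().lower()
```

Calling `get_active_scheme()` before and after a power event, and comparing the names with `plan_has_changed`, would show whether something flipped the plan.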

 

Honestly, this might have been the straw that broke the camel's back. I think I'm going to use VMWare or Parallels to virtualize a Win7 instance for the HTPC on a Mac Mini and retire the Windows Server, though WAF needs to be checked :)


  • 0

The "electric" failure of a disk is a very hard thing to diagnose. Other than keen observation/deduction, I'm not sure there is a way to "detect" that... aside from trial and error (aka, pulling out one drive at a time .... which I've had to do before :( ).

 

 

 

Well, first, anyone is "eligible" to submit a ticket. We don't care if you have a license or not. We just want to help make sure everything is working well! 

 

As for the cableCard .... I'm sorry to hear that. I used to do the PVR stuff, so I definitely understand. :(

However, IIRC, the HDHomeRun Prime is a network-based cableCard tuner, and I ... think... it uses UPnP/DLNA for the live TV streams. May be worth checking into. It could get you off of Windows entirely.

 

In fact, have you checked out MediaBrowser (now "Emby")? It supports a number of PVR solutions, as well as an extensive download library. :) 

It may be just what you are looking for, and may let you get away from Windows entirely, if you so desire.

 

 

As for troubleshooting, unless the application has specific logs (and we do), the Event Viewer on Windows is an incredibly helpful diagnostic tool! It is definitely the "go to" thing for figuring out issues. And there is a LOT you can do with it. :)

 

 

As for the power plans, I do believe I said this already, but ... it really shouldn't have switched. It's very odd that it did, because each power plan has both "on battery" and "plugged in" options.

As for the different plans... well, if you look into them, they have different timeouts... and most importantly, check the "Processor power management" section. That actually makes a big difference, as this option can (and does) throttle the CPU based on the system status (plugged in or not), and can extend battery life and reduce power consumption.


  • 0

Actually - I've generally found catastrophic disk failure the easiest thing to diagnose - you can usually hear it clicking or making a weird noise. What keeps me up at night is silent corruption / bitrot. That's in large part why I'm thinking about using ZFS - I'd use btrfs, but the community around btrfs is a bit terrifying, while OpenZFS on OS X is quite supportive and welcoming. In fact, I actually have an HP EX490 that's been Hackintoshed running ZFS right now, so it wasn't the end of the world when the server went down. Hooray rsync/BTSync.

 

We used to use a HD HomeRun Prime for several years, until everything ended up being on Sunday night and the wife yelled when she couldn't record all her shows. So we now have a Ceton Infinitv 6 ETH, but decrypting HDCP-encrypted broadcasts is still impossible on any OS other than Windows. We also made a commitment to basically cut our stuff quite a bit and go digital several years ago, and have done so. I've encoded almost every disc I have and it's served by Plex, a cousin of Emby, so we're quite happy with the setup right now - that's the vast majority of stuff on the Drivepool right now.

 

And yeah, I know what you said about the power plans not switching, and yet, I can consistently reproduce the problem by yanking the UPS out of the socket. It switches almost instantly to the Balanced plan from High Power, but never back when I plug it back in. Maybe it's the UPS software? I think I should just remove the Balanced plan... Anyway.


  • 0

What sort of bit rot are you referring to?

The physical medium degrading (which StableBit Scanner detects, specifically)?

Or the random bit flip "caused by cosmic rays"? I say this in quotation marks, because the actual probability of this happening .... well, you'll win the lotto first, most likely.

 

Also, modern drives do a LOT of error detection and correction. And they do this invisibly and silently, so you never see it happening. The above may be common in older drives, but anything remotely modern (eg, using SATA, and even a bit before that) is not very prone to these issues.

 

I'm definitely familiar with Plex. I was using it recently, until trying MediaBrowser/Emby again. Emby works a lot better for my usage. But both are good programs!

As for the TV tuner stuff, yup. :(

It really sucks when everything is on at the same time. 

 

Do you have the UPS manufacturer's software installed? If so, then that could definitely be the cause. It may even be intentional, based on its settings. Worth checking out, and digging into.

Personally, I just use the default Windows management for batteries. It works really well, for the most part.


  • 0

Referring to all of the above, actually. The hardware longevity is definitely part of it, and I hadn't realized that Scanner mitigated against that. I wouldn't mind spending the money if I were staying on Windows, but ... not convinced yet. ZFS also does a couple of other things that I really like (incremental replication in particular - makes my off-site backup so much quicker) and it's a bit more efficient on the space side. With duplication enabled on everything (yes I know, I *could* just do it for a couple of essential folders, but I'm lazy and don't want to copy everything back if something fails), DP makes my current 10x2TB stack of disks a 10TB pool, while a RAIDZ3 would give me a 14TB pool and a RAIDZ2 would give me a 16TB pool. Eventually I think ReFS will get there, but having tried testing ReFS and watched it unable to recover from transient errors, I'm going to stick with ZFS on the reliability side. That said, ZFS is also a PITA when it comes to creating pools from arbitrarily sized disks, and I have a whole bunch of not-same-size disks that I need to think about. Now that the WS is up and running, I can postpone the decision back to summer when I have some time to tinker.
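
The capacity numbers above can be reproduced with a small back-of-the-envelope sketch. It assumes equal-size disks and ignores filesystem and metadata overhead, so real pools will come out a little smaller. The function names are only illustrative.

```python
def duplicated_capacity_tb(disk_count, disk_tb, copies=2):
    """Usable space when every file is stored `copies` times across the pool."""
    return disk_count * disk_tb / copies


def raidz_capacity_tb(disk_count, disk_tb, parity_disks):
    """Usable space of a single RAIDZ vdev: raw space minus the parity disks' worth."""
    return (disk_count - parity_disks) * disk_tb


# 10 x 2TB disks, as in the post above
print(duplicated_capacity_tb(10, 2))   # 2x duplication
print(raidz_capacity_tb(10, 2, 2))     # RAIDZ2
print(raidz_capacity_tb(10, 2, 3))     # RAIDZ3
```

The trade-off is that duplication survives on arbitrary mixed-size disks, while the RAIDZ figures only hold when all ten disks are the same size.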

 

I do have the APCd installed - it used to tell the other server the power was gone as well. I think it still needs to be installed to start the server back up again after an orderly shutdown, because as far as the BIOS is concerned, an orderly shutdown isn't a power loss, so the machine doesn't come back up. I've fixed the power plan for now, but c'est la vie.

 

ETA: we should probably move the last few posts to a new thread in off-topic. Sorry.


  • 0

I do go into the StableBit Scanner's surface scan here:

http://blog.covecube.com/2014/10/why-using-stablebit-scanner-is-a-good-idea/

It's a .... bit of a read, though.

 

And here is the manual section (from WHSv1's StableBit Scanner version, but the code hasn't changed significantly):

http://stablebit.com/Support/Scanner/1.X/Manual?Section=Surface%20Scanner

Also, a bit of a read.

 

 

But basically, the surface scan attempts to read from each and every sector on the disk, once a month (configurable, default setting). This means that it makes sure that the data is readable. It doesn't check its value... just whether it's accessible. However, this can (and in some cases does) trigger the drive's built-in error correction routines, meaning that the drive may remap the sector, or repair it.

This entire process is usually called "data scrubbing". 
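
To make the "readable, not correct" distinction concrete, here is a rough Python sketch of the idea. It is not how StableBit Scanner is actually implemented. It assumes a regular file standing in for the disk, and the function name and chunk size are arbitrary. It reads every chunk once, ignores the contents, and only records where a read fails.

```python
import os


def scan_readable(path, chunk_size=4096):
    """Read `path` front to back; return (chunks_read, bad_offsets).

    The data itself is never inspected: a chunk counts as good
    as long as the read call succeeds.
    """
    bad_offsets = []
    chunks_read = 0
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        offset = 0
        while offset < size:
            try:
                handle.seek(offset)
                handle.read(chunk_size)
            except OSError:
                # Unreadable region: note it and keep going with the next chunk.
                bad_offsets.append(offset)
            else:
                chunks_read += 1
            offset += chunk_size
    return chunks_read, bad_offsets
```

A checksumming filesystem goes one step further and compares what was read against a stored checksum, which is the part this sketch deliberately leaves out.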

 

Additionally, if Scanner detects damage, and DrivePool is installed on the same system, it will cause DrivePool to automatically evacuate the contents of the disk. This is done to preserve the integrity of your pool, as this indicates a serious issue, and in a lot of cases ..... can and will spread to more of the disk (it's a physical defect or damage, and usually isn't just affecting one sector).

 

 

As for ReFS... it's very new, and with time it will get better. That said, the write speeds are still pretty atrocious, from some of the stats I've seen... But it does have built-in error checking.... which only really works well with a Storage Spaces parity or mirrored array (it will read from another disk to repair the damaged file!)

 

 

As for disk savings... what you're saving in disk space, you're offsetting in CPU cycles. Parity and similar schemes work by mathematically computing redundancy data from the files. Depending on your system... those extra CPU cycles can grind your system to a halt. And depending on how often this may happen, the offset in energy may have paid for that extra disk space.  It's something to at least consider, though it probably won't be that drastic.

 

As for APCd... IIRC, I've seen others with issues with it in the past. So it could very much be the issue.


  • 0

Thanks - will read. No problem on long reads - I came on the internet to get away from the inane video on the TV and I'm grievously offended that it's followed me here.

 

Not worried terribly by processor usage. I currently have a Core 2 Duo E8600 in the EX490, 4GB of RAM and a 24TB pool, of which about 14TB are in use. zpool scrub takes about six hours or so, so I just have it run once a week on a day I'm not likely to be using the server (Friday nights are great for this) - and of course, it also transparently calculates parity on access and fixes the file if there's an issue in between. So far, so good. Plex's indexing engine is way more demanding than the zfs scrubbing, truth be told. I'll read up tomorrow though. Thanks :)


  • 0

:)

 

And yeah, man, Plex sure is a resource hog. That's part of why I switched to Emby. It's .... got its moments (I've never seen a queue length of 54 on an SSD before.... but that was a "one time" deal because of the switch from the GD library to ImageMagick).

 

But as long as you're comfortable with ZFS and how it works, then whatever product fits your needs!


  • 0

Useful reads both, thanks. I do find the units for many of the SMART attributes to be lacking. For example, reallocated sector count on the current spinning disk is reporting Value: 100; Worst: 100; Threshold: 5; Raw Value: 0, which suggests that the disk is at its worst state now, well above the threshold, but in reality means the disk is fine.
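
For what it's worth, here is a minimal sketch of how those normalized columns are usually read. It assumes the common convention that higher normalized values are healthier and that an attribute only counts as failed once its value drops to or below the threshold. Vendors differ in the details, and the class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # normalized current value (higher is healthier)
    worst: int      # lowest normalized value the drive has ever recorded
    threshold: int  # vendor-defined failure threshold
    raw: int        # vendor-specific raw counter


def assess(attr):
    """Classify an attribute under the value <= threshold convention."""
    if attr.value <= attr.threshold:
        return "failing"
    if attr.worst <= attr.threshold:
        return "failed in the past"
    return "ok"


# The reallocated sector count from the post above
realloc = SmartAttribute("Reallocated Sector Count", value=100, worst=100,
                         threshold=5, raw=0)
print(assess(realloc))
```

Read this way, "Worst: 100" just means the value has never dipped below 100, and a raw value of 0 means no sectors have been reallocated.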

 

Let me give it some more thought. As I said, the money isn't the issue; I just want to make sure I'm staying on Windows. One thing I've discovered overnight is that both Parallels and VMWare support snapshotting the Windows VM, which is infinitely useful when an update goes bad. That sounds like a really useful feature and not something that can be done easily if you're running Windows bare-metal. Eh well.


  • 0

Well, in StableBit Scanner, we don't really display the RAW SMART data. We do our best to interpret all of it and present it in a meaningful way.

This way, you can see how many reallocated sectors you have, not just the raw, nearly unreadable data.

 

 

As for the snapshots, be careful. These definitely do work, but they can also adversely affect performance, especially in the long run.  That is because they work by creating differencing virtual hard drives, meaning that they create a secondary drive, linked to the original, but with only the changes.  Enough of these, and ... well, I'm sure you can see why it could affect performance.

 

 

Another option is Windows Backup. It uses VHDs to create snapshots of the system as well, but on a dedicated drive...

If you're using Windows 8... then you're pretty much screwed, as there is no UI. However, you can still use "WBADMIN" (command line util) to create backups.

 

For instance, my server has the normal backup, and I've added a once a week backup additionally, that backs up the system to a completely different drive.

This is the command that I run:

 

wbadmin start backup -backupTarget:T:\ -include:C:,H:,W:,Y: -systemState -allcritical -quiet

Some of the flags aren't needed (like systemState), but I'd rather be safe.

The C:\ drive is the system, H:\ is my Hyper-V storage (deduplicated), W: is WDS (PXE/network boot) and "Y:\" is my work "drive" for projects.

 

This backs up the system to the "T:\" drive. But I could add a full path and the like there, I believe. 

And you could set up your Windows VM to do this, as well. Just use the Task Scheduler to do this on a schedule. 
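
As a rough illustration of the Task Scheduler step, the Python sketch below builds a `schtasks /Create` command line for a weekly run of the wbadmin command above. The task name, day and start time are placeholders, and the helper name is made up. It only builds the argument list: actually registering the task means passing that list to `subprocess.run` from an elevated prompt on the Windows machine.

```python
# The backup command from the post above, passed through unchanged.
BACKUP_COMMAND = (
    r"wbadmin start backup -backupTarget:T:\ -include:C:,H:,W:,Y: "
    "-systemState -allcritical -quiet"
)


def build_weekly_backup_task(task_name, backup_command, day="SUN", start_time="03:00"):
    """Return the schtasks argument list for a weekly task run as SYSTEM."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,       # task name (placeholder)
        "/TR", backup_command,  # command the task runs
        "/SC", "WEEKLY",        # schedule type
        "/D", day,              # day of the week
        "/ST", start_time,      # start time, 24-hour HH:mm
        "/RU", "SYSTEM",        # run under the SYSTEM account
    ]


command = build_weekly_backup_task("WeeklySystemBackup", BACKUP_COMMAND)
print(" ".join(command))
```

Building the list separately makes it easy to eyeball the command before anything is registered on the machine.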
