
Server Stuck on Measuring After a Completed Re-Balance


BYUSinger84

Question

  • 0

I'm having an issue with my server, and I hope someone here knows how to resolve it. Background information:

 

WHS 2011 - Core 2 Quad 9400, 6 GB of RAM, ten hard drives, most of which are 2 TB and one of which is 3 TB. Duplication is enabled on all folders.

 

A few nights ago I thought I'd give StableBit's Drive Scanner a try. I'm glad I did, because the next day it alerted me that one of my drives had bad sectors. At that time I had about 500 GB free on the pool, and the bad drive was a 2 TB drive. Because the data was all duplicated, I decided to just take it out and replace it. I already had a 3 TB drive I had been meaning to put in anyway, so this seemed like as good a time as any. I removed the damaged 2 TB drive from the pool, shut down the server, added the new drive, and booted back up. I was able to add the 3 TB drive without any issues at all.

Duplication began, and once that was complete, the measuring process started (rather quickly, I might add). It went through the first drive with no problems, then the second (the 3 TB), but about 1/5 of the way through the third drive everything started running really slowly and I could barely click on anything. Measuring got to about 15.8% when it started freezing. The dashboard froze, as did Task Manager. The last reported CPU utilization was about 25% and RAM usage was around 30%. Now I can't even log back into the server, after having logged off thinking that might help. My shared drives still work, but it takes several minutes to open a folder. I don't even want to know what will happen if I try to open a file. Because this is running in Hyper-V (which I've been doing for years without issue), I can still see that CPU usage bounces between 22% and 24%, which indicates to me that something is going on.

My question is: do I just leave it doing its thing for the next 24 hours? Does anyone have any idea what the deal is? I'm running the latest DrivePool 1.3 that's out at the time of this writing. Thanks in advance for any thoughts.



  • 0

If you are able to get Task Manager open again, find the "DrivePool.Service.exe" process and select the "Create dump file" option. Once you've done that, upload it to us.

Specifically, this:

http://wiki.covecube.com/StableBit_DrivePool_Dashboard_Freeze

The page says dashboard, but follow the same steps for DrivePool.Service.exe. This dumps the memory used by the service so that we can take a look at it and see what is going on.

 

Also, there is this as well:

http://wiki.covecube.com/StableBit_DrivePool_System_Freeze

 

 

Once you've done that, give us a bit, and Alex (the developer) will take a look at them, try to see what is going on, and fix it if necessary.

 

 

Also, are you on the 1.3.2.7556 build (dashboard will report it as 1.3.7556)?

If so, do this:

http://wiki.covecube.com/StableBit_DrivePool_Q8964978

 

And then install this version:

http://dl.covecube.com/DrivePool/release/download/StableBit.DrivePool_1.3.1.7541_Release.wssx


  • 0

I can't do anything once it freezes. I just had the task manager up when it froze so I could see what processes were using CPU cycles the most. 

 

I'm currently using the latest beta 2.x build, as the current 1.3.x builds won't install (see my previous posts). Do you want me to uninstall 2.x using this same method and then try to install the one you suggested?

 

I created the dump I sent you by following the wiki instructions. I uploaded it with my username so you'd know who it was from.

 

Also, you two are more than welcome to remote into my server if that helps troubleshoot at all. Let me know. 


  • 0

I did all of those already. The common denominator was DrivePool. Once I rolled back to the version drashna specified, everything started working again.

 

I still don't have it integrated into my dashboard (the full uninstall doesn't help), but it sounds like you are moving away from that anyway, so I guess I'll just learn to like the new interface.


  • 0

As a mainly 1.3.x user, based on past experience and posts I expect Alex to continue to port bugfixes - and features - from 2.x to 1.x, as parts of the codebase are shared between them. He is however only one person, and I imagine debugging kernel/filesystem code ain't easy.

 

Also, build 320 of the 2.x tree has just been released, which fixes a problem introduced in 312. If you install 320, then whether it works or not, the result would presumably help further narrow down the cause.


  • 0

An update. I was going to try the other versions, but then, of course, everything broke again. I couldn't believe it. After working solidly for a few days, it went back to what it was doing before. In an attempt to track down the problem further, I found only one other common denominator: StableBit's Drive Scanner. I thought, "This can't possibly be it." But thinking back over the days and moments before everything stopped working (after working fabulously for years, I might add), aside from updating DrivePool, I had also bought and installed Drive Scanner, not only on the VM but on the host machine as well (the VM can't see the SMART status of the drives). I figured it was worth a shot, so I uninstalled it on both the VM and the host. And what do you know, everything is working normally again. Obviously, I'd like to use the software I paid for, but if this was really the problem the whole time, I wonder whether running WHS 2011 on Hyper-V is part of the issue as well.

 

Thoughts? Comments?


  • 0

Can you find out whether it's on the guest or host side, or the combination of the two, that having Scanner installed is causing the lock up? And whether it's from Scanner's SMART monitoring or from its scheduled scans?


  • 0

OK, never mind. Honestly, I have no idea what the heck is going on. I thought things had gotten back to normal once I uninstalled Scanner from both machines, but now the problem has resurfaced. I've also tried the latest version of DrivePool now. Same problem.

 

What the heck is going on?!


  • 0

The only thing I can say for absolute certain is that it always freezes during measuring or calculating. Balancing ALWAYS works perfectly. Within minutes of getting to calculating/measuring, the GUI and the network shares hard lock, and leaving it overnight does nothing. Yet the CPU is obviously working hard on something, as it bounces between 30% and 75%. Any other thoughts, Alex? Would you like remote access at this point? I don't know what else to try. Thanks.


  • 0

Hi

Have you checked all the drives for errors? The last time I had one go bad, I couldn't even get into the dashboard, and Windows also froze whenever I tried to access the drive. I had to pull one drive at a time until I got the system to boot. Once I found the bad drive, I put it in another machine and ran chkdsk and Scanner; the drive had 34 bad sectors.

 

 

Even if not, pull all the drives, then add one at a time and see how far it gets. You never know.


  • 0

I've pulled the drives out of the Hyper-V server, and they are now attached natively to my host. We'll see what happens. In the meantime, I do have backups on the pool that I CANNOT lose. If I rebuild the server with WHS 2011 as the primary OS and then attach my drive pool, is there a way to reattach my backups?
