Edward

Removing drive from pool

Question

I'm in the process of removing a drive from my pool due to repeated SMART warnings (head parking on a Seagate 3 TB drive).

 

The process has apparently been running for some time now (ca. 3 hours). However, except for the bar at the bottom saying 'removing drive', there is no evidence that anything is happening. In particular, there is no disk activity being indicated.

 

Is there anything I can do to verify that the process is actually working as it should in the background? And is there any way I can ascertain how long it is likely to take? A ballpark figure would be fine.

 

I'm running DrivePool 2.1.1.561 under WHS 2011.

 

thx

Edward

 

 


22 answers to this question

Recommended Posts


Well, I'm sorry to hear about the problem drive.

 

As for the drive removal, that depends on how much pooled data you have on the drive, and the type of files.

 

A lot of small files is going to be a LOT slower than a few large files (even if both use up the same amount of disk space), so this really makes it hard to predict the amount of time needed.

 

As for disk activity, you could use StableBit Scanner to view this. Open the UI, right click on the column header and select "performance".

The performance info in the DrivePool UI is only to show activity that goes through the driver (to and from the pool directly), and it doesn't show balancing or duplication activity as this doesn't "touch" the driver.

 

 

If the process is taking too long, then cancel it and use the "Drive Usage Limiter" balancer instead. Uncheck both the Duplicated and Unduplicated options for the disk in question. This will evacuate the contents of the drive in the background, while keeping the pool accessible.

 

Additionally, you could tweak the StableBit Scanner balancer settings (in the same place as the Drive Usage Limiter balancer). Check both of the "SMART Errors" options, and this will automatically evacuate the contents of a drive when SMART errors occur on it.

 

 

 

 

Additionally, if you do re-attempt to remove the drive, you may want to use the "Duplicate files later" option. This moves only the unduplicated data off the drive and skips the duplicated data. Once the removal is complete, it runs a duplication pass and reduplicates the files in the pool, as needed.


Thanks Chris, lots to chew on.

 

Let me look at this in the morning please (you may recall I'm in London, UK).

 

As of now, some 13 hours after I kicked off the process, it is still reporting 'removing drive'. I used the default remove options (shift-remove). Quick stats are:

 

Unduplicated: 73 GB

Duplicated: 1,023 GB

Other: 151 GB

Free: 1.51 TB

 

I will let the process continue overnight.  SMART reported the drive will fail in the next 24 hours.  I hope to get the data evacuated before then.

 

cheers

Edward


 You are very welcome.  

 

And I'm very sorry to hear about the drive's status. 

 

Hopefully, the drive's removal has completed. If not, cancelling again and using the additional options (force damaged drive removal, and duplicate files later) may be a good idea, as these may help with the removal.

 

 

 

Worst case, you can always pull the drive from the system, and remove the "missing" disk from the pool.  


Christopher

 

Thanks for the follow up.

 

The 'remove' process never completed. I let it run for about 24 hours and then aborted it. (BTW, there was no obvious way, for me, of aborting the process, so I had to go via Task Manager.)

 

When going back in I noticed that there were updates available, so I applied those. (Ordinarily I never open the Scanner or the DrivePool software, so I did not realise there were updates. It would be great if a user could be notified of updates via, say, the Scanner notification email process. Just a thought.)

 

Anyway, the updates seem to have placed me into a new 'measuring' process (or I may have triggered that manually?). That process is still ongoing (several hours). I have also set the options you mentioned, so I assume the evacuation of data will start once the measuring process is complete. But maybe it has already started? It is not clear from the UI.

 

Scanner also started a new scan on all drives, and except for the suspicious drive, the other drives have been scanned and reported healthy.

 

I will let things run until all scans, measurements and evacuation have occurred or the drive fails (whichever comes first).

 

I plan on replacing the suspect drive with a WD Red 6 TB, which seems to have good reports. But I'd welcome your opinion on that.

 

The poolpart folder on the suspect drive is about 1 TB and comprises > 200k files, so I imagine it will take some time to evacuate.

 

I discovered a large folder on the suspect drive which was not part of the pool. In fact, it was a copy of the previous OS install. That had over 80k files, and deleting it via Windows Explorer (even bypassing the Recycle Bin) was going to take over a day. So I paused and remembered good ole DOS. Five minutes later, del /s /q and rmdir /s /q did the trick. :P
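For the record, the cross-platform analogue of that del /s /q plus rmdir /s /q trick is a single shutil.rmtree() call in Python, which likewise bypasses the Recycle Bin entirely. A minimal sketch (the folder names below are throwaway examples, not anything from this thread):

```python
import shutil
import tempfile
from pathlib import Path

def fast_delete(folder):
    """Remove an entire directory tree without Recycle Bin overhead --
    the cross-platform analogue of `del /s /q` followed by `rmdir /s /q`."""
    shutil.rmtree(folder)

# Demo on a throwaway tree (stands in for the old OS-install copy):
tmp = Path(tempfile.mkdtemp())
(tmp / "old_os" / "sub").mkdir(parents=True)
(tmp / "old_os" / "sub" / "file.txt").write_text("stale")
fast_delete(tmp / "old_os")
print((tmp / "old_os").exists())  # False
```

Like the DOS commands, this is much faster than Explorer because it skips per-file shell bookkeeping.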

 

I'd highly recommend that an option for removing a drive using your settings (balancer, unchecking dup/undup, etc.) be implemented within the DrivePool UI - or at the very least a link to some help file. This would avoid a user having to make the whole pool read-only. I would also recommend that some status update and estimate of completion be provided within the UI. Sitting and watching a screen, doubting whether real progress is being made, is somewhat frustrating. :unsure:

 

cheers

Edward


For aborting the removal process, there should be a "Cancel" option next to the disk. Though this may take a bit, as it needs to finish what it's doing.

 

 

Once the measuring is complete, it depends on the settings. If the drive is still marked as damaged, then yes, it should start emptying the drive.

In fact, it should show blue arrows on the top of the bar graph for the disk in question. The blue arrows show the target for data placement, and if it's at the very beginning of the drive, then it's trying to evacuate everything. 

 

 

 

As for replacement drives, the WD Reds are a very good choice (as are the Seagate NAS drives, and any other "NAS" type drive). However, I'd make sure you let StableBit Scanner do a full scan of the drive before putting it into the pool. The WD Red drives can suffer from "infant mortality" (i.e., they may die right away), but if they last, they should be good for a few years at least.

Also, all the NAS drives are rated for 24/7 operation, and have at least some sort of vibration compensation. So they should last a good long time.

 

 

 

I'd highly recommmend that within the Drivepool UI an option for removing a drive using your settings (balancer, uncheck of dup/undups etc) is implemented. At the very least a link to some helpfile.  This will avoid a user having to make the whole drivepool read only.  Also I would recommend that some status update and estimate of completion (within the UI) is provided. Sitting and watching a screen and doubting if real progress is being made is somewhat frustrating. :unsure:

 

I've already discussed this with Alex (the developer) and we'd love to add this as an advanced option.

 

And there are a few things that need or should be added to the manual. I definitely plan on adding that.

 

As for progress, there is always the process monitor. You can watch and see exactly what is going on. :)

But watching progress bars is like watching paint dry. Even if stuff is changing, it's not going to be fast, and you may not notice it.


Well, this process of dealing with a damaged drive has not been optimal, in my opinion. The main thing is the lack of information in the UI confirming what, if anything, is happening in the background. I fully recognise that things will take ages, but the UI is quite uninformative. A message along the lines of "Please wait whilst the system duplicates files in the background. This is estimated to take approximately another x hours/days/weeks" would go a long way.

 

I have now pulled the damaged drive from the pool and will now wait and see if things rebalance themselves. A couple of things:

1. The UI reports a missing drive. It was not obvious that I had to 'remove' the drive from the pool. I was expecting that to be automatic, but I guess it needs to be manual.

2. I received an email saying there was a missing drive (as expected). However, after I removed the drive from the pool, I received an email saying "All the missing disks have been re-connected and are no longer missing.", which is untrue.

 

Anyway, things now seem to be 'duplicating' again - I assume to reduplicate the files/folders that are now MIA (missing in action).

 

One thing that came to mind whilst I was going through all these shenanigans is the lack of any ability to find out which files on specific drives are duplicated or not. It would be helpful if one could drill down in some way to ascertain this. For example, on one of my drives I'm seeing 72.3 GB unduplicated and 2.45 TB duplicated. It would be useful to know what the 72.3 GB comprises. NB: the only folder on this drive is the poolpart folder.

 

But my key recommendation stands: that of providing a means for an end user to fast-track the evacuation of a drive. One additional way of perhaps doing this is to simply zip up all the unduplicated contents of a failing drive, copy the zip file over to a non-failing drive, and then duplicate from there. This would avoid, I assume, the vast overhead of dealing with thousands of small files on a failing drive. Of course, it is better not to have any unduplicated files/folders at all.

 

cheers

Edward

 


Thank you for the feedback. I've passed it along to Alex (the developer), and we'll see what we can do to improve it.
Better feedback while removing the disk would be very useful. Even if it's something as simple as showing which files we're working on. 
 
Getting an accurate estimate would require indexing all the files on that disk and applying some sort of algorithm to estimate file-creation speed and overall disk speed. Though, as a general estimate: about 4 hours per TB.
https://stablebit.com/Admin/IssueAnalysis/18757
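As a back-of-the-envelope illustration of that figure (4 hours per TB is a hedged heuristic from this thread, not a measured rate; lots of small files will push it higher):

```python
HOURS_PER_TB = 4.0  # ballpark rate quoted above; small files skew this upward

def estimate_removal_hours(data_gb, hours_per_tb=HOURS_PER_TB):
    """Ballpark hours to evacuate `data_gb` of pooled data at the quoted rate."""
    return (data_gb / 1000.0) * hours_per_tb

# Edward's drive held roughly 73 GB unduplicated + 1,023 GB duplicated:
print(round(estimate_removal_hours(73 + 1023), 1))  # 4.4
```

Which is why a removal still running after 13+ hours suggests the per-TB rate is far worse on this particular drive.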
 
As for the missing drive, removing it automatically isn't always a good idea. What happens if the disk "fell out" of the pool, such as due to a flaky controller, failing hardware, or somebody tripping over an external's power cord? Removing it in that case could be problematic.
 
As for the email, that definitely needs to be fixed. Though, that's more of a wording issue here, than anything else.
https://stablebit.com/Admin/IssueAnalysis/18758
 
As for the duplicating: any time you have a missing disk like this, it will recheck the pool. And when you remove a disk, it will recheck the duplication, just in case: it will check the status of the duplication and then reduplicate files as needed.
 
 

One thing that came to mind whilst I was going through all these shenanigans is the lack of any ability to find out which files on specific drives are duplicated or not. It would be helpful if one could drill down in some way to ascertain this. For example, on one of my drives I'm seeing 72.3 GB unduplicated and 2.45 TB duplicated. It would be useful to know what the 72.3 GB comprises. NB: the only folder on this drive is the poolpart folder.

Yes, we do need some sort of report/auditing tool to determine what files are properly duplicated or not, for end users. This has been brought up in the past, and we do plan on implementing it.

https://stablebit.com/Admin/IssueAnalysis/18756

I've "bumped" the priority on it, as we really do need this.
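Until such a tool ships, a rough DIY audit is possible under one assumption (implied by this thread, but not confirmed in it): duplicates are stored as identically named files under each drive's hidden PoolPart.* folder. This sketch flags relative paths that appear under only one PoolPart folder; the function name and approach are illustrative, not an official tool:

```python
import os
from collections import Counter
from pathlib import Path

def find_unduplicated(poolpart_roots):
    """Return relative paths that exist under only one of the given
    PoolPart folders, i.e. files with no duplicate copy on another drive."""
    counts = Counter()
    for root in map(Path, poolpart_roots):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                counts[Path(dirpath, name).relative_to(root)] += 1
    return sorted(str(p) for p, n in counts.items() if n == 1)
```

Note this only compares file names/paths, not contents, so it is a first pass rather than a verification of duplicate integrity.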

We also do plan on overhauling the feedback system, specifically to address some of these concerns that you've mentioned.
 

But my key recommendation stands: that of providing a means for an end user to fast-track the evacuation of a drive. One additional way of perhaps doing this is to simply zip up all the unduplicated contents of a failing drive, copy the zip file over to a non-failing drive, and then duplicate from there. This would avoid, I assume, the vast overhead of dealing with thousands of small files on a failing drive. Of course, it is better not to have any unduplicated files/folders at all.

The problem with that is that zipping up the files, compressing them (or not), and creating the archive would take just as long as, if not longer than, just moving the files normally. Adding files to an archive means that you need to read each source file and then write its contents into the destination archive.

And because you'd have to create the data in the archive for each file AND add a record for it, the added overhead would probably actually slow the process down.


Thanks for the feedback Christopher.  As always great feedback and good transparency.  A model that other businesses could well emulate and benefit from.

 

My state of play now is that my pool is smaller (due to the pulled drive) and everything looks clean with all folders etc duplicated (save for some annoying long name files which I can't even kill or rename in DOS).

 

New WD Red 6tb drive is currently being scanned and once it comes up clean overnight I will add to the pool. Thanks for your tip about possible infant mortality.

 

I'm interested to see how the duplication balancing works out given the significant difference in drive sizes (the pool will comprise 2x 2 TB, 1x 3 TB and 1x 6 TB). The pool will need to survive the 6 TB drive going south.

 

One good thing about all of this is that I discovered an old temporary backup of something, about 900 GB, which I no longer need; once I killed that, my pool free space increased by 1.8 TB. The unneeded files were lurking deep in a folder somewhere. <oops>

 

 

 

Edward


You are very welcome. 

And we definitely agree.  

 

 

As for the long file names, yeah, they're very annoying to handle properly. The only working solution I've found for them is to rename parent folders until the path is short enough, and hope that the actual file name isn't too long.
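A quick way to hunt such paths down before they bite (260 is the classic Win32 MAX_PATH limit; the helper below is a sketch, not a DrivePool feature):

```python
import os

MAX_PATH = 260  # classic Windows path-length limit

def too_long(root, limit=MAX_PATH):
    """List files whose full path exceeds `limit` characters --
    the usual suspects behind un-renamable, un-deletable files."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if len(full) > limit:
                hits.append(full)
    return hits
```

Running this against a poolpart folder before a removal would surface the files likely to stall the process.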

 

 

 

As for the WD Reds, you're very welcome. I have 8x 4 TB WD Reds, and I've not experienced this myself, but I've seen enough reports about it to be concerned. However, I always let Scanner perform a full scan of any drive before adding it to the pool.

There is a request to automate this in the Scanner balancer (e.g., to not place files on a disk that hasn't been scanned yet), but between CloudDrive issues and adding Windows 10 support (which are priorities), he hasn't been able to get to it yet.

 

 

As for how well the duplication and balancing are handled: well, Alex (the developer) has put a lot of time and effort into this. The code around it is fairly complicated, and does a very good job.

Worst case, you may see a chunk of space that is marked as "Unusable for Duplication". If that happens, it means that we can't properly balance data around to ensure that this chunk of space can be used for duplication. In other words, we warn you when it will be a problem. However, given time, the balancing engine may reduce the size of this space as it reorganizes files.

 

 

 

One good thing about all of this is that I discovered an old temporary backup of something, about 900 GB, which I no longer need; once I killed that, my pool free space increased by 1.8 TB. The unneeded files were lurking deep in a folder somewhere. <oops>

It happens! 

Sometimes it's a good idea to go through and inventory what you have on the pool. Programs like WinDirStat, SpaceSniffer, JDiskReport, etc. are a good way to get a detailed look at that, and they should work with the pool without any issues.
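A bare-bones version of what those tools report - total bytes per top-level folder, largest first - fits in a few lines (a sketch only, minus the treemaps and per-file views):

```python
import os

def dir_sizes(root):
    """Total bytes under each immediate subdirectory of `root`,
    sorted largest first -- a bare-bones WinDirStat-style summary."""
    totals = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            size = 0
            for dirpath, _dirs, files in os.walk(entry.path):
                for f in files:
                    try:
                        size += os.path.getsize(os.path.join(dirpath, f))
                    except OSError:
                        pass  # file vanished or unreadable; skip it
            totals[entry.name] = size
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))
```

Pointed at the pool drive letter, this is enough to spot a forgotten 900 GB backup folder.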


Re long names.  Luckily the culprits were in one folder so I simply moved the folder to the root, renamed the offending files and moved the folder back again.

 

Yeah WinDirStat is one of the tools in my toolbox. Simple, clean, fast.

 

Re Windows 10: I have a new bare-metal Win10 install already done, just waiting for you guys to give the green light, and then I will move my data drives over (thus replacing the WHS 2011 instance I have been using for ages).

 

Edward


I'm glad to hear that it was a simple fix, at least.

 

 

And yeah, WinDirStat is really nice. ;)

 

 

As for Windows 10, StableBit Scanner should work on it just fine. We do have a Windows 10 version of StableBit CloudDrive built, we're just waiting on StableBit DrivePool.


Hi again

 

Just to wrap this thread up.

 

Installed, scanned and added to the pool a new 6TB WD Red.

 

Initially, soon after adding the new drive, I saw about 3.5 TB being reported as 'unusable for duplication', but once all the balancing had occurred this went away, and it is now being reported as 'free space', so all good.

 

The pool is showing a small amount, 15.2 GB, as unduplicated. I imagine this is simply some overhead, but it would be nice (not essential) to be able to drill down and confirm. Maybe I set something somewhere to have certain files/folders not duplicated, but I don't know where that setting would be.

 

One big plus in having a drive nearly fail on me is that I did a lot of housekeeping which cleared things out nicely. In addition to the unneeded 1 TB folder lurking around, I also deleted a whole load of movie videos which we realistically would never watch again.

 

So now I have just over 7 TB free space on a 12 TB pool, which is a good feeling. With a bit of effort I'm sure I can fill that quickly!  :)

 

Edward


Well, glad to hear that everything has been wrapped up!

 

As for the "unusable for duplication", that may be normal depending on the exact situation. But I'm glad to hear that the system rebalanced and "settled in" nicely. 

 

As for the small amount of unduplicated data... you can use the "Folder Duplication" section to identify where this is occurring, but it's not exactly intuitive or easy. An audit/reporting tool has been discussed, and we'd like to have it out eventually, at least.

 

 

 

And housecleaning is always a good idea. Everyone should do it at least once a year (do it when you do your normal spring cleaning :) ). And yeah, filling 7 TB is rather easy if you're really trying. :)


I've been communicating with Christopher about these same issues - and it's disappointing to see that the exact same discussion was going on 4 years ago and pretty much nothing has changed.

I love DrivePool and I think it does a very good job at its primary intent - but when something is off - when a drive needs removal or significant rebalancing is needed - the same issues discussed here are still problematic.

I've been trying to evacuate the unduplicated files from a damaged drive for several days now - there are less than 500 GB of unduplicated files - but the process is still moving glacially. According to Scanner, there are 3 damaged files - I was able to recover one, and I don't care about the other two. Using the Remove function locked the pool, and almost nothing happened after nearly 24 hours. Using the balancing plugins and setting the Drive Usage Limiter to zero to evacuate the drive, and/or using the Scanner balancer to remove files from the damaged drive, has also barely removed any files. Worse yet, it seems that each time you abort any process, DrivePool re-measures the pool, which itself has been taking longer than a day.

At first, I was having performance problems due to a bad Windows update, but since resolving that, all of the above is still true.

I wholeheartedly agree that it would be incredibly helpful to know which files are in the unduplicated group.

I'm going to try moving some of the files out of the pooled drive and onto an individual drive using Windows Explorer - hopefully that will be more efficient.

I know you guys have been focusing on other products - and I'm happy to see you developing the business and having success - but DrivePool needs some real attention.

1 hour ago, Ultradianguy said:

I've been communicating with Christopher about these same issues - and it's disappointing to see that the exact same discussion was going on 4 years ago and pretty much nothing has changed.

I love DrivePool and I think it does a very good job at its primary intent - but when something is off - when a drive needs removal or significant rebalancing is needed - the same issues discussed here are still problematic.

I've been trying to evacuate the unduplicated files from a damaged drive for several days now - there are less than 500 GB of unduplicated files - but the process is still moving glacially. According to Scanner, there are 3 damaged files - I was able to recover one, and I don't care about the other two. Using the Remove function locked the pool, and almost nothing happened after nearly 24 hours. Using the balancing plugins and setting the Drive Usage Limiter to zero to evacuate the drive, and/or using the Scanner balancer to remove files from the damaged drive, has also barely removed any files. Worse yet, it seems that each time you abort any process, DrivePool re-measures the pool, which itself has been taking longer than a day.

At first, I was having performance problems due to a bad Windows update, but since resolving that, all of the above is still true.

I wholeheartedly agree that it would be incredibly helpful to know which files are in the unduplicated group.

I'm going to try moving some of the files out of the pooled drive and onto an individual drive using Windows Explorer - hopefully that will be more efficient.

I know you guys have been focusing on other products - and I'm happy to see you developing the business and having success - but DrivePool needs some real attention.

I got notified about your new post on this old thread. Very disappointing to hear that no progress has been made since I posted 4 years ago. Let's hope Christopher is able to sort out your immediate issues, and that, longer term, some progress is made on the issues Christopher and I discussed 4 years ago.

I've not had any issues with drives since 4 years ago, but I guess it is only a matter of time...

On 11/16/2019 at 8:49 AM, Ultradianguy said:

I've been communicating with Christopher about these same issues - and it's disappointing to see that the exact same discussion was going on 4 years ago and pretty much nothing has changed.

I love DrivePool and I think it does a very good job at its primary intent - but when something is off - when a drive needs removal or significant rebalancing is needed - the same issues discussed here are still problematic.

 

Unfortunately, there won't be a lot that can really be done. If there are issues accessing the data on the drive, then there are issues accessing the data on the drive; it doesn't matter where that access comes from.

That said, there is the "force damaged disk removal" option, which does skip over problem files. Additionally, the "dpcmd" utility has an "ignore-poolpart" command that will immediately drop a disk from the pool, without moving the data off of the disk.

The "ignore-poolpart" command was actually added in Nov 2015, in response to this sort of issue, if not this issue explicitly.

As for the command itself, the "xxxxx" part of the "PoolPart.xxxxx" folder on the disk is the pool ID that you'd need to remove that disk from the pool. (There are actually a couple of ways to get this info, but this is probably the simplest and most obvious way to identify the disk <-> pool ID relationship.)
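To illustrate extracting that ID from the folder name (the folder names below are made up; run dpcmd itself to confirm the exact syntax it expects for "ignore-poolpart", as this thread doesn't spell it out):

```python
from pathlib import Path

def pool_ids(drive_root):
    """Return the ID portion of every hidden PoolPart.<id> folder
    found at the root of a pooled drive."""
    return sorted(p.name.split(".", 1)[1]
                  for p in Path(drive_root).iterdir()
                  if p.is_dir() and p.name.startswith("PoolPart."))
```

The PoolPart folders are hidden, but directory listings still include them, so this works without changing any attributes.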

 


Hi Christopher - I don't think this has anything to do with the damage on the disk. It's only 20 sectors and 3 files, of which I was able to recover 1. I can open folders on the drive, open files, copy files, etc.

And DrivePool is generally taking forever to measure and balance the pool - it goes way beyond this particular drive.


Also, as I've explained in our private messages on my ticket, I tried the force damaged disk removal - it's not that it's getting stuck on damaged files, it's that it hasn't even begun removing files from this drive, because it's been measuring the pool for days. It did at one point start to balance, but it only moved a few files on other drives and barely touched this one.


Well, the removal failing is related to it. And unfortunately, because the disk is having issues, the remeasure is worse (since it's I/O intensive).

And I don't think the remeasure is normal here, but I'd have to double-check. Given that there are disk issues: if the drive disconnects and reconnects during this process, it will trigger a remeasure, which may be what is happening here.

As for balancing, yeah, that's intentional. 

 

And either way, the "dpcmd ignore-poolpart" command should be pretty much instantaneous, and should prevent the issue from occurring (though it may trigger a remeasure).


The disconnects were actually on two other drives - not the damaged one. I'm not sure I'm understanding the dpcmd ignore approach - I don't know which files on this drive are unduplicated and which are duplicates, so it's not clear to me how I should manually remove the files. That's why I set the Drive Usage Limiter to remove only unduplicated files - but that hasn't happened at all.

Feel free to respond on my ticket if you prefer.  


I fully recognize that the current issue is not mine (though I'm the OP); however, I would highly appreciate it if:

1. Someone could explain how to find out which files on which drives are unduplicated.

2. This thread were updated with the recommended processes/commands to follow when a problem occurs, or a link to such processes/commands.

cheers

Edward

