Covecube Inc.
  • 0
vfsrecycle_kid

Can/Does DrivePool preemptively calculate the total free space post-replication-rebalance?

Question

Hello,

Hopefully this question makes sense. I have a big DrivePool and one of my 8TB drives just died. While waiting for my Ironwolf 16TB to replace it, re-balancing across the other drives has begun.

However, it will be a multi-day rebuild, and drive space will definitely be tight - leading to my questions:

1. Is DrivePool smart enough to know if there's enough space to rebuild with my replication rules (2X globally, 3X for special folders) or not?
2. Does DrivePool know how much free space will be available once rebuilding is finished?
3. Is that presented to the user somehow?

I see metrics like "Free Space" and "Unduplicated" and while Free Space > Unduplicated, I assume my duplication factor will determine whether there's enough space or not.

Cheers

11 answers to this question

  • 0

Not entirely sure, but I'm pretty sure DP will not be able to tell whether there is enough space, nor how much will be available afterwards. But yes, if the Pool had over 8TB free prior to the failure of one 8TB drive, then there should be enough space available (rare exceptions aside).

  • 0

I had the same problem last month, or is that 2 months now

DP is smart enough to re-balance, but it will take a while; I think mine took 44 hours, which is just under 2 days though.

Then when you put in the 16TB (I wish Santa had brought me one of them last week) it will re-balance again; and again it will take time, but it will be done.

I hope this helps you

 

  • 0
On 12/31/2019 at 3:58 AM, Umfriend said:

Not entirely sure, but I'm pretty sure DP will not be able to tell whether there is enough space, nor how much will be available afterwards. But yes, if the Pool had over 8TB free prior to the failure of one 8TB drive, then there should be enough space available (rare exceptions aside).

That's why it's so complex: it depends on what the replication rules for the lost data were set to.

For example, the balancing finished and I was left with 1TB free (and with my replication rules in place that realistically means 0.5TB useable).

I guess the ultimate question is if there's a way to calculate estimated actual disk space free.

  • 0

No. If, and only if, the entire Pool had a fixed duplication factor then it *could* be done. E.g., 1TB of free space means you can save 0.5TB of net data with x2 duplication or 0.33TB with x3 duplication, etc. However, as soon as you mix duplication factors, well, it really depends on where the data lands, doesn't it? So I guess they chose to only show actual free space without taking duplication into account. Makes sense to me. Personally, I over-provision all my Pools (a whopping two in total ;D) such that I can always evacuate the largest HDD. Peace of mind and continuity rule in my book.
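The fixed-factor arithmetic above can be sketched in a few lines of Python. This is only an illustration of the division described in this post; `usable_space_tb` is a made-up helper name, not anything DrivePool exposes.

```python
def usable_space_tb(free_tb: float, duplication_factor: int) -> float:
    """Net data that fits into free_tb of raw space when every file
    is stored duplication_factor times (one uniform factor pool-wide)."""
    if duplication_factor < 1:
        raise ValueError("duplication factor must be at least 1")
    return free_tb / duplication_factor


print(usable_space_tb(1.0, 2))            # 0.5
print(round(usable_space_tb(1.0, 3), 2))  # 0.33
```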

  • 0
5 hours ago, Umfriend said:

No. If, and only if, the entire Pool had a fixed duplication factor then it *could* be done. E.g., 1TB of free space means you can save 0.5TB of net data with x2 duplication or 0.33TB with x3 duplication, etc. However, as soon as you mix duplication factors, well, it really depends on where the data lands, doesn't it? So I guess they chose to only show actual free space without taking duplication into account. Makes sense to me. Personally, I over-provision all my Pools (a whopping two in total ;D) such that I can always evacuate the largest HDD. Peace of mind and continuity rule in my book.

Well that's the part I feel like could be pre-calculated. Balancing with the rules you have in place should be deterministic, no?

At least in my mind, balancing follows a plan, which is why a percent-completed figure can be displayed.

  • 0

But imagine you are going to write a new file to the pool. How would DP know whether it will have to write two or three duplicates? That is not deterministic. Put otherwise, imagine you have only two folders, one with x2 and the other with x3 duplication, and 3TB of (gross / real) space available. What should DP show? 1.0 or 1.5TB free?
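To make the ambiguity concrete, here is a small Python sketch of the best a tool could honestly report for future writes: a range, not a single number. The function name is hypothetical and the only inputs are the raw free space and the duplication factors in use.

```python
def usable_space_bounds_tb(free_tb: float, factors: list[int]) -> tuple[float, float]:
    """Return (worst_case, best_case) net free space in TB.

    Worst case: every new file lands in the folder with the highest factor.
    Best case: every new file lands in the folder with the lowest factor.
    """
    if not factors or min(factors) < 1:
        raise ValueError("need at least one duplication factor >= 1")
    return free_tb / max(factors), free_tb / min(factors)


print(usable_space_bounds_tb(3.0, [2, 3]))  # (1.0, 1.5)
```

Where the real figure falls inside that range depends entirely on which folders receive the new data.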

  • 0
22 hours ago, Umfriend said:

But imagine you are going to write a new file to the pool. How would DP know whether it will have to write two or three duplicates? That is not deterministic. Put otherwise, imagine you have only two folders, one with x2 and the other with x3 duplication, and 3TB of (gross / real) space available. What should DP show? 1.0 or 1.5TB free?

Correct, in that case you don't know until the files are added to the pool, and that is a different beast altogether, since you're talking in the realm of useable vs. true disk space. However, this is about re-satisfying the replication rules for existing content. DP knows what was lost, how it needs to be replaced, and ultimately knows it needs X disk space to satisfy it.

To further your point, I concede now that a "free space" counter does not make sense, however a "disk space needed for resatisfying replication" seems deterministic to me - in the case of a dead drive being replaced.

A super simple example:

1. DrivePool - 4x4TB drives in global 2x replication. 8TB useable disk space, reported as 16TB to OS (no problems here)
2. DrivePool is 100% full meaning 16/16TB is being used (8TB of content replicated 2x)
3. DrivePool loses a drive due to hardware failure. No data loss, but replication rules are now unsatisfied
4. At this point, 3x4TB leaves 12TB of pooled space with approx 4TB* of unreplicated but not lost data

With the above example, it seems fairly obvious and deterministic to me that DrivePool should be able to recommend: "please insert another drive of at least 4TB*".

Replace the above example with a myriad of replication rules, folders, etc, and the math should still check out - and maybe DP could recommend that to end user?

This leads back to my initial topic: if it is easy to show how much space is needed to rebalance properly, you can also say how much space will be left post-balance. As in, if a 4TB* drive is needed but you insert an 8TB one, you already know you'll have 4TB free in the end.
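The 4x4TB example above can be worked through in a short Python sketch. It rests on one labeled assumption: every copy of a file sits on a different drive, so a single dead drive removes exactly one copy of each file it held and the space needed to restore duplication equals the data that was on that drive. The `Drive` class and both function names are invented for illustration and are not part of DrivePool.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    size_tb: float
    used_tb: float


def space_to_reduplicate_tb(lost: Drive) -> float:
    # Assumption: copies of a file never share a drive, so one lost
    # drive means one missing copy of everything that was stored on it.
    return lost.used_tb


def free_after_rebuild_tb(survivors: list[Drive], lost: Drive,
                          replacement_tb: float = 0.0) -> float:
    """Raw free space once duplication is restored (negative = shortfall)."""
    free_now = sum(d.size_tb - d.used_tb for d in survivors)
    return free_now + replacement_tb - space_to_reduplicate_tb(lost)


# Four full 4TB drives, global x2 duplication, one drive dies.
pool = [Drive(4.0, 4.0) for _ in range(4)]
lost, survivors = pool[0], pool[1:]

print(space_to_reduplicate_tb(lost))                               # 4.0
print(free_after_rebuild_tb(survivors, lost))                      # -4.0
print(free_after_rebuild_tb(survivors, lost, replacement_tb=8.0))  # 4.0
```

A negative result is the minimum size of the replacement drive; a positive one is the raw space left after the rebuild.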

  • 0

Actually, once a drive is lost, it would take quite some time to determine how much would be needed to re-establish the required duplication. DP would not have been aware or remember what was stored on the lost drive. The first step is called re-measuring (try it). And yes, as a result of that, DP might be able to give a recommendation on how much, if any, additional storage is needed. But practically, if you know the size of the lost drive, then all you need to know is: how much data was on it and how much do you have available now.

Your math does not work out BTW. After the failed drive, you have 4TB of duplicated data (using 8TB) and 4TB of unduplicated data and you would need a 4TB drive.

  • 0
7 hours ago, Umfriend said:

Actually, once a drive is lost, it would take quite some time to determine how much would be needed to re-establish the required duplication. DP would not have been aware or remember what was stored on the lost drive. The first step is called re-measuring (try it). And yes, as a result of that, DP might be able to give a recommendation on how much, if any, additional storage is needed. But practically, if you know the size of the lost drive, then all you need to know is: how much data was on it and how much do you have available now.

Your math does not work out BTW. After the failed drive, you have 4TB of duplicated data (using 8TB) and 4TB of unduplicated data and you would need a 4TB drive.

Maybe not immediately, hence why I said "can it calculate" - I'm aware it wouldn't instantaneously know and re-measuring is necessary.

- Losing a disk - can the existing drives hold all the data and retain replication property (pool was not full)
- Losing a disk - how big of a drive do you need to replace it to retain replication property (pool was full)

My point is that DP should be able to figure out both BEFORE duplication simply fails from lack of disk space. Right now all you get is a percentage of duplication progress, and for multi-day rebuilds you don't want it to fail at, say, 32% and then take days to notice (yes, some people actually use this on their servers and don't check it daily).

Also regarding the math, I've edited my post. This proves I can't even trust myself and I'd much rather trust what DP tells me. Thanks for the correction.

Email notifications (which I use and adore) augmented with such information would be fantastic.
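As a rough sketch of the two checks listed above, the Python below turns them into a single pre-flight message of the kind a notification could carry. Everything here is hypothetical: the function name, the wording, and the assumption that the space needed equals the data that was on the lost drive. Enough total free space is also only a necessary condition, since each restored copy still has to fit on a drive that does not already hold that file.

```python
def preflight_message(lost_used_tb: float, survivors_free_tb: float) -> str:
    """Compare the data that was on the lost drive with the free space
    remaining on the surviving drives, before any copying starts."""
    spare_tb = survivors_free_tb - lost_used_tb
    if spare_tb >= 0:
        return (f"OK: existing drives can absorb the rebuild, "
                f"about {spare_tb:.1f} TB free afterwards")
    return (f"WARNING: add at least {-spare_tb:.1f} TB "
            f"or re-duplication cannot complete")


# Made-up numbers: pool was not full vs. pool was full.
print(preflight_message(lost_used_tb=6.0, survivors_free_tb=7.0))
print(preflight_message(lost_used_tb=4.0, survivors_free_tb=0.0))
```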

  • 0
7 minutes ago, vfsrecycle_kid said:

Maybe not immediately, hence why I said "can it calculate" - I'm aware it wouldn't instantaneously know and re-measuring is necessary.

- Losing a disk - can the existing drives hold all the data and retain replication property (pool was not full)
- Losing a disk - how big of a drive do you need to replace it to retain replication property (pool was full)

My point is that DP should be able to figure out both BEFORE duplication simply fails from lack of disk space. Right now all you get is a percentage of duplication progress, and for multi-day rebuilds you don't want it to fail at, say, 32% and then take days to notice (yes, some people actually use this on their servers and don't check it daily).

Also regarding the math, I've edited my post. This proves I can't even trust myself and I'd much rather trust what DP tells me. Thanks for the correction.

Actually, I think you are right. Once DP has worked out which files should go where (I think that happens when the DP status bar says "building bucketlist"), it should at that point be easy to see if there is enough space. I would still want DP to go ahead, because it has to happen anyway, but perhaps send out a mail or some such indicating that more space is needed. Then again, it is coding for something that is, I think, rarely an issue. Once you need to replace a drive, insert one of at least equal size. That is, I think, what most would do by default anyway.

I run DP on a WSE2016 server BTW.

  • 0

Just noticed this

[Attached screenshot: image.png]

I've now added a new drive after the previous lost-disk rebalance completed. The old drives show a negative number (space being freed up) and the new drive shows a large number for how much it will fill up.

This is probably adequate for most purposes.
