duplication pool setup with nvme taking forever to build bucket lists


Shooter3k

Question

Hello,

I am new to DrivePool. My original plan was to have about 10 HDDs and 2 NVMe drives: I would write data to the NVMe drives, and when balancing ran it would move all the files onto the HDDs. However, it turns out that each balancing run takes 2-3 days. New data keeps arriving hourly/daily while a balancing run is still in progress, so this approach isn't working.

Are there any options or things I'm missing here? Is this by design?

8 answers to this question

  • 0
12 hours ago, Shane said:

Hi, how do you have your pool(s) arranged and what settings are you using for the balancing?

I have six 18TB drives, two 16TB drives, and two 8TB NVMe drives.

Basically, I'm looking for duplication and quick read/write. My thought was to write to the pool and use the 8TB drives as a cache/staging point, and over time they would write out to the HDDs in the pool. But I'm writing more than 8TB of data before it has a chance to get written out to the HDDs, and balancing takes days/weeks of 100% CPU usage (which isn't ideal either).

I've tried all sorts of different options, but if there's an ideal configuration, I'm all ears.

 

 


  • 0

The bucket lists are the files (and folders) that the balancer determines need to be moved between poolparts. Does your data set involve a very large number of files? Do you have balancing set to trigger immediately on any amount of data, so that the cache drives effectively act as a real-time-ish buffer, or is it scheduled daily so files aren't moved off until a certain time? Is your duplication real-time or nightly, and at what multiplier(s)?

If it helps I believe you can see what files are being added to the bucket lists via opening the Service Log ( DrivePool -> Settings -> Troubleshooting -> Service log... ) and setting the File Mover trace level from Informational to Verbose ( Tracing levels -> F -> File Mover -> Verbose).

I'd be going with the SSD Optimizer balancer to have the two 8TB drives as cache, which I'm guessing is what you're doing. But even when full they shouldn't take weeks to empty unless there's some kind of bottleneck, and 100% CPU for that length of time also seems a red flag. What CPU do you have? If you open Windows Task Manager, can you check in Performance whether there are high kernel times (right-click the graph), and in Details whether you can see any culprit process(es)?
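
As a rough sanity check on the "shouldn't take weeks" point, here is a minimal back-of-the-envelope sketch. The 150 MB/s sustained HDD write speed is purely an assumed figure for illustration, not a measurement from this system, and the calculation ignores per-file overhead entirely.

```python
def drain_hours(bytes_to_move: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to move bytes_to_move at a sustained throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return bytes_to_move / throughput_bytes_per_s / 3600


TB = 10**12
MB = 10**6

# Assumption: both 8TB cache drives are completely full, so 16TB must move.
# Assumption: 150 MB/s sustained write speed to the HDDs (hypothetical).
cache_bytes = 2 * 8 * TB
hours = drain_hours(cache_bytes, 150 * MB)
print(f"{hours:.1f} hours to empty the cache")
```

Under those assumptions the raw copy comes out at a bit over a day, so a run lasting days or weeks points at something other than disk throughput.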


  • 0
There are about 3 million folders and 15 million files.
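
For anyone wanting to get the same figures for their own pool, a minimal sketch using only the Python standard library is below. The function name `count_entries` and the drive letter in the comment are hypothetical; on a tree with millions of entries the walk itself may take a long time.

```python
import os


def count_entries(root: str) -> tuple[int, int]:
    """Return (folder_count, file_count) for everything below root."""
    folders = 0
    files = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)
        files += len(filenames)
    return folders, files


# Example call (the "P:/" drive letter is a placeholder, not from this thread):
# folders, files = count_entries("P:/")
# print(f"{folders} folders, {files} files")
```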

I've tried changing the balancing options around to see what is quicker, but at the end of the day (from real-time to batched up), when the data needs to be balanced from the NVMe drives to the HDDs, it takes days. I've tried changing the duplication from real-time to nightly, and with two NVMe drives the most effective seems to be real-time. The multiplier is 2x.

My CPU is a Ryzen 9 5950X.

I agree; something seems like it's waving red flags. Is it because of the number of folders and files?

I will check into the kernel times, but the thing using the most CPU is "DrivePool.Service.exe".


  • 0

My DP machine is a Ryzen 5 with 16GB RAM and about 750k files (a twentieth of yours). It doesn't take very long to balance, but the vast majority of those files are balanced already; if your pool has a lot of steady churn from SSD to archive, that would also multiply the problem. Even so, I wouldn't expect a Ryzen 9 to get stuck at 100% for days (let alone weeks).

Yeah, I'd stick with real-time x2 duplication. If you want files to offload from SSD to Archive pretty much as soon as they land to keep the SSD from filling, set Automatic balancing to "Balance immediately" and Triggers to "1GB" or similar. You might also try "Not more often than..." if you wish to try an hourly emptying or similar. To minimise bucket list computation, disable all non-essential Balancers.

I don't know if it'd help, but maybe try having a Pool A = Pool B (SSDx2) + Pool C (HDDx10) arrangement where:

  • Pool A = no duplication, balance immediately / not more often than, with 1GB trigger; use SSD Optimizer (SSD = Pool B, Archive = Pool C), no other balancers.
  • Pool B = x2 real-time duplication, automatic balancing off, all balancers disabled (except maybe StableBit Scanner).
  • Pool C = x2 real-time duplication, automatic balancing nightly (StableBit Scanner plus anything else you need).

That way, in theory, Pool A is where all your files "are" (from the user perspective) and only has one main job, "use B as a buffer for C", while Pool C keeps the HDDs balanced as its own job, scheduled separately from the cache emptying. Note that Pool B doesn't need Duplication Space Optimizer because the duplication multiplier matches the number of drives.
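
To make the layout above concrete, here is a small sketch that models Pool B and Pool C and estimates how much unique data each can hold. The `Pool` class is purely illustrative and is not a DrivePool API. The drive sizes are the ones listed earlier in the thread (eight HDDs rather than ten), and the x2 estimate is a simple upper bound assuming each file needs two copies on two different drives.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    drive_sizes_tb: list[float]
    duplication: int = 1

    def raw_tb(self) -> float:
        return sum(self.drive_sizes_tb)

    def usable_tb(self) -> float:
        """Rough upper bound on the unique data the pool can hold."""
        if self.duplication == 1:
            return self.raw_tb()
        if self.duplication == 2:
            # Two copies per file, each on a different drive.
            largest = max(self.drive_sizes_tb)
            return min(self.raw_tb() / 2, self.raw_tb() - largest)
        raise NotImplementedError("only x1 and x2 are sketched here")


pool_b = Pool("B (SSD cache)", [8, 8], duplication=2)
pool_c = Pool("C (HDD archive)", [18] * 6 + [16] * 2, duplication=2)

for pool in (pool_b, pool_c):
    print(f"Pool {pool.name}: {pool.raw_tb():g} TB raw, "
          f"~{pool.usable_tb():g} TB usable")
```

With x2 duplication the two 8TB drives give roughly 8 TB of cache for unique data, and the eight HDDs roughly 70 TB, which is why writing more than 8TB before the cache empties causes trouble.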

If nothing seems to fix the CPU/bucket issue, I'd suggest opening a support ticket with StableBit.


  • 0

When a pool balances, does it have to compute every file within the entire pool?

Everything I've tried so far still causes 100% CPU (for days) whenever it tries to balance the pool (I assume because there are so many files and it checks every file?).
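
To illustrate why the file count could matter if every file really were processed on each run, here is a rough sketch. The per-file costs are hypothetical values chosen only to show the scaling; nothing in this thread confirms how much work DrivePool actually does per file.

```python
def scan_hours(file_count: int, seconds_per_file: float) -> float:
    """Hours spent if every file costs a fixed amount of processing time."""
    return file_count * seconds_per_file / 3600


FILES = 15_000_000  # file count reported earlier in this thread

# Hypothetical per-file costs in milliseconds; the real figure is unknown.
for ms in (1, 10, 50):
    print(f"{ms:>3} ms/file -> {scan_hours(FILES, ms / 1000):.1f} hours")
```

At an assumed 10 ms per file, a full pass over 15 million files would already take well over a day, so multi-day runs would be plausible if the whole pool were being rescanned each time.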

 


  • 0

It should only compute as much as it needs to so that it can meet the requirements of each balancer (sorry, I realise that's vague).

But if you've still got 100% CPU going on, I'd recommend opening a ticket so StableBit can help directly.
