
To All - What performance increase do you get from Read Striping?


B00ze

Question

Good day everyone.

As I mentioned elsewhere, I get about 12% more performance copying large files from a duplicated pool than I get copying those same files from the same disks outside of the pool: 200MB/s from the disks directly versus 225MB/s from the pool. That increase is really low. What performance do you all get from Read Striping?

In preparation for this post, I copied more and larger (duplicated) files from the pool to the SSD, but this time I switched to the "Disk Performance" tab of DrivePool and discovered that DP is barely ever using both disks, even though the "Read Striping" gauge on the "Pool Performance" tab says 100% read striping. If "Disk Performance" is accurate, DP reads mostly from a single drive. So I tested again, this time using Resource Monitor to watch the disks, and sure enough, DP is pretty much never reading from both disks at the same time. It just switches back and forth from one to the other (see screenshot), and it's mostly using only one of the disks (90% of the I/O is on a single disk). Is this normal? Both disks are the same model, and both are connected to the same controller.

Thank you.


4 answers to this question


http://stablebit.com/Support/DrivePool/2.X/Manual?Section=Performance Options

Quote

StableBit DrivePool utilizes a number of read striping algorithms, depending on the situation.

For large sequential I/O, such as large file copying, read striping will utilize a block based algorithm, maximizing the use of each disk and minimizing disk context switches.

For random non-sequential I/O read striping always sends the request to the disk with the least outstanding requests. Because seek times can be high in this scenario, StableBit DrivePool tries to switch disk contexts often.

For slow non-concurrent I/O read striping passively measures the speed of each disk and dynamically switches to the fastest disk.

You can see this in action under the Performance UI.

So in many circumstances it may not be reading from both disks at the same time.
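
To make the "least outstanding requests" rule from the manual excerpt concrete, here is a minimal sketch in Python. It is not DrivePool's code: the `Disk` class, `pick_disk` and `dispatch` are hypothetical names invented for illustration, and the model assumes that no request completes while the batch is being issued.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    """Hypothetical stand-in for one pool member holding a copy of the file."""
    name: str
    outstanding: int = 0  # requests issued but not yet completed


def pick_disk(disks):
    """Return the disk with the fewest outstanding requests (ties go to the first listed)."""
    return min(disks, key=lambda d: d.outstanding)


def dispatch(disks, n_requests):
    """Issue n_requests reads one after another, assuming none completes in the meantime."""
    order = []
    for _ in range(n_requests):
        disk = pick_disk(disks)
        disk.outstanding += 1
        order.append(disk.name)
    return order


# disk2 already has three requests queued, so new reads go to disk1 until it catches up.
disks = [Disk("disk1"), Disk("disk2", outstanding=3)]
print(dispatch(disks, 5))
```

Under this rule, a disk that already has work queued is skipped until the other disk has caught up, which is one way a pool can end up reading mostly from a single drive.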

 

 


Good day.

But the tests I run are large sequential copies. They are the best candidates for read striping, and DrivePool is not really doing it.

Quote

Maximizing the use of each disk and minimizing disk context switches.

It's clearly maximizing the use of ONE disk, and clearly it is trying to limit going from disk to disk, but "disk context switches" taking any time at all is not something I am familiar with. If you want to read-stripe, what you do is allocate a big buffer and ask disk #1 to fill it. You do not wait for this to finish; you immediately allocate another buffer and ask disk #2 to fill it, starting further along the file. If both disks are equally fast then you should be maximizing disk context switches, not trying to minimize them. Each time a block finishes you can calculate how long it took, figure out which disk is faster, and adjust accordingly (if disk #1 is faster, you ask it for the next block and ask disk #2 for the THIRD block out; meanwhile disk #1 returns and you ask it for the block in between). But my disks are all the same; if DrivePool did striping optimally I should see 50% of the I/O on each disk. Instead what I see is 90% of the I/O on one disk. Read Striping is not as important as everything else DrivePool does, but there is clearly room for improvement there.
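
The scheme described above can be sketched in user-space Python as follows. This is only an illustration under stated assumptions: ordinary files stand in for the two duplicated copies, `read_block`, `striped_read` and `BLOCK_SIZE` are hypothetical names, blocks are assigned round-robin, and the adaptive "give the faster disk more blocks" step is left out. It says nothing about how a kernel driver such as DrivePool's would do this.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1 << 20  # 1 MiB per request; an arbitrary choice for this sketch


def read_block(path, offset, size):
    """Read one block from one replica; the calling thread blocks until the read returns."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def striped_read(replicas, total_size, block_size=BLOCK_SIZE):
    """Read a file by assigning consecutive blocks round-robin to its replicas.

    The pool keeps as many requests in flight as there are replicas, and the
    blocks are joined back together in file order.
    """
    offsets = range(0, total_size, block_size)
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [
            pool.submit(read_block, replicas[i % len(replicas)], off, block_size)
            for i, off in enumerate(offsets)
        ]
        return b"".join(f.result() for f in futures)


# Usage: two identical temporary files play the role of the duplicated copies.
data = os.urandom(5 * BLOCK_SIZE + 123)
with tempfile.TemporaryDirectory() as tmp:
    replicas = [os.path.join(tmp, name) for name in ("copy_a.bin", "copy_b.bin")]
    for path in replicas:
        with open(path, "wb") as f:
            f.write(data)
    assert striped_read(replicas, len(data)) == data
```

On a real pair of disks the two replicas would live on different physical drives; here both sit in one temporary directory, so the example checks correctness of the reassembly, not speed.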

Best Regards,

On 2/26/2018 at 9:48 PM, B00ze said:

It's clearly maximizing the use of ONE disk, and clearly it is trying to limit going from disk to disk, but "disk context switches" taking any time at all is not something I am familiar with.

If it's using the one disk primarily, it may be that the other disks are being seen as "busy", and it is avoiding using them, to avoid slowing down the system.

On 2/26/2018 at 9:48 PM, B00ze said:

If you want to read-stripe, what you do is allocate a big buffer and ask disk #1 to fill it. You do not wait for this to finish; you immediately allocate another buffer and ask disk #2 to fill it, starting further along the file.

Yes and no.  The driver operates in kernel space, so "minimal" is best, usually.  The more it's doing, storing or buffering, the more it can adversely affect system performance, and cause issues.

Also ....

On 2/26/2018 at 9:48 PM, B00ze said:

If both disks are equally fast then you should be maximizing disk context switches, not trying to minimize them. Each time a block finishes you can calculate how long it took, figure out which disk is faster, and adjust accordingly (if disk #1 is faster, you ask it for the next block and ask disk #2 for the THIRD block out; meanwhile disk #1 returns and you ask it for the block in between).

Fast isn't the issue here, "busy" is.  It's the IO load on the disk, not the speed of the drive that the software is looking at. So, yes, this can create some discrepancies. 

And Windows is pretty sensitive to IO.  So we don't want to overload a disk, or even heap too much work onto a single disk, as it can (will) affect performance.

On 2/26/2018 at 9:48 PM, B00ze said:

But my disks are all the same; if DrivePool did striping optimally I should see 50% of the I/O on each disk. Instead what I see is 90% of the I/O on one disk.

Only if NOTHING else is going on with the disks.  But if there is activity on the other disks, it may actively avoid using those disks.  As above.

Also, StableBit DrivePool does have priority classes for the different kinds of disks.  And that may influence things.  (.866 for details)

On 2/26/2018 at 9:48 PM, B00ze said:

Read Striping is not as important as everything else DrivePool does, but there is clearly room for improvement there.

There is always room for improvement and optimization.  Especially since things have changed drastically since the last release. 

But I'll flag Alex (the Developer) about this, and see if he can/will post more information about how the feature works, and some of its caveats/shortcomings.


Hi Christopher.

I'll continue the discussion, but read-striping is not the most crucial thing DrivePool does. And as it has to handle a variable number of disks, variable speeds (e.g. USB) and variable I/O requests (e.g. random access vs sequential) it's understandable that it is complicated and not 100% optimal (which would yield around 2x the speed of a single drive). However, it's far from optimal right now; it concentrates all I/O on a single disk, I'm not quite sure why...

6 hours ago, Christopher (Drashna) said:

If it's using the one disk primarily, it may be that the other disks are being seen as "busy", and it is avoiding using them, to avoid slowing down the system.

When I do my quick little tests, the only thing the disks are doing is the file copy I run for the test. If it sees disk #2 as busy then it surely sees disk #1 as busy as well. For optimal performance you should keep both disks as busy as you can. If there's a disk that's sitting idle 90% of the time then you're losing performance. Disk I/O is slow, and it doesn't need the CPU (i.e. it uses DMA), so there is plenty of time to do things even when multiple disks read at maximum speed. USB or network storage is different (USB on my very old Pentium 4 system does take noticeable CPU time). But then, if DrivePool were afraid of taking too many resources it would be conservative with disk #1 as well, and it's not; it's very happy keeping that disk as busy as it can. And look at the graph (screenshot) I attached: the disks never really run at the same time. It uses each disk by itself for long stretches, and there are big "holes" in the curves...
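
A rough back-of-the-envelope model ties the numbers in this thread together. Assume (and this is only an assumption) that one disk transfers at full speed for the whole copy, that the second disk transfers at the same speed for some fraction of that time, and that nothing else, such as the destination SSD, limits throughput. The function names below are hypothetical; the 200MB/s and 225MB/s figures are the ones from the opening post.

```python
def overlap_fraction(single_mb_s, pooled_mb_s):
    """Fraction of the copy time during which a second disk is also transferring.

    Model: pooled = single * (1 + overlap), solved for overlap.
    """
    return pooled_mb_s / single_mb_s - 1.0


def secondary_share(overlap):
    """Share of all bytes delivered by the second disk under the same model."""
    return overlap / (1.0 + overlap)


single, pooled = 200.0, 225.0  # MB/s, figures from the opening post
f = overlap_fraction(single, pooled)
print(f"implied overlap:   {f:.1%}")
print(f"second disk share: {secondary_share(f):.1%}")
print(f"ideal two-disk:    {2 * single:.0f} MB/s")
```

Under these assumptions the second disk would be contributing only about an eighth of the time and delivering roughly a ninth of the data, which is in the same range as the 90/10 split seen in Resource Monitor, and well short of the 400MB/s that full overlap would give.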

6 hours ago, Christopher (Drashna) said:

The driver operates in kernel space, so "minimal" is best, usually.  The more it's doing, storing or buffering, the more it can adversely affect system performance, and cause issues.

Sure, don't allocate a 1GB buffer for each disk, but you gotta keep them busy. Besides, for I/O devices like disks, what happens is you allocate the buffer, request a DMA transfer, and go to sleep. Your own thread is not actually taking any system resources; it is mostly just sleeping, waiting for I/O to complete.

6 hours ago, Christopher (Drashna) said:

Fast isn't the issue here, "busy" is.  It's the IO load on the disk, not the speed of the drive that the software is looking at. So, yes, this can create some discrepancies. And Windows is pretty sensitive to IO.  So we don't want to overload a disk, or even heap too much work onto a single disk, as it can (will) affect performance.

You should be able to run multiple copies from multiple disks to multiple disks, and then run Prime95 and see only marginal impact, since it's all DMA. For USB, you would need many, many disks connected to many different USB hubs to have a large impact on system performance, and even then, I'm pretty sure the impact would be minimal (it all runs on the one PCI Express bus, so there is only so much load you can achieve). 10Gb networks will slow down a system (a lot, because of constant interrupts) but you would need a LOT of disks to get there.

6 hours ago, Christopher (Drashna) said:

Only if NOTHING else is going on with the disks.  But if there is activity on the other disks, it may actively avoid using those disks. As above. Also, StableBit DrivePool does have priority classes for the different kinds of disks.  And that may influence things.  (.866 for details)

Nice that it has priority classes (it did cross my mind how you handle real disks vs USB drives vs network drives). But both disks are the same class. And they are both sitting idle apart from running the copy operation.

If Alex can explain that would be nice, but maybe there is a little bug somewhere too? I don't know, it's really strange that DrivePool just doesn't use the additional disks; it concentrates all I/O on one disk. If I copy multiple big files, the disk that gets 90% of the I/O changes between files, but the other disk never sees more than 10% usage.

Best Regards,
