  • 0

Slow Copy Speeds, 10G LAN, 2012R2E and Disk Write Cache


Spider99

Question

Posting this mainly for information: it is not a DrivePool problem, but it looks like one when you first run into it.

 

I was getting slow copy performance to the pool over my 10G network. A copy would run fine at close to 500 MB/s (SATA SSD to SATA SSD), about as fast as the disks involved can go. This would continue for a minute or so, then the copy would slow down to 20-30 MB/s or in some cases virtually zero. It was during a copy of a 200+ GB backup file, so not the small-file problem I initially thought it might be. With further testing I found it would happen with virtually any reasonably sized file, so file size is not the cause, although the larger the file being copied, the more clearly the problem shows.

 

One thought was that something was overheating and being throttled. I checked Scanner and no disk was above 35°C. The sending PC also had low temps. I also checked the 10G network cards, and they were showing no errors and normal temps according to the driver info.

 

The server with DP installed was very sluggish to respond when the slow copy occurred, but memory load was at most 50% on a 32 GB system - hmmmm, odd.

 

When monitoring a copy more closely on the server, I could see the "Modified" memory in Task Manager climb as a large copy started. It would peak at about 12 GB, at which point the copy would slow to a crawl. Over a few minutes this would drop back to a lower value and the copy would speed up again, only to drop to a crawl when the "Modified" memory peaked again.
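
To put rough numbers on that fill-and-stall pattern, here is a minimal Python sketch. It is only back-of-the-envelope arithmetic, not a model of how Windows actually schedules its cache flushes. The function name is made up for illustration, the 500 MB/s and 12 GB figures are the ones observed in this post, and the 25 MB/s drain rate is an assumption taken from the crawl speed seen once the cache was full.

```python
def seconds_until_cache_full(cache_gb, ingest_mb_s, drain_mb_s):
    """Time for dirty ("Modified") pages to reach cache_gb when data
    arrives faster than the disk absorbs it.

    Returns None if the disk keeps up and the cache never fills.
    """
    net_mb_s = ingest_mb_s - drain_mb_s  # net growth of unwritten data
    if net_mb_s <= 0:
        return None
    return cache_gb * 1024 / net_mb_s


# Observed: ~500 MB/s incoming, ~12 GB peak of "Modified" memory.
# Assumed: the uncached disk only absorbs ~25 MB/s.
fill = seconds_until_cache_full(12, 500, 25)
print(f"cache full after roughly {fill:.0f} s")
```

With these inputs the cache fills in well under a minute. In practice only part of the copy lands on the slow disk, so the real figure will be longer, but the shape is the same: fast until the cache is full, then limited to whatever the slow disk can take.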

 

So I checked the write cache settings on all drives and found that all but one had write caching enabled. Odd, why would one not have it checked? Well, it turns out that write caching is disabled by default on the system drive of a domain controller. After enabling it, the copy speed went back to normal.

 

However, this does not survive a reboot, as Windows 2012 R2 Essentials must check this value and turn it back off. - Grrr

 

So to prevent slow copies on your network (it occurs on both 1G and 10G LAN), don't include your OS drive, or any partition of that drive (my mistake), in your pool, especially if it's an SSD used with the SSD Optimizer. When files get written to that drive, performance slows to a crawl, then speeds up again when the copy moves on to other SSDs. You end up with very variable copy performance and a very sluggish PC until the memory is cleared out.

 

I am experimenting with an option to modify the memory cache size in 2012 R2 to see if it makes any difference to performance. On a 10G network I can see that, even with write cache enabled, performance still drops to approx 50% of normal copy speed once the memory cache is filled. This might be related to DP doing real-time duplication, but I'm not sure yet. Will report back when I understand more :)

 

Hope this helps somebody with similar issues :)


6 answers to this question


  • 0

Well, to clarify, write caching is only disabled on the drive with the SYSVOL and NETLOGON folders. However, on Essentials, these default to the system drive.

 

This is done specifically to prevent write errors due to sudden power loss, as corrupting these files can/will destroy the domain.

 

 

There is a way to "hack" this to be enabled every time you boot, but I wouldn't recommend it anyway.

 

 

And this is one of the reasons I recommend not using the system drive as part of the pool. That, and IO load, and "what happens if the drive fails".

 

But yeah, it sucks. 

 

 

 

Also, another thing that may help is running "netsh int tcp set global autotuninglevel=highlyrestricted" from an elevated command prompt on both systems. 
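
For context on why the receive window matters at these speeds, here is a rough Python sketch of the bandwidth-delay arithmetic. The autotuning level controls how far Windows lets the TCP receive window grow. The 0.5 ms round-trip time is an assumed LAN figure, the function names are illustrative, and 65535 bytes is the largest window TCP allows without window scaling. This does not show what `highlyrestricted` itself will pick, only how the window size caps a single stream.

```python
def bdp_bytes(link_gbit_s, rtt_ms):
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return link_gbit_s * 1e9 / 8 * rtt_ms / 1e3


def max_throughput_mb_s(window_bytes, rtt_ms):
    """Upper bound (MB/s) on one TCP stream for a given receive window."""
    return window_bytes / (rtt_ms / 1e3) / 1e6


RTT_MS = 0.5  # assumption: a typical-ish LAN round trip, not a measured value

needed = bdp_bytes(10, RTT_MS)                # window needed to fill a 10G link
capped = max_throughput_mb_s(65535, RTT_MS)   # best case with an unscaled window

print(f"window needed for 10G at {RTT_MS} ms: {needed:,.0f} bytes")
print(f"ceiling with a 65535-byte window: {capped:.0f} MB/s")
```

If the window cannot grow to roughly the bandwidth-delay product, a single copy stream tops out below line rate no matter how fast the disks are, which is why the autotuning setting is worth a look on both ends.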


  • 0

Well, it is the same physical drive as the system but another partition. I only added it as a replacement for an SSD that went walkies after a reboot. As that one came back, I have since removed the D: drive from the pool.

 

Strange thing is the server is on a UPS, but I suppose MS don't think that's enough - hmmmm

 

What does that command line do exactly?


  • 0

Well, the write caching is set per drive, not per volume/partition.  So, if it's using the system disk at all.... 

 

As for the problem SSD, as mentioned in the other thread, keep an eye on it.

 

As for the UPS thing, you're missing a key bit of info here...  Consumer drives LIE about flushing data to the drive (syncing).

Write caching can compound this issue, whereas disabling write caching increases the likelihood that the data does get flushed/synced in a timely manner.

 

SQLite has a nice article about this:

https://www.sqlite.org/howtocorrupt.html

It's section 3 "Failure to sync". 

 

Unfortunately, most consumer-grade mass storage devices lie about syncing. Disk drives will report that content is safely on persistent media as soon as it reaches the track buffer and before actually being written to oxide. This makes the disk drives seem to operate faster (which is vitally important to the manufacturer so that they can show good benchmark numbers in trade magazines). And in fairness, the lie normally causes no harm, as long as there is no power loss or hard reset prior to the track buffer actually being written to oxide. But if a power loss or hard reset does occur, and if that results in content that was written after a sync reaching oxide while content written before the sync is still in a track buffer, then database corruption can occur.

 

 

So, really, Microsoft is still in the right on this one. 

 

 

But then again, this is why I highly recommend SSDs for the system drive in Essentials. :)


  • 0

If you have a UPS then the "lie" is irrelevant, as the system will be protected: it will shut down before the battery is exhausted and flush the disks (assuming you set it up correctly). If it did not, you would run the risk of losing data every time you shut down.

 

So I'm not sure I understand the point you are making?

 

Happy New year by the way :) 


  • 0

UPSs help, but power supplies still fail, and UPS units go bad as well. In a home lab/etc situation that's one thing, but domain controllers are rarely in that setting. I think it's an understandable decision to force write caching off for the disk holding AD schema. A corrupt domain can be a nightmare.

 

 

This.

 

And Microsoft's concern is enterprise, first and foremost. 

