  • 0

Deduplication + DrivePool = NO GO, since Windows 16299 (1709, Fall Creators)


t-s

Question

Hi

I've used both MS deduplication and DrivePool since the beginning.

I never had any major problem with them, aside from some calculation glitches with older DP releases.

Now I'm in trouble.

Windows deduplication has changed a bit since build 16299 or so.

A disk previously deduplicated with Win10 or Win Server 2016, after being touched by a recent W10/Server 1803/1809/Server 2019, will be silently upgraded, and the deduped files become inaccessible to older versions. So downgrading or dual booting is not an option.

DrivePool (I tested even the latest beta) will fight with the dedup filter, and while DP itself still works, the deduplication does not.

Before you ask, yes, the option to skip the FS filter is correctly set, as usual, in my case.


This is what I get from PowerShell with a simple

Get-DedupStatus  (after DP installation and a reboot)

Quote

Get-DedupStatus : A generic error occurred that is not covered by a more specific error code.
At line:1 char:1
+ Get-DedupStatus
+ ~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (MSFT_DedupVolumeStatus:ROOT/Microsoft/...dupVolumeStatus) [Get-DedupStatus], CimException
    + FullyQualifiedErrorId : HRESULT 0x80010108,Get-DedupStatus

This is what I see in the deduplication section of the event log

Quote

Data Deduplication error: unexpected error.

Operation:
   Starting the File Server Deduplication service.

Error details:
   Error: DeviceIoControl (FSCTL_GET_NTFS_VOLUME_DATA), 0x80070001, Incorrect function.


So please fix DP asap to not ruin the good reputation of your product (and also to make my life easier :D )


P.S. FYI Drive Bender 2800 has the exact same behavior


17 answers to this question


  • 0

I am a bit confused as to what your I/O stack looks like, can you draw a little map in paint or something so we know what is going on?

Also a bit confused as to why you're using MS deduplication instead of the duplication from DrivePool. One of the main benefits of DrivePool is the duplication feature, which eliminates problems such as the exact one you're having.


  • 0
36 minutes ago, zeroibis said:

I am a bit confused as to what your I/O stack looks like, can you draw a little map in paint or something so we know what is going on?

 

Map in paint? O_o

They are two disks deduplicated with MS deduplication and made resilient by DrivePool.

Quote

Also a bit confused as to why you're using MS deduplication instead of the duplication from DrivePool.

I understand that inexperienced people get confused by the similarity of the terms duplication and deduplication, which are used by very different software.

It sounds like a joke, but it isn't!!!

 

MS deduplication can halve the storage usage, or even cut it to a tenth....

Think of an office with 10 client PCs, each started with the same installation.

Their VHDXs could be 40 GB each, so 400 GB in theory. In practice 90% of the data is the same for each PC, so your 400 GB can be stored in just 80 GB or so.

This is what deduplication does.

DrivePool does something very different. It takes your 400 GB and duplicates them across two or more drives (for resiliency).

Using deduplication + duplication makes the storage both efficient and resilient: you will end up (according to the above example) with 80x2 GB of data instead of the basic situation of 400x1 GB.

A huge win/win. Great security, great efficiency. 
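
To put numbers on the office example, here is a minimal Python sketch of the arithmetic. The function name and its parameters are made up for illustration, and it assumes the simplest possible model: the shared part of each image is stored once, the unique part is stored per client, and duplication multiplies the result by the number of copies.

```python
def pooled_footprint_gb(clients, image_gb, shared_fraction, copies):
    """Rough on-disk estimate: deduplicate first, then duplicate across drives.

    Hypothetical helper for illustration only; real savings depend on the data.
    """
    raw = clients * image_gb                              # no dedup, single copy
    shared = image_gb * shared_fraction                   # common data, stored once
    unique = clients * image_gb * (1 - shared_fraction)   # per-client differences
    deduped = shared + unique
    return raw, deduped, deduped * copies


raw, deduped, pooled = pooled_footprint_gb(
    clients=10, image_gb=40, shared_fraction=0.9, copies=2
)
print(f"raw: {raw:.0f} GB, deduplicated: {deduped:.0f} GB, duplicated x2: {pooled:.0f} GB")
```

Under these assumptions the deduplicated size comes out at about 76 GB, which matches the "80 GB or so" above, and the duplicated pool at about 152 GB instead of 400 GB.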

 

In short duplication means  x2

 

deduplication means :2  to :10 or more, combine both of them and you will be the happiest IT user in the world :)
 


  • 0

Ah I understand and yes I had never heard of deduplication before and assumed it was a block level duplication service via storage spaces. I understand now that it is a form of compression. I would presume that the decompression would have a decent performance penalty but it sounds pretty great if you do not need a lot of read performance on old random files.

So you have two disks each with deduplication and then both of these same disks added to the same drive pool with real time duplication enabled. 

I am guessing that MS has changed some APIs with regard to the way you access data from the volume, which is affecting DrivePool's ability to maintain duplicates.

A workaround could be to temporarily disable real-time duplication and instead have data flow to one drive and then to the second as an archive.


  • 0

It's indeed a form of block deduplication, but it is built on top of NTFS, not under it, just like DP.

And no, it's not strictly a form of compression.

MS deduplication does also compress the files, but that's an additional and optional feature.

Any file is written to the disk in chunks. With deduplication enabled, chunks that two or more files have in common are stored once and referenced multiple times, so there isn't any major overhead. At least if you don't enable the compression.

But even with the compression enabled, performance is more than good enough for a storage disk.
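
The chunk-sharing idea can be illustrated with a toy Python sketch. This is not how Microsoft's implementation works internally: the `ChunkStore` class, the fixed 4-byte chunk size and the SHA-256 keys are all assumptions chosen to keep the example small. It only shows the principle that identical chunks are stored once and files become lists of references.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose, purely for illustration


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept exactly once."""

    def __init__(self):
        self.chunks = {}  # chunk hash -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk hashes

    def put(self, name, data):
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if not seen before
            refs.append(digest)
        self.files[name] = refs

    def get(self, name):
        # Rebuild the file by following its chunk references in order.
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
store.put("a.img", b"AAAABBBBCCCC")
store.put("b.img", b"AAAABBBBDDDD")  # shares its first two chunks with a.img
print(store.stored_bytes(), "bytes stored for 24 logical bytes")
```

Reading a file back is just a lookup per chunk, which is why sharing chunks by itself adds little overhead on reads.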

 

Quote

A workaround could be to temporarily disable real-time duplication and instead have data flow to one drive and then to the second as an archive.

No. Good suggestion, but it doesn't work.

After a day of digging I realized that taking the pool disk(s) offline via Disk Management (but obviously not their members) is enough to get the deduplication services working again.

Probably the problem lies in the new dedup.sys (or whatever dedup component), which tries to enumerate the pool disk as a real disk and fails miserably.

Very likely, strictly speaking, the fault is MS's, not Covecube's, but I'm afraid that MS couldn't care less about that problem.

As usual MS creates a problem and it's up to third parties to deal with it.

The good news is that no data corruption happens, it's just a major annoyance.


  • 0

Interesting, also thanks for updating with the solution you found.

Is there any material you would recommend to read up on the deduplication service? It appears this is a great solution if you have a lot of files that are only accessed but not often modified, such as large sets of videos and photos. I would still imagine that there is a heavy processing and memory cost to this that my present system likely can not pay, but it is something to read up on and look into for future upgrades.


  • 0
46 minutes ago, zeroibis said:

 

Quote

Is there any material you would recommend to read up on the deduplication service?

No, I don't have any specific source. There are a number of posts on Microsoft blogs that you will easily find with a simple Google search.

 

Quote

It appears this is a great solution if you have a lot of files that are only accessed but not often modified, such as large sets of videos and photos.

No, photos and videos are absolutely the worst candidates: they are already compressed and they lack any similarity. Maybe uncompressed RAW photos are worth deduplicating.

The best candidates are ISOs, VHDs, VMDKs, and work folders with, say, multiple revisions of Android ROMs.

Think of the Win Server 2019 ISO and the Win Server 2019 Essentials ISO: they are 4.4 GB + 4.17 GB. On a standard disk they will take 8.57 GB; on a deduplicated disk they will take 4.8 GB or so.
Add 2GB of Hyper-V server 2019 and you will use 33MB (yes, megabyte) more...
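
As a quick check of what those figures mean in terms of savings, here is a small Python sketch. The `savings` helper is hypothetical, and the inputs are just the approximate sizes quoted above (with 33 MB taken as 0.033 GB), not measured values.

```python
def savings(logical_gb, stored_gb):
    """Return (GB saved, fraction saved) for a deduplicated volume."""
    saved = logical_gb - stored_gb
    return saved, saved / logical_gb


# Two Server 2019 ISOs: 4.4 GB + 4.17 GB logical, about 4.8 GB on disk.
logical = 4.4 + 4.17
saved, rate = savings(logical, 4.8)
print(f"{logical:.2f} GB logical, {saved:.2f} GB saved ({rate:.0%})")

# Adding a 2 GB Hyper-V Server ISO that costs only about 33 MB more on disk.
logical3 = logical + 2.0
saved3, rate3 = savings(logical3, 4.8 + 0.033)
print(f"{logical3:.2f} GB logical, {saved3:.2f} GB saved ({rate3:.0%})")
```

The point of the second call is that the savings rate keeps climbing as more similar images are added, because almost all of the new data is already on the disk.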

 

Quote

I would still imagine that there is a heavy processing and memory cost to this that my present system likely can not pay but it is something to read up on and look into for future upgrades.

The heavy part is the first run. After that, only new or modified files are deduped (every couple of hours or so) in the background, which is nothing that a decent PC, even one 5 to 7 years old, can't handle.

Native ZFS deduplication is very efficient (it's done natively and in real time by the filesystem) but very memory hungry (about 8 GB of RAM per TB of deduplicated storage is needed). The MS flavour isn't as sophisticated, but it's also not that demanding: less than 4 GB of RAM is enough to deduplicate a couple of TB of data.

 


  • 0

Are you trying to enable deduplication on the Pool itself, or the underlying disks?

 

Also, it may be hard to test this, as the 2019 ISOs aren't back up yet, I believe.

Also, make sure that the "bypass file system filters" option in StableBit DrivePool is disabled.  If it's enabled, it will break deduplication.


  • 0
Quote

Are you trying to enable deduplication on the Pool itself, or the underlying disks?

Obviously the underlying disks. Like I said, I've used both deduplication and DP since the days of Win8/Server 2012, so I know how to deal with it. I'm anything but a newbie in that matter.

Quote

Also, make sure that the "bypass file system filters" option in StableBit DrivePool is disabled.  If it's enabled, it will break deduplication

See the previous reply. (anyway, as you know, DP has that option disabled by default on recent releases.)

Quote

Also, it may be hard to test this, as the 2019 ISOs aren't back up yet, I believe.

2019 is already available, there are the 3 months evaluation ISOs, but you can also install the deduplication cabs on any W10 x64 1809 (obviously same story as server 2019 and Server 1809, tested personally)

P.S. I know that resources are always limited, but MS releases the prerelease builds for a reason. My suggestion is to test your products on them, to be ready when a new "final" release is launched.

P.S.2 WSL has also been vastly reworked since the days of Server 2016 (where it was installable only unofficially). I know it uses some kind of FS filters as well, and I smell problems from that direction too (I had a reboot during the Linux update, but since then I haven't had much time to investigate further).
 

Edited by Christopher (Drashna)
removed link

  • 0
21 hours ago, t-s said:

Obviously the underlying disks. Like I said, I've used both deduplication and DP since the days of Win8/Server 2012, so I know how to deal with it. I'm anything but a newbie in that matter.

Okay, I just wanted to make sure! :)

21 hours ago, t-s said:

2019 is already available, there are the 3 months evaluation ISOs, but you can also install the deduplication cabs on any W10 x64 1809 (obviously same story as server 2019 and Server 1809, tested personally)

Well, Microsoft pulled them because of the profile purge issue.  I don't see the ISO, at all. 

And yeah, I know about the hack. 

21 hours ago, t-s said:

P.S. I know that resources are always limited, but MS releases the prerelease builds for a reason. My suggestion is to test your products on them, to be ready when a new "final" release is launched.

Yup, we plan on it.  We're just waiting for Microsoft to re-release the ISOs so we can do testing. 

21 hours ago, t-s said:

P.S.2 WSL has also been vastly reworked since the days of Server 2016 (where it was installable only unofficially). I know it uses some kind of FS filters as well, and I smell problems from that direction too (I had a reboot during the Linux update, but since then I haven't had much time to investigate further).

Ah, yeah, I can see that being an issue, potentially. 

 


  • 0
7 minutes ago, Christopher (Drashna) said:

Well, Microsoft pulled them because of the profile purge issue.  I don't see the ISO, at all.

The purge issue is already fixed by the update KB4464330 (which moves the build from 17763.1 to 17763.55). Obviously it is the same for Win Server 2019, Win 10 1809 and Win Server 1809.

So the EVAL ISOs (which are still there) can be safely downloaded and upgraded (especially for a test machine where you have nothing to lose).

Eval Standard/Datacenter (English US)

https://software-download.microsoft.com/download/pr/17763.1.180914-1434.rs5_release_SERVER_EVAL_X64FRE_EN-US.ISO

And the mentioned cumulative update

http://download.windowsupdate.com/d/msdownload/update/software/secu/2018/10/windows10.0-kb4464330-x64_e459c6bc38265737fe126d589993c325125dd35a.msu

 

Quote

And yeah, I know about the hack. 

Well, calling it a "hack" may be misleading. They are the official signed packages that are missing from W10, Server Essentials and the (free) Hypercore (for the record, both DP and deduplication worked well on Hypercore 2012/2016; I used it to provide the [de]duplication to WHS).
 

Quote

We're just waiting for Microsoft to re-release the ISOs so we can do testing.  

No need to wait anymore. ;)
 


  • 0

Unfortunately, the eval center still doesn't give a link to the ISO.

And while the ISO may still be at the link you've posted, it's not a publicly available one.

On 10/18/2018 at 11:13 AM, t-s said:

Well, calling it a "hack" may be misleading. They are the official signed packages that are missing from W10, Server Essentials and the (free) Hypercore (for the record, both DP and deduplication worked well on Hypercore 2012/2016; I used it to provide the [de]duplication to WHS).

Disagree. 

Running it on any system that it is not already included with, or LICENSED for, is a violation of the licensing, to start with.  Not to mention, it's not designed for nor tested on these other configurations, and may cause issues.

Additionally, for the most part, people post links to the files, which are copyrighted, and may be a copyright violation, as well.  

So, yeah "hack" is very accurate here. 

 


  • 0
45 minutes ago, Christopher (Drashna) said:

Unfortunately, the eval center still doesn't give a link to the ISO. And while the ISO may still be at the link you've posted, it's not a publicly available one.

 

So? Are you really saying that after a user discovered a bug that makes your product unusable (for the average user), on an already released product, you refuse to test it just because a publicly available link was removed from a single MS page? O_o

That's masochism.

I spent hours trying to bisect the problem (and I shared my findings), I spent more time collecting the links you need to safely test YOUR product, and you just want to sit and wait for an MS webmaster to re-add that link to that page?

I'm sorry but that sounds a bit incredible to me.

Quote

Running it on any system that it is not already included with, or LICENSED for, is a violation of the licensing, to start with.  Not to mention, it's not designed for nor tested on these other configurations, and may cause issues.

Taking any rule the way a Jehovah's Witness takes the Bible has never been a good idea.

W10 and Win Server are the same OS (assuming we are talking about the same build, and we are). A few things are allowed or disallowed by kernel policies (e.g. the maximum RAM available, or the ability to launch DCpromo.exe), and some things are just added or removed packages (just like the dedup packages). In this specific case there is absolutely no difference between Win Server 2019 and W10 1809.
There isn't a single hacked/changed bit, there isn't any circumvention of a software protection, there isn't any safety risk.


From a legal point you are right, no one would theoretically be allowed to use even Notepad.exe taken from win 7 on win 8, but you really should be a bit flexible on that (just like MS is since the DOS days).

That's not the case here (given the server ISO is there), but assuming you were going to test your product on W10 + the "hacked" packages, what would MS's position on that be?

Do you think they would feel you're "stealing" something? Or maybe they would consider it something meant to improve the OS's usefulness and the user experience?


  • 0

Just to add something relevant...

I tested the same setup on Windows Server and Win 10 1709, and the problem is already present.

As I wrote in my first post, deduplication changed with the (publicly released) version 1709, so it was easy to guess that the problem with DrivePool started there.

I hadn't used those versions, so I couldn't confirm the problem in my first post.

Now I've spent some time installing Windows Server 1709 on my test machine, and I can confirm the problem.

So there isn't any excuse anymore: DP on deduplicated disks has been broken for more than a year!!!

I guess that no one noticed because Server 1709 (and 1803) come only in GUI-less flavours, and very few SOHO users relied on them.

 

 


  • 0
On 10/19/2018 at 2:02 PM, t-s said:

So? Are you really saying that after a user discovered a bug that makes your product unusable (for the average user), on an already released product, you refuse to test it just because a publicly available link was removed from a single MS page? O_o

No. We're saying that we don't want to test it on unreleased, non-final code.  

If you haven't followed the news, there have been several other bugs discovered that have delayed release of the code, which MAY NOT have been patched yet.  

So testing on unreleased, non-final code is true masochism. 

On 10/19/2018 at 2:02 PM, t-s said:

Taking any rule the way a Jehovah's Witness takes the Bible has never been a good idea.

W10 and Win Server are the same OS (assuming we are talking about the same build, and we are). A few things are allowed or disallowed by kernel policies (e.g. the maximum RAM available, or the ability to launch DCpromo.exe), and some things are just added or removed packages (just like the dedup packages). In this specific case there is absolutely no difference between Win Server 2019 and W10 1809.
 There isn't a single hacked/changed bit, there isn't any circumvention of a software protection, there isn't any safety risk.

Licensing matters. Period. End of discussion. 

You can violate the license all you want.  We won't, nor can we risk doing so.  Period. End of discussion. 

And if you don't think that Microsoft Lawyers will argue "word of the law".... that's naive.  And comparing law to religious text is a bad idea, for a LOT of reasons.  And I shouldn't have to point that out. 

On 10/26/2018 at 1:54 PM, t-s said:

I tested the same setup on Windows Server and Win 10 1709, and the problem is already present.

As I wrote in my first post, deduplication changed with the (publicly released) version 1709, so it was easy to guess that the problem with DrivePool started there.

I hadn't used those versions, so I couldn't confirm the problem in my first post.

Now I've spent some time installing Windows Server 1709 on my test machine, and I can confirm the problem.

So there isn't any excuse anymore: DP on deduplicated disks has been broken for more than a year!!!

I guess that no one noticed because Server 1709 (and 1803) come only in GUI-less flavours, and very few SOHO users relied on them.

I've tested it myself, several times, and dedup has worked up to 1803.  So if it's not working now, it's likely due to a patch that changed the functionality.

 

And I know that we have a number of users that are actively using deduplication without issues.  So, to say that it's been broken for more than a year would appear to be factually incorrect.

That said, I'm locking this thread, as it's stopped being helpful/productive/etc. 


This topic is now closed to further replies.