
Unduplicated files with Pool file duplication


Umfriend

Question

Hi,

 

So I had to rearrange storage a bit and currently it is like this:

2 x 2TB HDDs

1 x 4TB HDD, partitioned as 2 x 2TB volumes.

Pool file duplication x2 (no folder duplication)

DrivePool version 2.1.1.561, only the default balancers and all at default values.

OS is WHS 2011

 

Statistics show 17.8MB as Unduplicated. I do not understand why any data should not be duplicated.

 

I would have thought everything would be duplicated, and that I could be assured that any file would be stored on at least one of the two 2TB HDDs (as a file should not be duplicated only on the 2 x 2TB volumes of the single 4TB HDD).

 

I have tried re-measuring.


  • 0

Try installing the latest beta build.

http://dl.covecube.com/DrivePoolWindows/beta/download/StableBit.DrivePool_2.2.0.659_x64_BETA.exe

 

 

This may resolve the issue outright. If not, try remeasuring the pool on the new version to see if that helps.

 

If not, well, the reason I linked the specific build is that we've added some command line based auditing tools to the newest build. This may help identify the files in question:

http://community.covecube.com/index.php?/topic/1587-check-pool-fileparts/

 

But chances are, the problem files may be in the "System Volume Information" folder in the root of the drive. Deleting that folder may fix the issue (you'll need to enable "Show hidden files", and disable the "hide protected operating system files" options; take ownership of the folder and all child entries, and change the permissions, before you can delete it).

 

 

Also, opening the "Folder Duplication" option will enumerate the folder contents in real-time, which may help to identify where this is happening.


  • 0

But then I thought **** it!, these guys don't put out betas that easily and for now, yep, unduplicated is gone. But now something else seems weird to me:

1. The two 2TB HDDs (E:\ and F:\) show 985GB and 1004GB duplicated (1,989GB)

2. The two 2TB volumes (G:\ and H:\) on the 4TB HDD show 1,016GB and 1,018GB duplicated (2,034GB).

 

This seems odd to me as the 4TB HDD should, at most, hold only one copy of any file. It could hold zero, as both copies could be on E:\ and F:\. So I would expect the sum of duplicated data on E:\ and F:\ to be equal to or greater than that on G:\ and H:\.

 

Ran dpcmd (nice tool, although the output structure might be improved somewhat, as others have indicated, for instance so that it can be imported into Excel or a database) and it showed no errors, so that is nice, but still...


  • 0

Well, where the duplicate files are located is rather complicated.  Or I think the best phrasing is that it's a "shotgun pattern".

 

Each disk may not have an equal amount, as files may be placed on E: and G:, instead of G: and H:.  And it also depends on the balancing settings, as well.

 

That, and something else to keep in mind: when placing duplicate files, the software will NOT use the same physical disk for the duplicates, as that doesn't make sense from a (re)liability standpoint.  So that could definitely cause the discrepancy. 

 

 

As for the output, yeah, it could be better. But Alex quickly whipped it up.  I'm sure he will refine it later, or .... something. :)

But it's something, and it's fairly useful as is (depending on what detail level you use.... 2 or 3 for you).


  • 0

But that is my point, unless I am missing the obvious: one file should never be stored on G:\ _and_ on H:\. A file _may_ be stored on E:\ and F:\. So, total data on E:\ and F:\ should be greater than or equal to that on G:\ and H:\ (because if one duplicate of a file is stored on either G or H then the other duplicate MUST be stored on E or F). So, I am looking at E+F (2 HDDs) compared to G+H (2 partitions on ONE HDD).
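To put a rough number on that argument, here is a minimal sketch. It assumes pool-wide x2 duplication (every file has exactly two copies) and uses the duplicated figures quoted earlier in the thread. The function name is made up for illustration and is not part of DrivePool. If `a` is the data with both copies on E/F, `b` the data split between the two groups, and `c` the data with both copies on G/H, then E+F = 2a + b and G+H = b + 2c, so c is at least (G+H - (E+F)) / 2.

```python
# Duplicated-data figures in GB, as reported by the DrivePool UI in this thread.
duplicated_gb = {"E": 985, "F": 1004, "G": 1016, "H": 1018}

# Assumption: E and F are separate physical disks, while G and H are
# two volumes on the same physical 4TB disk.
separate_disks = duplicated_gb["E"] + duplicated_gb["F"]
shared_disk = duplicated_gb["G"] + duplicated_gb["H"]


def min_same_disk_pairs_gb(separate: float, shared: float) -> float:
    """Lower bound (GB of unique data) on files with BOTH copies on the shared disk.

    With x2 duplication: separate = 2a + b and shared = b + 2c,
    hence c >= (shared - separate) / 2 because a >= 0.
    """
    return max(0.0, (shared - separate) / 2)


if __name__ == "__main__":
    bound = min_same_disk_pairs_gb(separate_disks, shared_disk)
    print(f"E+F = {separate_disks} GB, G+H = {shared_disk} GB")
    print(f"At least {bound} GB of data has both copies on the 4TB disk")
```

Any result above zero means the "one copy on E or F" expectation cannot hold for every file. The bound is only as precise as the rounded UI figures.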


  • 0

Ah, okay. Sorry, I missed that.  And you're right, the duplicated data on the E:\ and F:\ drives combined should always be at least as much as on the G:\ and H:\ drives.

 

And that it's not is ... unusual.

 

If you want, run dpcmd, "pipe" the output to a text file, and upload that to me.

Use the form at the bottom of the linked page:

http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection


  • 0

I'm not sure why I didn't think of this .....

 

Are all of your drives formatted the same way? Eg, using the same allocation unit size? 

 

And from a quick look, it doesn't look like there are any issues.... aside from one storageconfig xml file, but that's normal. 

 

I've flagged for Alex, anyways, just in case.

https://stablebit.com/Admin/IssueAnalysis/22889


  • 0

OK, I am sorry but I think I have a real issue/problem/bug here. The short of it is, duplication does not appear to guarantee storing duplicates on different physical HDDs. I am sure I am doing something terribly wrong but I would really like some help here. Basically, my current Server Backups are worthless.

 

So let's start with the first picture. It is Diskmgmt and it shows Disk 1 (E:\), Disk 2 (F:\) and Disk 5 (G:\ and H:\). The first two are 2TB HDDs, the latter is a single 4TB HDD split into two even partitions (give or take 10MB at most).

 

The second pic shows DP. These four (E:\ to H:\) comprise the Pool, it has x2 duplication. Already you see the duplicated bars on G:\ and H:\ being somewhat longer than on E:\ and F:\.

 

So I decided, finally, to look at the properties of the volumes and found:

HDD  Volume  Files
1    E:\     145,255
2    F:\     112,656
3    G:\      18,484
3    H:\      37,137

 

That can't be right.

 

The remaining pictures show what Explorer shows for Client Backups, which is part of the Pool, in each of the four PoolPart folders. Look at the file Commit.dat. It is on G:\ and H:\ and that is the same physical HDD. Same for Data.4096.1.dat, and for numbers 3 and 4 as well.

 

Also, dpcmd shows this for Commit.dat:

    - [2x/2x] P:\ServerFolders\Client Computer Backups\Commit.dat (5.50 KB - 5,632 B)
      -> \Device\HarddiskVolume10\PoolPart.ac08c794-b981-42eb-b05b-3994ade7f64b\ServerFolders\Client Computer Backups\Commit.dat [Device 5]
      -> \Device\HarddiskVolume11\PoolPart.d603fe42-65b1-4dae-89b0-ff9e12b2293c\ServerFolders\Client Computer Backups\Commit.dat [Device 5]
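A check like this can be automated over a saved dpcmd log. The following is only a sketch: it assumes every file entry looks like the three lines quoted above (a `- [Nx/Nx] path (size)` line followed by `->` lines ending in `[Device N]`), which may not hold for other detail levels or builds. The function name is hypothetical and this is not a dpcmd feature.

```python
import re
from collections import defaultdict

# Patterns inferred only from the excerpt quoted in this thread (an assumption).
FILE_LINE = re.compile(r"^\s*-\s+\[\d+x/\d+x\]\s+(?P<path>.+)\s+\([^()]*\)\s*$")
PART_LINE = re.compile(r"^\s*->\s+(?P<part>.+?)\s+\[Device (?P<device>\d+)\]\s*$")

SAMPLE = r"""
    - [2x/2x] P:\ServerFolders\Client Computer Backups\Commit.dat (5.50 KB - 5,632 B)
      -> \Device\HarddiskVolume10\PoolPart.ac08c794-b981-42eb-b05b-3994ade7f64b\ServerFolders\Client Computer Backups\Commit.dat [Device 5]
      -> \Device\HarddiskVolume11\PoolPart.d603fe42-65b1-4dae-89b0-ff9e12b2293c\ServerFolders\Client Computer Backups\Commit.dat [Device 5]
"""


def same_device_duplicates(lines):
    """Return {pool_path: device} for files whose copies all sit on one device."""
    devices = defaultdict(list)
    current = None
    for line in lines:
        file_match = FILE_LINE.match(line)
        if file_match:
            current = file_match.group("path")
            continue
        part_match = PART_LINE.match(line)
        if part_match and current is not None:
            devices[current].append(int(part_match.group("device")))
    # Flag only files with two or more copies that share a single device number.
    return {
        path: devs[0]
        for path, devs in devices.items()
        if len(devs) > 1 and len(set(devs)) == 1
    }


if __name__ == "__main__":
    for path, device in same_device_duplicates(SAMPLE.splitlines()).items():
        print(f"All copies on Device {device}: {path}")
```

To run it against a real log, replace `SAMPLE.splitlines()` with the lines of the text file that the dpcmd output was redirected to.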

 

I have no File Placement Rules, the default balancers with default settings, and pool organisation is 100%. I am pretty sure that when I changed from a 2x2x2TB setup (2 pools of 2x2TB HDDs) to a 4TB+2x2TB setup I had duplication off and turned it on again during the transition. This transition basically consisted of adding the 4TB HDD to the Pool that consisted of E:\ and F:\ and then moving the Client Backups from the other Pool to this one. It may have been the case that, in order to duplicate these during the transfer, DP would have had to move existing files from E:\ and F:\ to G:\ and H:\ to make room, but apparently that did not happen.

 

HELP! How can I fix this? I _need_ at least one duplicate of each file to be on either E:\ or F:\ (and expected default behaviour to do that for me).

 

I am running WHS2011 and DP 2.2.0.659 BETA.

post-1414-0-18476000-1454489404_thumb.png

post-1414-0-55973200-1454489600_thumb.png

post-1414-0-57856700-1454489932_thumb.png

post-1414-0-85961600-1454489938_thumb.png

post-1414-0-40024100-1454489945_thumb.png

post-1414-0-74244400-1454489955_thumb.png


  • 0

To make sure, is the "Duplication Space Optimizer" balancer enabled?

It should be by default, but just to double check.

 

 

Could you grab the logs from the system? 

http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection

Just steps #7-9

 

And could you run "dpcmd check-pool-fileparts "X:\ServerFolders\Client Computer Backups\" 4 > log.txt" and upload that as well? Either using the above form or via https://stablebit.com/Contact

 

And to make sure, is the pool organization bar at the bottom at "100%" (e.g., completely full)?


  • 0

- Duplication Space Optimiser (DSO) is enabled, fifth in list (Scanner, Volume Equalisation, Drive Usage Limiter, Prevent Drive Overfill come first).

- Pool is at 100% (green, complete width, not sure why it does not also give a percentage, I like numbers better than graphicals).

 

Service logs are uploaded and the dpcmd output from yesterday is in the same zip file. One of the issues I now have with dpcmd is that even when x2 duplicates exist, it should (IMHO) flag duplicates that are stored on the same device as an error. I would like to have it confirmed that it will do that in a future release.

 

I have moved DSO to 2nd place and started a remeasure, just to see if it will do anything, but I doubt it. Will let you know.


  • 0

Nothing changed by the move of DSO...I am really a bit stuck with this. Not only will I actually lose data if HDD 5 fails, my Server Backups do not include that HDD either so it would be really gone!

 

I can think of various things to do but, well, without a current backup I'd rather not try just anything.


  • 0

Worst case, you can manually move the files from one Poolpart folder to another.

 

But you shouldn't need to do this, this should be handled by our software, automatically.

 

 

 

I've bumped the issue to "critical", as this definitely affects data integrity. 

 

 

 

In the meantime, just in case it helps (I'm not sure it will, but ...), in the balancing settings, move the slider to "100%" and check the "or if X amount needs to be moved" option (not the exact wording, but the checkbox under the slider), and set the value to 1GB.

 

 

Something else that may "help" is remeasuring the pool (which I'm pretty sure you've done already), or maybe enabling x3 duplication and disabling it (but I'm not sure this will work). 


  • 0

Well the 100%/at least 1GB trick did not do anything. Remeasuring I have tried.

 

I could do x3 duplication (or x1 duplication, which I am pretty sure I did in early January). I have considered adding 2x2TB HDDs to the Pool, removing the 4TB HDD, re-adding it and then removing the 2x2TB etc.

 

However, even if the result may look better, there is no certainty now that the integrity of duplication is guaranteed. I would like a version of dpcmd check-pool-fileparts where duplicates stored on the same device are flagged as an error as well. And hopefully you guys find a bug that can be solved, so that we at least understand how what I see can happen.

 

Meanwhile, I am now backing up all Pool volumes so basically twice the files I want to backup.. :( Can't risk data loss.

 

Edit: If you look at the earlier DP picture (2nd picture of post #12), there is a dark blue pointer/triangle at the beginning of volume H:\. The text says: "Un-duplicated target for re-balancing (14.3 GB)". I find it a cryptic message but I am sure you understand what it means.


  • 0

Okay, I didn't think it would, but better to try and fail than not to try.

 

As for the x3 duplication, this would solve the issue (by brute force) by essentially forcing a copy to be on a different physical disk. But that's not a good solution. We need/want to fix the issue, not work around it. 

 

As for the files, I saw a bunch of files that were doubled on "device 5", so that's definitely not good. 

 

 

 

And the issue is marked as critical, so Alex will/should get to it soon.  However, I know he's doing a lot of backend stuff right now, but I'll message him directly about this, just to expedite it.


  • 0

I am just wondering, suppose I seeded the G:\ and H:\ poolpart folders by copying duplicates to two volumes on the same HDD. Should DP then see that the file is duplicated but on the same device and correct this? I hope so but I can imagine this to be a bit out of spec.

 

To be sure, I _never_ seed poolparts, I always go through the regular Pool (exactly because I want to be sure the files are handled the way I think DP does/should).


  • 0

I believe so, but I'd have to double check with Alex. 

 

And as long as you remeasure the pool after moving files around inside the PoolPart folders, you should be okay. But yes, it's best to use the Pool drive.

 

In this case, manually moving files around may be recommended...

 

However, I'm not sure why I didn't think of it before, but you *could* use the "Drive Usage Limiter" balancer to forcibly move the files off of one of the volumes, and then let it rebalance normally.


  • 0

Yes, that is probably another way of doing this, but then how would I be certain the issue never occurs again? Moreover, moving files like this will create havoc with my Server Backup (I do not think it de-duplicates across volumes), and I think I would prefer to have a definite diagnosis and solution first, implement it once and then initialise backups anew. Somehow I have this feeling it may be an issue that was introduced with the File Placement Rules functionality (but then, what do I know, I am biased here). It may also not be a bug but rather according to spec, which I would really like to know as well.

 

So I guess I will await Alex' analysis.


  • 0

Ah, yeah, I didn't think of that. Yes, it would definitely affect the Server Backup, and no, it definitely doesn't support deduplication. So enabling duplication would drastically increase the amount of disk space used/needed. 

 

 

 

And you are using File Placement Rules?

If so, that *could* contribute to the issue.
How do you have these set up, specifically (if you do).  


  • 0
I just wanted to let you know that Alex is actively looking into the issue.  

 

He's updated the specific ticket with information (privately), and plans on looking into the issue and preventing it from happening in the future.

 

I can't really give a timeline on this, especially as this is a pretty serious issue, and will likely require a serious in depth audit of our code to fix. Both the driver and the service. So that's ... a lot of code to go through. 

 

But this is definitely a top priority as it's a nasty bug.


  • 0

Can I see what he writes (as it is on _my_ ticket ;-) )?

 

Anyway, I do hope it is not something silly I did that is causing you guys to work on this. Thanks for the update. I am not changing anything so if I can provide more info from the system or similar just let me know.

 

Edit: But yeah, the behaviour certainly, to me, seems out-of-spec, right?
