
Unduplicated files with Pool file duplication


Umfriend

Question

Hi,

 

So I had to rearrange storage a bit and currently it is like this:

2 x 2TB HDDs

1 x 4TB HDD, partitioned as 2 x 2TB volumes.

Pool file duplication x2 (no folder duplication)

DrivePool version 2.1.1.561, only the default balancers and all at default values.

OS is WHS 2011

 

Statistics show 17.8MB as Unduplicated. I do not understand why any data should not be duplicated.

 

I would have thought all would be duplicated and that I could be assured that any file would at least be stored on one of the two 2TB HDDs (as it should not be duplicated only on the 2 x 2TB volumes of the single 4TB HDD).
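
The expectation above can be checked with a short sketch. It assumes the layout from this post (two 2TB HDDs plus one 4TB HDD split into two volumes) and assumes that x2 duplication only ever pairs volumes on different physical disks, which is the expected behaviour and not something verified against DrivePool itself. The volume and disk names are made up for illustration.

```python
from itertools import combinations

# Hypothetical names for the layout described in this post:
# each pooled volume mapped to the physical disk it lives on.
VOLUME_TO_DISK = {
    "2TB-A": "HDD1",
    "2TB-B": "HDD2",
    "4TB-part1": "HDD3",
    "4TB-part2": "HDD3",
}


def valid_pairs(volume_to_disk):
    """Return every pair of volumes that sit on different physical disks."""
    return [
        (a, b)
        for a, b in combinations(sorted(volume_to_disk), 2)
        if volume_to_disk[a] != volume_to_disk[b]
    ]


def every_pair_includes(pairs, volume_to_disk, required_disks):
    """True if each pair has at least one volume on one of required_disks."""
    return all(
        volume_to_disk[a] in required_disks or volume_to_disk[b] in required_disks
        for a, b in pairs
    )


pairs = valid_pairs(VOLUME_TO_DISK)
print(pairs)
print(every_pair_includes(pairs, VOLUME_TO_DISK, {"HDD1", "HDD2"}))
```

Under those assumptions, the only pair that avoids both 2TB HDDs is the pair of 4TB partitions, and that pair is excluded because both volumes share one disk.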

 

I have tried re-measuring.


  • 0

It's not exactly "helpful" for anyone but us, as it's more of a checklist, right now. But will PM you.

 

 

As for "silly", well, it's an issue, as it's something that shouldn't be happening. It's something that absolutely needs to be addressed, because who knows how many others have experienced it without noticing.

 

And yes, it's definitely "out of spec".


  • 0

Uhm, I am not sure what happened. I incidentally logged into the Server (to change the backup HDD) and checked DP. All of a sudden it is doing a huge duplication pass. No idea what caused it. I will check the backup files like I did a few posts back when it is done.

 

Edit: I did not know this, but there is a pop-up when mousing over the duplication bar and it says: "Working on the entire Pool {CRLF} Now processing \ServerFolders\Client Computer Backups".

 

Edit2: So while it was duplicating I ran a dpcmd again and it reported over 14K Inconsistent Files. It almost seems as if DP discarded _all_ duplicates and started to re-duplicate. Mind you, this was not user-initiated. Is this a strategy that can be employed if it does not find a way to achieve (multi-HDD) duplication in a simple manner? For instance, perhaps DP saw both duplicates on the 4TB HDD but could not find the space to move both duplicates to the two 2TB HDDs (but did know that it should have enough space to place all)? What is really strange is that something triggered this after the Pool had been inconsistent for over a month...

 

Edit3: So once it was done (and I did do a Server reboot as it was taking ages to build a bucket list), the end result is that files are still duplicated on the 4TB HDD, even though quite a bit was moved around in the rebalancing. Moreover, even though Pool Organisation is 100%, there is now 179 KB of unduplicated data. And dpcmd (I save all the runs separately) does not show a [x1/x1] file and does not report duplication inconsistencies. Mind you, the reporting of Unduplicated was what started this thread.

 

Anything I can do or send over, let me know.

post-1414-0-74521500-1455283266_thumb.png


  • 0

Well, aside from the file-placement issue, I now again have a small amount of Unduplicated data while I have x2 duplication on everything...

 

Where can I find good documentation on those file-placement limiters? I have one for instance that says: "Un-duplicated target for re-balancing (168KB)" and another that says the same but for -168KB.


  • 0

Here:

http://stablebit.com/Support/DrivePool/2.X/Manual?Section=Disks%20List

 

And from the sounds of it, it looks like it wants to move it from one disk to another.

 

 

As for the measuring issue, this was a previously known issue that we do plan on looking into. However, given the other issues, Alex is planning on doing a full audit of the balancing, duplication and measuring code. This will be something that gets looked into at the same time.


  • 0

I have just finished two days of babysitting my WHS2011 DrivePool after the exact same issue happened.

 

Logged into dashboard for something else and looked at DrivePool (Beta *.659).  Saw the Free Space at top of pie chart was ~2x more than usual.  Bar at bottom was doing a duplication pass.  Entire DrivePool went to 1x and was redoing the duplication.

 

I've since done my own home grown "audit" of files (RoboCopy -list output only to backup) and saved "Dir" command text files (before / after issue).  No files missing or issues.
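
For anyone wanting to repeat this kind of before/after audit, here is a minimal sketch. It assumes each saved listing holds one full path per line (for example, what `dir /s /b` produces); the actual listing format used above is not shown in this thread, and the function names are hypothetical.

```python
def normalise(lines):
    """Strip whitespace, drop empty lines and lower-case paths
    (Windows paths are compared case-insensitively here)."""
    return {line.strip().lower() for line in lines if line.strip()}


def diff_listings(before_lines, after_lines):
    """Return (missing, added): paths only in 'before' and only in 'after'."""
    before = normalise(before_lines)
    after = normalise(after_lines)
    return sorted(before - after), sorted(after - before)


# Inline example data; real use would read the two saved text files instead,
# e.g. open("before.txt").readlines() and open("after.txt").readlines().
before = ["P:\\ServerFolders\\a.txt", "P:\\ServerFolders\\b.txt"]
after = ["P:\\ServerFolders\\a.txt", "P:\\ServerFolders\\c.txt"]
missing, added = diff_listings(before, after)
print("missing:", missing)
print("added:", added)
```

This only compares names, not contents or sizes, so it confirms that no file disappeared from the pool but says nothing about whether each file still has two copies.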

 

Made sure all the HDDs were OK by forcing a new Drive Scanner pass on all disks (thought maybe a bad file system popped a drive volume offline or something). Nothing like that came up.

 

Really starting to look into Synology NAS solutions. This is the 3rd time in approximately the last 6-8 months I've had to bring things back to normal after DrivePool lost its mind over duplications.

 

At the current time (before this issue, but after the last issue of going to 3x duplication on everything) I've had the entire pool at 2x (pool duplication) and one folder set to 1x (a subfolder of RecordedTV called "unduplicated", just to have enough room).

 

I'm chiming in to let you know you are not alone with this happening.  For me I believe it was around Feb 22 the duplication went to 1x and automatically (thankfully) redid itself to 2x.

 

I have also seen odd measurement results:

"Other" file space found on drives with no "Other" data on them. Talking about one volume with 2.9GB of "Other" that must be ADS data or possibly hidden DrivePool files. This volume has never had non-pooled files on it (that I put there). Thinking about removing that volume from the Pool, formatting the partition and re-adding it. Just haven't taken the time to yet.

 

I am very happy to hear the duplication, measuring, etc.. systems will get audited/looked into and am excited to hear about the progress with it.  I really like your software and the value it added to the Windows platforms.  Thank you for your continued work on it.


  • 0

Well, there have been a number of changes to the measuring code, so some of the issues there may be related to that.

I've seen these myself, and I absolutely understand. So does Alex. It's one of our higher priorities, but it's not a simple thing to do. It's a LOT of work that goes into it. 

 

As for the "spontaneous unduplicating" data, you've posted in another thread about this, but I'll repeat here: It should not be happening. Period. There are numerous checks done to make sure that the duplication status is always right.  Short of read errors on the pooled disks, it shouldn't happen.

But the next time this happens, use "Process Explorer" (from SysInternals) to pause the "DrivePool.Service.exe" process, and contact us immediately (this will essentially prevent the process from duplicating or unduplicating or from being able to access the UI). This should help us to catch the issue as it's happening, as we've been unable to reproduce this issue at all.

 

 

 

I am very happy to hear the duplication, measuring, etc.. systems will get audited/looked into and am excited to hear about the progress with it.  I really like your software and the value it added to the Windows platforms.  Thank you for your continued work on it.

 

Yeah, agreed, actually. I've seen the issues with the measuring myself, and hope to see this and the duplication issues fixed. Though, it's definitely going to be time consuming. 

 

And thank you for the kind words (even despite the issues). 


  • 0

Unfortunately, not yet.  This is going to take a significant audit to the measurement code, and Alex is planning on that.  

 

So this isn't really a small, quick thing, and we're sorry that it is/will take a while to get fixed.

 

That said, Alex *has* improved the dpcmd tool, to add a bit more checking.

 

Specifically, you can run this:

dpcmd check-pool-fileparts P:\ 3 true 1

This will list all of the x1 files on the "P:\" pool.  So you can manually verify if files are duplicated or not. 


  • 0

Sorry for the delayed response, but I haven't been feeling great lately.  

 

 

The above command will only show if it's duplicated or not. But it *will* show it accurately (more so than the UI).

 

As for the same partition, no it won't.  And sorry that I missed that (but likely because of said "not feeling great", but that still isn't a good excuse).

 

 

 

 

And to clarify here, unfortunately, I don't think there is going to be a fix soon. Not just because Alex has been swamped with the issue recently found with CloudDrive (Google Drive namely, but a bunch of other related stuff because of it as well), but also because this really SHOULDN'T be happening. And specifically, Alex already knows that he's pretty much going to have to audit the involved code line by line, to ensure that he didn't miss something, or make some very easy to overlook error, etc.

 

And that said, this is something that has been on his mind constantly, since it's popped up, and he *really*, *really* wants to get to it. 


  • 0

Just bumping this up to check whether there is some sort of progress? As it is, I am using 4x2TB HDDs and using Server Backup to back up three of them to be sure I have a copy of everything (this is different from before, when I had 2x2TB partitions on 1 4TB HDD). But it is less efficient than it could be. Also, I am still hoping we'll get to see grouping/strings someday.

 

On a side note, I could do a little SQL (or Excel/VBA) on output of dpcmd if that output could be delimited (e.g. pipe-delimited). I realise the output may appear not structured enough (text, numbers, varying lengths etc) but that can be solved/dealt with easily (have designed and programmed solutions to such issues in the past and would be willing to provide ideas).


  • 0

No, unfortunately, there hasn't been any progress here.

 

Any updates will be posted in the linked issue:

https://stablebit.com/Admin/IssueAnalysis/22889

At least for the public stuff. 

 

 

The problem is that this isn't a simple quick fix, but will require a massive audit of the balancing/duplication code. Additionally, coupled with the "Corrected" issues in the later builds, once StableBit CloudDrive is a bit more stable, Alex plans on hitting the code hard and addressing this and other issues.

 

 

I know that's not a great answer, and this should be fixed sooner rather than later, it's just that it's a very technical and complicated issue. 


  • 0

OK, so I had been on version .659 but there I had the re-duplication issue. So I reverted to the latest public release, .561. I had no issues, except for a small amount of unduplicated data being reported which did not actually exist (based on dpcmd).

 

Today I accidentally logged on and saw no duplication at all while the "x2" flag is listed! See attached. I have Pool File Duplication set (x2). Standard balancers, everything default, no file placement rules. The only thing that may be non-default is that I have Balance immediately.

 

I do not know how long this situation has lasted already. I do know that because of it, my Server Backups are incomplete (I backup 3 of the 4 volumes).

 

I can not find .672 or .673 betas and given that I had reduplication issues with the .659 beta I am reluctant to try TBH. (http://community.covecube.com/index.php?/topic/1627-duplication-removed/page-2&do=findComment&comment=13380)

 

I remeasured and I get the same result.

 

Given that I am at .561, I can not do the dpcmd check-pool-fileparts but I can do a dpcmd get-duplication and if I do that on a file in the pool, it says expected and found copies: 1...

 

post-1414-0-48863300-1473346685_thumb.png


  • 0

Today I accidentally logged on and saw no duplication at all while the "x2" flag is listed! See attached. I have Pool File Duplication set (x2). Standard balancers, everything default, no file placement rules. The only thing that may be non-default is that I have Balance immediately.

 

I do not know how long this situation has lasted already. I do know that because of it, my Server Backups are incomplete (I backup 3 of the 4 volumes).

 

 

 

What OS is this? 

 

The reason that I ask, is that the 672 build specifically addresses this sort of issue, and seems to only happen on Windows 7 or Server 2008R2 based systems. 

 

The issue in question: https://stablebit.com/Admin/IssueAnalysis/20851

 

And this sounds like EXACTLY what you're reporting.   The 672 build should address this specifically, but we've seen some related issues. 

 

And all beta builds can be found here:

http://dl.covecube.com/DrivePoolWindows/beta/download/


  • 0

Yup, a lot of changes, and Alex is going through the outstanding bugs (so should get to the one above "very soon", depending). 

 

I'd recommend the 672 build, at the very least.  

 

Though, really, the 684 build (the latest, as of right now). 

There are a number of issues it deals with as well, and makes things easier in general. 


  • 0

Just to let you know, Alex has been doing a lot of hammering on DrivePool lately, and has fixed a bunch of issues. 

 

One of the ones he's fixed is the volume equalization stuff, and I believe that at least one of the issues here was related to that.

 

Also, there has been a lot of bug squashing in the measurement code here, as well. 

 

 

If you want, try out the latest beta builds, as they should fix at least some of these issues. 

http://dl.covecube.com/DrivePoolWindows/beta/download/

 

For instance, the volume equalization: 

 

 

.704
* The file mover and the file pattern mover will now work with file allocation sizes when computing move sets.
* The volume equalization balancing algorithm was not equalizing protected files properly and sometimes it was computing incorrect unprotected file deltas.

  • 0

So I have been running .672 BETA for a while now and it worked until...

I got bad sectors on one HDD. I removed it from the Pool and added another. Everything has been duplicated/rebalanced. However, the dashboard reports 34.8MB as Unduplicated. I also get 3.00GB Unusable for duplication, which I do not understand as the Pool is x2 and there are 4 ~2TB volumes (2 x 2TB HDDs and 2 x ~2TB partitions on separate 4TB HDDs).

I am running WHS2011, DP 2.2.0.672 BETA, x2 Pool file duplication

All HDDs are 4K/512e AF, 4K bytes per Cluster and 1K bytes per FileRecord Segment.

DPCMD does not report any duplication errors. Exactly x2 File parts over Files, both in number and size. That seems fine but it is a hassle and a worry that the dashboard continues to give unduplicated data.

=====

Separately, I now have 2 HDDs in my 4-HDD Pool where I use 2TB partitions on a 4TB HDD. I use 2TB partitions so Server Backup works, and I do not use the other 2TB partitions on the 4TB HDDs due to the issue that DP does not guarantee that duplicates are placed on different physical devices (see http://community.covecube.com/index.php?/topic/1681-unduplicated-files-with-pool-file-duplication/&do=findComment&comment=12330). DPCMD does report duplicates on the same Device but does not raise an error. I have been thinking of automating this check, but for that the output must be readable into Excel or a database. Would you consider including an option to output to a piped format?

All it would need is a record/line type indicator, e.g.
11 = Free text Line
21 = Directory Duplication Line
22 = File Part Line (Directory)
31 = File Duplication Line
32 = File Part Line (File)

And the file would look something like this:
11|Scanning...||||
21|+|[4x/2x M]|P:\||
22|->|\Device\HarddiskVolume7\PoolPart.cebbee37-e4b5-4b4e-881d-0b8fc7a87241\|[Device 1]||
22|->|\Device\HarddiskVolume8\PoolPart.d6c8abad-68f9-4b49-ae2c-b356e5f056d6\|[Device 2]||
22|->|\Device\HarddiskVolume10\PoolPart.c497213b-b1fe-4187-963b-ff1b5ed0e730\|[Device 3]||
22|->|\Device\HarddiskVolume13\PoolPart.ac08c794-b981-42eb-b05b-3994ade7f64b\|[Device 5]||
31|-|[2x/2x]|P:\ServerFolders\A-ALL\DPCMD-V01.LOG|83.0 MB|87,012,010 B
32|->|\Device\HarddiskVolume8\PoolPart.d6c8abad-68f9-4b49-ae2c-b356e5f056d6\ServerFolders\A-ALL\DPCMD-V01.LOG|[Device 2]
32|->|\Device\HarddiskVolume13\PoolPart.ac08c794-b981-42eb-b05b-3994ade7f64b\ServerFolders\A-ALL\DPCMD-V01.LOG|[Device 5]

This could easily be read into Excel or a database for analysis and it would allow me to check (easily) whether duplicates exist on the same physical device.
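
To show the kind of check this would enable, here is a minimal sketch that reads the piped format proposed above and flags files with more than one part on the same device. The format itself is only a proposal (dpcmd does not produce it), and the function names are hypothetical. It assumes a type 31 line carries the pool path in its fourth field and each following type 32 line carries the device label in its fourth field, as in the example.

```python
from collections import Counter


def parts_by_file(lines):
    """Map each pool file (type 31 line) to the device labels of its parts (type 32 lines)."""
    files = {}
    current = None
    for raw in lines:
        fields = [field.strip() for field in raw.strip().split("|")]
        if len(fields) < 4:
            continue  # free text or malformed line, nothing to record
        if fields[0] == "31":
            current = fields[3]          # pool path of the file
            files[current] = []
        elif fields[0] == "32" and current is not None:
            files[current].append(fields[3])  # e.g. "[Device 2]"
    return files


def same_device_duplicates(files):
    """Return {pool_path: [devices holding two or more parts of that file]}."""
    flagged = {}
    for path, devices in files.items():
        repeated = [dev for dev, count in Counter(devices).items() if count > 1]
        if repeated:
            flagged[path] = repeated
    return flagged


# The two file part lines from the example above, both on different devices.
sample = [
    r"31|-|[2x/2x]|P:\ServerFolders\A-ALL\DPCMD-V01.LOG|83.0 MB|87,012,010 B",
    r"32|->|\Device\HarddiskVolume8\PoolPart.d6c8abad-68f9-4b49-ae2c-b356e5f056d6\ServerFolders\A-ALL\DPCMD-V01.LOG|[Device 2]",
    r"32|->|\Device\HarddiskVolume13\PoolPart.ac08c794-b981-42eb-b05b-3994ade7f64b\ServerFolders\A-ALL\DPCMD-V01.LOG|[Device 5]",
]
print(same_device_duplicates(parts_by_file(sample)))
```

Splitting on the pipe is safe for the path fields because Windows does not allow that character in file names.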

 

Edit: It would also be of a great help to determine whether the upcoming String/Stripe/Group functionality (which I am very much looking forward to) works.


  • 0

The Public beta build does have issues reporting sizes.  And the measurement data can cause issues. 

 

The latest internal betas do fix the measuring issues.

http://dl.covecube.com/DrivePoolWindows/beta/download/StableBit.DrivePool_2.2.0.734_x64_BETA.exe

 

 

As for the command line stuff, that is definitely a possibility, but something that we've talked about is handling this data better, and making it more accessible. If that is making the dpcmd stuff more robust, that's one option. But another is documenting the commands that we use so others could create utilities that are more robust.


  • 0

Hi. So I had "solved" this by only using one 2TB partition of each 4TB HDD and backing up all HDDs in the Pool bar 1. However, over time this became very cumbersome and I recently split off Client Computer Backups onto a separate Pool consisting of 2 x 4TB HDDs formatted as 4 x 2TB partitions. I am running WHS2011 and DP 2.2.0.738 with x2 Pool File Duplication.

 

Initially, files were distributed correctly, i.e., each duplicate was placed on the other actual HDD. But now some duplicates are stored on two partitions on the same HDD. All settings are default (no file placement rules; balancers are Scanner, Volume Equalization, Drive Usage Limiter, Prevent Drive Overfill and Duplication Space Optimizer).

 

What I have noticed is that DP seeks to equalise either the amount of data stored or the amount of data free on each partition (which in this case is the same). Two partitions (G:\ and H:\) on Device 5 have a SIV folder with quite a bit of data (as a result of Server Backup running on these, I guess).

 

I am just wondering whether one of the Balancers, I suspect the Duplication Space Optimizer, is not following the "no duplicates on 1 physical HDD"-rule? Perhaps this makes it easier to solve it and needs no/less in-depth code audit?

 

I would be willing to simply turn the Duplication Space Optimizer off but I doubt whether it will help. DPCMD does list the duplicates as being stored on the same physical HDD but does not indicate that this is an error/problem, so I am not sure DP even realises this is one (I just did a remeasure and rebalance, no effect).

 

Another thing I notice that is weird is that in the Drive Usage Limiter, volumes G:\ and H:\ (together Device 5) are listed separately (they have their own Duplicated/Unduplicated check boxes) whereas volumes I:\ and J:\ (together Device 3) are combined and have one check box. It is Device 3, containing I:\ and J:\, that has both duplicates stored on it.


  • 0

Alex does plan on performing a deep audit of the code to identify/fix this issue.

That said, this may be related to the beta (2.2 in general, including the public beta), as there were a number of issues introduced by the measuring code. This would absolutely affect the balancing and duplication, as it was not properly identifying duplicated data.

Though, Alex is still planning on auditing the code. 

 

 

As for the balancing, yeah the "volume optimization" and "duplication space optimization" balancers are what would affect this, and both are built in.  Also, the kernel driver itself tries to check this, so that files are not placed on the same physical disks. 

 

 

 

However, if the G: and H: volumes are being listed separately, this may be part of the issue, specifically the cause of it. 


  • 0

I doubt it.

 

The pool consists of:

Device 5: G:\ and H:\ (and recognised as such by Scanner)

Device 3: I:\ and J:\ (and recognised as such by Scanner)

It is Device 3 that has both duplicates of some files (and this is reported by DPCMD as well without it being flagged as an error). Drive Usage Limiter lists both Device 3 partitions as one / provides one set of checkboxes (Duplicated and Unduplicated). It is Device 5 for which the Drive Usage Limiter provides checkboxes for both partitions.

 

What is the order by which DPCMD is supposed to list files? It appears to be alphabetically after the PoolPart folder mostly but at least some of the erroneously placed files are listed out of that alphabetical order.

 

Should I revert to a 2.1 version? That one has measurement issues and, if I am not mistaken, does not support DPCMD list-pool-file-parts or somesuch..?


  • 0

For the measuring issues, use 2.1.1.561 or the latest internal betas (2.2.0.740). The latest betas have returned to the older method of measuring, which, while not as accurate in some cases, is MUCH more reliable.

 

And yeah, the 2.1.1.561 version doesn't include the "check-pool-fileparts" option for DPCMD. 

 

 

If you're still seeing this issue on the latest version, then yeah, it's definitely not been fixed.  And in this case, it may be helpful to run the StableBit Troubleshooter, as that does collect a bunch of information that may be helpful. 

http://dl.covecube.com/Troubleshooter/StableBit.Troubleshooter.exe
