Since this has been a point of discussion on the old forum, I thought that I'd start this forum category by posting about per-folder file duplication in StableBit DrivePool.
Unlike the blog posts, I'll try to keep this brief (and somewhat technical).
What is file duplication?
Simply put, file duplication protects your files by storing 2 copies of each file on 2 separate hard drives. If one drive fails, the other will still have a copy of all of your duplicated files.
Designing File Duplication
The #1 priority for file duplication was to make the technology behind it as simple as possible, thus avoiding any unnecessary complications (and bugs). The first approach that DrivePool took was to put the duplication count in the folder name itself (you can't get any simpler than that).
For example, "Pictures.2" would duplicate all of your pictures to 2 hard drives.
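To make the naming scheme concrete, here is a minimal Python sketch of how a count could be read back out of a folder name. It is only an illustration, not DrivePool's actual code: the function name, the regular expression, and the choice to treat a folder without a suffix as 1 copy are all assumptions.

```python
import re
from typing import Tuple

# Assumed convention: a folder name ending in ".<digits>" carries the count.
_SUFFIX = re.compile(r"^(?P<name>.+)\.(?P<count>\d+)$")


def parse_duplication_suffix(folder_name: str) -> Tuple[str, int]:
    """Split a name like 'Pictures.2' into its base name and copy count.

    Folders without a numeric suffix are assumed to hold a single copy.
    """
    match = _SUFFIX.match(folder_name)
    if match is None:
        return folder_name, 1
    return match.group("name"), int(match.group("count"))


print(parse_duplication_suffix("Pictures.2"))
print(parse_duplication_suffix("Documents"))
```

Note that under this sketch any folder that happens to end in a dot and digits would be read as carrying a count, which hints at how fragile it is to put metadata in the name.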
This was very straightforward but unfortunately didn't work very well with shared folders. The name of a shared folder (as seen on the network) is typically the name of the folder itself, so it doesn't make sense to include the duplication count in the shared folder name. And more importantly, WHS 2011 didn't work well with this scheme.
(DrivePool 1.0 BETA M3 did try to work around the issues with folder links, but that was eventually replaced with a better and simpler system.)
Alternate Data Streams
DrivePool 1.0 final shipped with the ability to store "tags" on folders. Although the tags are nothing more than alternate data streams on directories, I still like the word "tags" to describe the approach, because these "tags" describe something about a directory.
One of these tags eventually became a "DuplicationCount".
At first, the idea was to store the actual number of copies in the tag. So if a folder is designated as duplicated it would contain "2" in the duplication tag. But because we needed to enable folder duplication at any level in a directory tree, it was necessary to implement something that's a bit more flexible.
The current system supports an "Inherit" and a "Multiple" flag in addition to an explicit duplication count, and supports setting a duplication count on any arbitrary folder on the pool.
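As a rough illustration of how an "Inherit" flag could resolve to an actual count, here is a small Python sketch. It is a simplified model under stated assumptions, not the real implementation: the tags live in a plain dictionary rather than in alternate data streams, the INHERIT marker and the pool_default parameter are hypothetical, untagged folders are assumed to inherit, and the "Multiple" flag is left out entirely.

```python
from pathlib import PurePosixPath
from typing import Dict, Union

INHERIT = "inherit"  # hypothetical marker, not DrivePool's actual tag value


def effective_count(
    path: str,
    tags: Dict[str, Union[int, str]],
    pool_default: int = 1,
) -> int:
    """Walk from `path` up to the pool root and return the first explicit count.

    Folders with no tag at all are treated as inheriting (an assumption).
    """
    current = PurePosixPath(path)
    while True:
        tag = tags.get(str(current), INHERIT)
        if tag != INHERIT:
            return int(tag)
        if current == current.parent:  # reached the root without a count
            return pool_default
        current = current.parent


tags = {"/Pictures": 2, "/Pictures/2012": INHERIT, "/Pictures/Temp": 1}
print(effective_count("/Pictures/2012/Summer", tags))
print(effective_count("/Pictures/Temp", tags))
```

In this model, a subfolder can override its parent with an explicit count, and everything else picks up the count of the nearest tagged ancestor.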
The new tag-based system is not without complications.
We have issues with the read-only attribute on directories (which came up recently). And what happens if you move a duplicated folder to a location that's not duplicated? Well, we're handling all of these cases for you in a (hopefully) intelligent manner.
I've considered scrapping per-folder duplication in DrivePool 2.0. The reason is that you can create duplicated pools and non-duplicated pools, and I feel that this is sufficient flexibility for most people. If we got rid of per-folder duplication, it would make a lot of things much simpler (such as balancing).
What do you think of per-folder duplication?
Let me know. I'm listening.