Ok, so reparse points have definitely been driving me nuts lately. I was planning on releasing the StableBit DrivePool 2.1 BETA about a week after the 2.0 final with reparse point support, but it's still not out and it's because of reparse points. I've been trying different architectures over the past few weeks and none of them panned out as expected.
But today, I believe that I've finally got something that will work. It will support all of the various kinds of reparse points, and it will be super fast and stable.
So what are these "reparse points" you may be asking?
Well, reparse points are the underlying file system technology that enables a whole suite of functionality. Mainly they are used for creating symbolic links, junctions and mount points.
Essentially it's a way to redirect access from one file or folder to another. You may be wondering if I'm talking about a Shortcut? No, confusingly a shortcut is not a reparse point.
So how many ways does Windows have to redirect files / folders?
A lot. That's the problem!
Here they are off the top of my head:
- A shortcut - A special file that is parsed by the Explorer shell that really links to another file somewhere else (as in a Start menu shortcut).
Most people are probably familiar with this because it's readily available in the Explorer UI.
- Symbolic file link - A file that points to some other file somewhere else. Confusingly, Windows Explorer also calls these "shortcuts" in the file properties dialog.
A symbolic link can be created by the mklink utility with no options.
- Symbolic directory link - These are relatively new, as they were introduced in Windows Vista. This is essentially a directory that points to another directory somewhere else.
These can be created by using mklink /D.
- Directory junction point - These are very similar to "symbolic directory links", but they were available prior to Windows Vista. Again, it is essentially a directory that points to another directory somewhere else. Some people mistakenly believe that a junction can only point to another directory on the same volume, but that's not the case.
These can be created by using mklink /J.
- Mount point - Mount points allow you to designate a directory that will point to another volume. These are typically used to "mount" a number of drives as directories under some other drive letter, thus saving drive letters.
These can be created from Disk Management.
- File hard link - Yet another way to make a file point to another file. However, this method can only be used to point a file to some other file on the same volume.
These are created using mklink /H.
Yes, that's a lot of ways that you can redirect files / folders in Windows. Try Googling these and you can see the confusion that ensues as to what the differences are between each.
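To make the list above a bit more concrete, here is a minimal Python sketch (Python is used purely for illustration) that creates three of these link types in a scratch directory. os.link and os.symlink are standard library calls; on Windows they should correspond to what mklink /H, mklink and mklink /D do, and creating symbolic links there usually needs an elevated prompt or Developer Mode. The file names and the make_links helper are made up for this example. Junctions and mount points are left out because, as far as I know, the standard library has no public call for creating them.

```python
import os
import tempfile


def make_links(root):
    """Create a hard link, a file symlink and a directory symlink under root.

    All names below are hypothetical and exist only for this example.
    """
    target_file = os.path.join(root, "target.txt")
    target_dir = os.path.join(root, "target_dir")
    with open(target_file, "w") as f:
        f.write("hello")
    os.mkdir(target_dir)

    hard = os.path.join(root, "hard.txt")
    file_link = os.path.join(root, "file_link.txt")
    dir_link = os.path.join(root, "dir_link")

    # Roughly: mklink /H hard.txt target.txt
    os.link(target_file, hard)
    # Roughly: mklink file_link.txt target.txt
    os.symlink(target_file, file_link)
    # Roughly: mklink /D dir_link target_dir
    # (target_is_directory only matters on Windows)
    os.symlink(target_dir, dir_link, target_is_directory=True)
    return hard, file_link, dir_link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        hard, file_link, dir_link = make_links(root)
        for path in (hard, file_link, dir_link):
            # islink() is True for the two symbolic links, False for the hard link
            print(os.path.basename(path), os.path.islink(path))
```

Note that the hard link does not report itself as a link at all, which hints at how differently it works under the hood.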
So what is the difference between all of these?
Well, instead of pointing out the pros and cons, I'll tell you how each one of them works under the hood and you can decide for yourself:
- A shortcut - This is the most "user friendly" way of creating a file that points to another one. Even the name makes sense, "shortcut", imagine that. It's readily available from the Windows Explorer menus and works entirely in user mode. A special .lnk file is created that the user mode shell knows how to parse. In Windows Explorer, an icon with a little arrow is shown to you to let you know that this is really a shortcut.
However, as far as the kernel and file system are concerned, there is nothing special about the .lnk file; it's just a regular file.
- Symbolic file link - Sometimes called a "symlink" or "soft link", this is a system that redirects one file to another, purely in the kernel. It involves some special metadata that is stored with the "source link" file that points to the "target destination file" and requires coordination between the file system and the Windows I/O Manager.
This system uses what are called "reparse points".
- Symbolic directory link - This is exactly the same thing as a symbolic file link, but it works on directories. The reason why I separated the two is because they must be created differently (mklink vs. mklink /D). Both kinds arrived with Windows Vista.
However, the underlying technology that enables this is exactly the same. This too uses "reparse points".
- Directory junction point - This is similar to a symbolic directory link except that it was available prior to Windows Vista and uses an older technique. Technically speaking, the main difference between this and symbolic directory links is that directory junction points always point to an absolute path, while symbolic directory links can point to relative or absolute paths.
Surprisingly, this too uses "reparse points", but not all reparse points are the same. I'll get to that soon.
- Mount point - These are implemented in the exact same way as directory junction points, except that they point to the root of some other volume instead of some other directory.
These are implemented with the exact same "reparse points" as directory junctions.
- File hard link - This is purely a file system construct. Because of the way directory indexes work in NTFS, it is possible to add a file entry to a directory index of a file that already exists under some other directory. Essentially, you can think of the file as being in 2 (or more) places at once. While this is not quantum physics, it is NTFS. Each file has a "reference count" and that count is incremented whenever a hard link is created to it. When the count reaches 0, the file is deleted.
No other kernel subsystem is involved and no "reparse points" exist. This is the cleanest and purest way of making a file appear in 2 places at once (IMO).
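The hard link "reference count" described above is easy to watch from user mode. This is a small Python sketch, assuming a file system that supports hard links (NTFS does); the file names and the hard_link_counts helper are made up for illustration. os.stat(...).st_nlink reports the number of names that currently point at the file.

```python
import os
import tempfile


def hard_link_counts(root):
    """Return the link counts seen at each step, plus the data read at the end."""
    first = os.path.join(root, "first.txt")    # hypothetical name
    second = os.path.join(root, "second.txt")  # hypothetical name
    with open(first, "w") as f:
        f.write("same data")

    counts = [os.stat(first).st_nlink]        # the file has one name
    os.link(first, second)                    # add a second name for the same file
    counts.append(os.stat(first).st_nlink)    # the count went up
    os.remove(first)                          # drop the original name
    counts.append(os.stat(second).st_nlink)   # the count went back down
    with open(second) as f:
        data = f.read()                       # the data is still there
    return counts, data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(hard_link_counts(root))
```

Deleting the "original" name does not delete the file, because neither name is more original than the other. The data only goes away when the last name is removed.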
Wow, and all this works together reliably?
Yes, and that's what StableBit DrivePool is trying to preserve. You see, right now the only thing from the above list that we support on the pool is shortcuts. Everything else is not supported.
Some people have been requesting support for file / directory symbolic links and junctions. Software can use those 2 to create complex directory structures that organize your data better.
4 out of the 5 unsupported technologies use "reparse points", so it makes sense for StableBit DrivePool to implement support for them.
Ok, so what's a "reparse point"?
A reparse point is a Microsoft defined data structure that gets associated with a file or a directory. When that file or directory has a reparse point associated with it, then it becomes a kind of link to "somewhere else".
Essentially, when a file system encounters a reparse point, it tells the I/O Manager "these aren't the droids you're looking for, go look here". The I/O Manager is responsible for opening files, so it happily obliges.
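The "go look here" handshake can be modeled as a loop. What follows is a toy Python model, not Windows code: the dictionary standing in for the file system, the paths, the MAX_REPARSES cap and the toy_open function are all made up for illustration. It also simplifies things by treating whole paths as keys, whereas a real file system can hit a reparse point on any component of a path.

```python
MAX_REPARSES = 8  # hypothetical cap for this toy model; guards against link loops

# Toy "file system": path -> ("file", contents) or ("reparse", new_path).
TOY_FS = {
    r"C:\data\report.txt": ("file", "quarterly numbers"),
    r"C:\links\report.txt": ("reparse", r"C:\data\report.txt"),
    r"C:\links\again.txt": ("reparse", r"C:\links\report.txt"),
    r"C:\links\loop.txt": ("reparse", r"C:\links\loop.txt"),
}


def toy_open(path, fs=TOY_FS):
    """Play the I/O Manager: ask the 'file system', restart the open on a reparse."""
    for _ in range(MAX_REPARSES):
        kind, value = fs[path]
        if kind == "file":
            return path, value   # a real file: the open succeeds here
        path = value             # the file system said "go look here", so try again
    raise OSError("too many reparses while opening " + path)


if __name__ == "__main__":
    # Follows two reparse points before landing on the real file.
    print(toy_open(r"C:\links\again.txt"))
```

The cap matters because nothing stops a link from pointing at itself, as loop.txt does in the toy table.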
That doesn't sound too complicated.
Well, it isn't, except that there are different types of "reparse points" and each type has its own rules for where to go next.
- File / directory symbolic links use a "symlink" reparse point.
- Directory junction points / mount points use a "mount point" reparse point.
- Any 3rd party developers can develop their own type of reparse points and their own logic as to how they work. Remember Drive Extender from WHS v1? Yep, those tombstones were yet another kind of reparse point.
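Each type in the list above is identified by a numeric "reparse tag". Here is a small Python sketch of telling them apart. The two tag values and the Microsoft-owned bit come from the Windows SDK headers; the describe_tag and reparse_tag_of helpers are made up for illustration. On Windows with Python 3.8 or later, os.lstat exposes the tag as st_reparse_tag; elsewhere the helper simply returns 0.

```python
import os

IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003  # directory junctions and mount points
IO_REPARSE_TAG_SYMLINK = 0xA000000C      # file and directory symbolic links
MICROSOFT_BIT = 0x80000000               # set on tags that Microsoft owns


def describe_tag(tag):
    """Map a reparse tag to a rough description (hypothetical helper)."""
    if tag == 0:
        return "not a reparse point"
    if tag == IO_REPARSE_TAG_SYMLINK:
        return "symlink"
    if tag == IO_REPARSE_TAG_MOUNT_POINT:
        return "mount point"
    if tag & MICROSOFT_BIT:
        return "other Microsoft tag"
    return "third-party tag"


def reparse_tag_of(path):
    """Return the reparse tag of path, or 0 if the field is missing or unset."""
    return getattr(os.lstat(path), "st_reparse_tag", 0)


if __name__ == "__main__":
    print(describe_tag(reparse_tag_of(".")))
```

A junction and a mount point report the same tag, which matches the point above that they are the same thing underneath.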
Ok, so this is complicated. But will StableBit DrivePool support reparse points?
I'm working hard towards that goal, and the reason why I'm writing this is because I believe that I've finally cracked the architecture that we need to support all Microsoft and 3rd party reparse points on the pool.
The architecture has these positive aspects to it:
- It supports file / directory symbolic links, directory junction points, mount points, and 3rd party reparse points on the pool.
- It is a 100% native kernel implementation, with no dependence on the user mode service.
- It follows the 0 local metadata approach of storing everything needed on the pool itself and does not rely on something like the registry. This means that your reparse points will work when moving pools between machines (provided that you didn't link to something off of the pool that no longer exists on the new machine).
Some of my previous attempts had these limitations:
- Requires the user mode service to run additional periodic maintenance tasks on the pool.
- No support for directory reparse points, only file ones.
- Adding a drive to the pool would require a somewhat lengthy reparse point pass.
The new architecture that I came up with has none of these limitations. All it requires is NTFS and Windows.
When will it be ready?
I'd hate to predict, but I think that it should be deployed in BETA form in a few weeks.