
Beware of DrivePool corruption / data leakage / file deletion / performance degradation scenarios Windows 10/11


MitchC

Question

To start: while I am new to DrivePool, I love its potential, and I own multiple licenses and their full suite.  If you only use DrivePool for basic archiving of large files, with simple applications periodically reading them, you will probably not hit these bugs (assuming you don't use any file synchronization/backup solutions).  Further, I don't know how many thousands (tens or hundreds?) of DrivePool users there are, but clearly many are not hitting these bugs, or are not recognizing that they are, so this is NOT some new destructive "my files are 100% going to die" issue.  Some of the reports I have seen on the forums, though, may actually be due to these issues without being recognized as such.  As far as I know CoveCube was not previously aware of these issues, so tickets may not have even considered this possibility.

I started reporting these bugs to StableBit ~9 months ago, and informed them ~1 month ago that I would be putting this post together.  Please see the disclaimer below as well, as some of this is based on observations rather than known facts.

You are most likely to run into these bugs with applications that:

  • Synchronize or back up files, including cloud-mounted drives like OneDrive or Dropbox
  • Handle large quantities of files or monitor them for changes, like coding applications (Visual Studio/VSCode)


Still, these bugs can cause silent file corruption, file misplacement, deleted files, performance degradation, data leakage (a file shared with someone externally could have its contents overwritten by any sensitive file on your computer), missed file changes, and potentially other issues for a small portion of users (I have had nearly all of these occur).  It may also trigger some BSOD crashes; I had one such crash that is likely related.  Due to the subtle ways some of these bugs can present, it may be hard to notice they are happening even when they are.  In addition, these issues can occur even without file mirroring and with files pinned to a specific drive.  I do have some potential workarounds/suggestions at the bottom.

More details are at the bottom but the important bug facts upfront:

  • Windows has a native file-change notification API using overlapped IO calls.  This allows an application to listen for changes on a folder, or a folder and its subfolders, without having to constantly check every file to see if it changed.  StableBit triggers "file changed" notifications even when files are merely accessed (read) in certain ways.  StableBit does NOT generate notification events on the parent folder when a file under it changes (Windows does).  StableBit also does NOT generate a notification event when only a FileID changes (the next bug covers FileIDs).

 

  • Windows, like Linux, has a unique ID number for each file written on the hard drive.  If there are hardlinks to the same file, they share that unique ID (so one FileID may have multiple paths associated with it).  In Linux this is called the inode number; Windows calls it the FileID.  Rather than accessing a file by its path, you can open a file by its FileID.  In addition, it is impossible for two files to share the same FileID; it is a 128-bit number persistent across reboots (128 bits means the count of unique values is 39 digits long, giving uniqueness comparable to an MD5 hash).  A FileID does not change when a file moves or is modified.  StableBit, by default, supports FileIDs; however, they seem to be ephemeral: they do not appear to survive reboots or file moves.  Keep in mind FileIDs are used for directories as well, not just files.  Further, if a directory is moved/renamed, not only does its FileID change but so does that of every file under it.  I am not sure if there are other situations in which they may change.  In addition, if a descendant file/directory FileID changes due to something like a directory rename, StableBit does NOT generate a notification event that it has changed (the application gets the directory event notification but nothing for the children).
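A quick sketch of the expected behavior described above. CPython's os.stat exposes this identifier as st_ino (the NTFS FileID on Windows, the inode on Linux), so on a well-behaved filesystem the ID should survive a rename within the same volume; the report above says this does not hold on a DrivePool volume.

```python
import os
import tempfile

# On a well-behaved filesystem (NTFS, ext4, ...), the file's unique
# identifier survives a rename/move within the same volume.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "ImportantDoc1.docx")
    dst = os.path.join(d, "renamed.docx")
    with open(src, "w") as f:
        f.write("hello")

    id_before = os.stat(src).st_ino
    os.rename(src, dst)              # rename within the same volume
    id_after = os.stat(dst).st_ino

    # Expected: the identifier is unchanged by the rename.
    assert id_before == id_after
```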


There are some other things to consider as well.  DrivePool does not implement the standard Windows USN Journal (a system for tracking file changes on a drive); it specifically identifies itself as not supporting it, so applications shouldn't try to use it with a DrivePool drive.  That means applications that traditionally wouldn't use the file-change notification API or FileIDs may fall back to a combination of those to accomplish what they would otherwise use the USN Journal for (and this can exacerbate the problem).  The same is true of Volume Shadow Copy (VSS): applications that might traditionally use it cannot (DrivePool identifies that it cannot do VSS), so they may resort to the methods below that they would not normally use.


Now the effects of the above bugs may not be completely apparent:

  • For the overlapped IO / File change notification 

This means an application monitoring for changes on a DrivePool folder or subfolder will get erroneous notifications that files changed when anything so much as accesses them.  Just opening a folder in File Explorer, or even switching between applications, can cause file accesses that trigger the notification.  If an application takes action on a notification and then checks the file at the end of that action, this in itself may cause another notification.  Applications that rely on getting a folder-changed notification when a child changes will not get these at all with DrivePool.  If the application is monitoring only the folder and not its children, this means no notifications would be generated at all (versus just on the child), so it could miss changes.
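The self-triggering behavior can be sketched as a toy event loop (all names here are invented for illustration): a handler that re-reads a file in response to a "changed" event will, on a filesystem that reports reads as changes, raise another event for the same file and loop.

```python
# Toy model of a change-notification handler. If reads are reported as
# changes (the DrivePool-like case), handling an event generates a new
# event, and the watcher loops until some cap stops it.
def run_watcher(reads_trigger_change_events, max_events=10):
    events = ["hello.txt"]            # one initial change event
    handled = 0
    while events and handled < max_events:
        path = events.pop(0)
        handled += 1
        # The handler re-opens the file to inspect the change...
        if reads_trigger_change_events:
            events.append(path)       # ...which itself raises a new event
    return handled

assert run_watcher(False) == 1        # NTFS-like: one event, done
assert run_watcher(True) == 10        # DrivePool-like: loops until capped
```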

  • For FileIDs

It depends what the application uses the FileID for, but it may assume the FileID stays the same when a file moves.  As it doesn't with DrivePool, this might mean the application reads, backs up, or syncs the entire file again whenever it is moved (a performance issue).  An application that uses the Windows API to open a file by its ID may not get the file it is expecting, or a file that was simply moved will throw an error when opened by its old FileID because DrivePool has changed the ID.  For example, let's say an application caches that the FileID for ImportantDoc1.docx is 12345, but after a restart 12345 refers to ImportantDoc2.docx.  If this application is a file-sync application and ImportantDoc1.docx is changed remotely, then when it goes to write those remote changes to the local file using the OpenFileById method, it will actually overwrite ImportantDoc2.docx with those changes.

I didn't spend the time reading the Windows file system requirements to know when Windows expects a FileID to potentially change (or not change).  It is important to note that even if theoretical changes/reuse are allowed, if they are not commonplace (because Windows uses what is essentially an MD5-hash-like number in terms of repeats), applications may just assume it doesn't happen even if it is technically allowed.  A backup or file-sync program might assume that a file with a specific FileID is always the same file: if FileID 12345 is c:\MyDocuments\ImportantDoc1.docx one day and c:\MyDocuments\ImportantDoc2.docx another, it may mistake document 2 for document 1, overwriting important data or restoring data to the wrong place.  If it is creating a whole-drive backup, it may assume it has already backed up c:\MyDocuments\ImportantDoc2.docx if, by the time it reaches it, that file has the same FileID that ImportantDoc1.docx had (at which point DrivePool would have a different FileID for document 1).
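The ImportantDoc1/ImportantDoc2 scenario can be simulated in a few lines. This is a hypothetical sketch (the class, its methods, and the in-memory "disk" are all invented, not any real sync tool's API) of how keying state on FileID silently corrupts the wrong file when IDs are reused:

```python
# Hypothetical sync tool that keys its local state on FileID.
class NaiveSyncTool:
    def __init__(self):
        self.cache = {}              # file_id -> local path

    def index(self, file_id, path):
        self.cache[file_id] = path

    def apply_remote_change(self, file_id, new_contents, disk):
        # "Opens by ID": trusts whatever path the ID maps to now.
        path = self.cache[file_id]
        disk[path] = new_contents

disk = {"ImportantDoc1.docx": "doc1 v1", "ImportantDoc2.docx": "doc2 v1"}

tool = NaiveSyncTool()
tool.index(12345, "ImportantDoc1.docx")

# After a reboot the pool hands FileID 12345 to a different file; the
# tool re-indexes and the stale ID now points at ImportantDoc2.docx.
tool.index(12345, "ImportantDoc2.docx")

# A remote edit meant for ImportantDoc1.docx lands on ImportantDoc2.docx.
tool.apply_remote_change(12345, "doc1 v2", disk)
assert disk["ImportantDoc2.docx"] == "doc1 v2"   # silent corruption
assert disk["ImportantDoc1.docx"] == "doc1 v1"   # intended target untouched
```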


Why might applications use FileIDs or file-change notifiers? It may not seem intuitive, but a few major reasons are:

  • Performance: file-change notifiers are an event/push-based system, so the application is told when something changes.  The common alternative is a poll-based system where the application must scan all the files looking for changes (and may try to rely on file timestamps or even hash entire files to determine this), which causes a good bit more overhead/slowdown.
  • FileIDs are nice because they already handle hardlink de-duplication (Windows may have multiple hardlinked copies of a file on a drive for various reasons, but if you back up based on FileID you back that file up once rather than multiple times).  FileIDs are also great for handling renames.  Let's say you are an application that syncs files and the user backs up c:\temp\mydir with 1000 files under it.  If they rename c:\temp\mydir to c:\temp\mydir2, an application using FileIDs can say "wait, that folder is the same, it was just renamed; OK, rename that folder in our remote version too".  This is a very minimal operation on both ends.  With DrivePool, however, the FileID changes for the directory and all sub-files.  If the sync application uses this to determine changes, it now re-uploads all those files, using a good bit more resources locally and remotely.  If the application also uses versioning, this is far more likely to cause conflicts when two or more clients are syncing, as mass quantities of files seemingly change.
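The rename example above can be sketched as a snapshot diff (a minimal illustration, not any real tool's algorithm): with stable IDs, the 1000-file rename shows up as 1000 cheap moves; with changing IDs, the identical operation looks like 1000 deletes plus 1000 brand-new files to re-upload.

```python
# Compare two snapshots mapping file_id -> path, the way a FileID-based
# sync tool might: same ID at a new path = rename; unknown ID = new file.
def diff(before, after):
    moves, deletes, creates = [], [], []
    for fid, path in after.items():
        if fid in before:
            if before[fid] != path:
                moves.append((before[fid], path))   # cheap server-side rename
        else:
            creates.append(path)                    # full re-upload
    for fid, path in before.items():
        if fid not in after:
            deletes.append(path)
    return moves, deletes, creates

before = {i: f"c:/temp/mydir/f{i}" for i in range(1000)}

# Stable IDs: renaming the directory is 1000 cheap moves.
after = {i: f"c:/temp/mydir2/f{i}" for i in range(1000)}
moves, deletes, creates = diff(before, after)
assert (len(moves), len(deletes), len(creates)) == (1000, 0, 0)

# IDs change on rename: the same operation becomes 1000 deletes + 1000 uploads.
after_new_ids = {1000 + i: f"c:/temp/mydir2/f{i}" for i in range(1000)}
moves, deletes, creates = diff(before, after_new_ids)
assert (len(moves), len(deletes), len(creates)) == (0, 1000, 1000)
```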

Finally, even if an application tries to monitor for FileID changes using the file-change API, due to the notification bugs above it may not get any notifications when child FileIDs change, so it might assume they have not.


Real Examples
OneDrive
This started with massive OneDrive failures.  I would find OneDrive re-uploading hundreds of gigabytes of images and videos multiple times a week, even though they were not changing or moving.  I don't know if the issue is that OneDrive uses FileIDs to determine whether a file is already uploaded, or that when it scanned a directory it triggered notifications that all the files in that directory had changed and re-uploaded based on those.  After this I noticed files being deleted both locally and in the cloud.  I don't know what caused this; it might have been that OneDrive thought the old file was deleted because its FileID was gone, and while there was a new file (actually the same file) in its place, some odd race condition occurred.  It is also possible that it queued the file for upload, the FileID changed, and when it went to open it for upload it found it 'deleted' (as the FileID no longer pointed to a file) and queued a delete operation.  I also found that files uploaded to the cloud in one folder were sometimes downloading to a different folder locally.  I am guessing this is because the folder FileID changed: it thought the 2023 folder was ID XYZ, but that now pointed to a different folder, so it put the file in the wrong place.  The final form of corruption was finding the data from one photo or video inside a file with a completely different name.  This is almost guaranteed to be due to the FileID bugs, and it is highly destructive because backups make it far harder to correct: with one file's contents replaced by another's, you need to know when the good content existed and which files were affected.  Depending on retention policies, the file contents that replaced it may overwrite the good backups before you notice.  I also had a BSOD with OneDrive where it was trying to set attributes on a file and the CoveFS driver corrupted some memory.
It is possible this was a race condition, as OneDrive may have been processing hundreds of files very rapidly due to the bugs.  I have not captured a second BSOD from it, but I also stopped using OneDrive on DrivePool due to the corruption.  Another example of this is data leakage.  Let's say you share your favorite article on kittens with a group of people.  OneDrive, believing that file has changed, goes to open it using the FileID; however, that FileID could now correspond to essentially any file on your computer.  Now the contents of some sensitive file are put in place of that kitten file, and everyone you shared it with can access it.

Visual Studio Failures
Visual Studio is a code editor/compiler.  There are three distinct bugs that happen.  First, when compiling, if you touched one file in a folder it seemed to recompile the entire folder, likely due to the notification bug.  This is just a slowdown, but an annoying one.  Second, Visual Studio has compiler-generated code support, meaning the compiler will generate actual source code that lives next to your own source code.  Normally, once compiled, it doesn't regenerate and recompile this source unless it must change, but due to the notification bugs it regenerates this code constantly, and if there is an error in other code it causes an error there, cascading into several other invalid errors.  When debugging, Visual Studio by default will only use symbols (debug location data) that exactly match the source; since DrivePool's notifications fire on certain file accesses, Visual Studio constantly thinks the source has changed since it was compiled, and you will only be able to set breakpoints in source if you disable the exact-symbol-match default.  If you have multiple projects in a solution with one dependent on another, it will often rebuild the dependency projects even when they haven't changed; for large solutions that can be crippling (a performance issue).  Finally, I often had IntelliSense errors showing up even though there were no errors during compiling, and worse, IntelliSense would completely break at points.  All due to DrivePool.


Technical details / full background & disclaimer

I have sample code and logs to document these issues in greater detail if anyone wants to replicate it themselves.

It is important for me to state drivepool is closed source and I don't have the technical details of how it works.  I also don't have the technical details on how applications like onedrive or visual studio work.  So some of these things may be guesses as to why the applications fail/etc.

The facts stated are true (to the best of my knowledge) 


Shortly before my trial expired in October of last year I discovered some odd behavior.  I had a technical ticket filed within a week, and within a month had traced down at least one of the bugs.  The issue can be seen at https://stablebit.com/Admin/IssueAnalysis/28720 ; it shows priority 2/important, which I assume is the second highest (with critical or similar above it).  It is great that it has priority, but as we are over 6 months since filing without updates, I figured warning others about the potential corruption was important.


The FileSystemWatcher API is implemented in Windows using async overlapped IO; the exact code can be seen at: https://github.com/dotnet/runtime/blob/57bfe474518ab5b7cfe6bf7424a79ce3af9d6657/src/libraries/System.IO.FileSystem.Watcher/src/System/IO/FileSystemWatcher.Win32.cs#L32-L66

That corresponds to this kernel api:
https://learn.microsoft.com/en-us/windows/win32/fileio/synchronous-and-asynchronous-i-o

Newer API calls use GetFileInformationByHandleEx to get the FileID; with the older GetFileInformationByHandle call it is represented by nFileIndexHigh/nFileIndexLow.


In terms of the FileID bug, I wouldn't normally have even thought about it, but the advanced config (https://wiki.covecube.com/StableBit_DrivePool_2.x_Advanced_Settings) mentions this under CoveFs_OpenByFileId: "When enabled, the pool will keep track of every file ID that it gives out in pageable memory (memory that is saved to disk and loaded as necessary)."  Keeping track of FileIDs in memory is certainly very different from Windows, so I thought this might be the source of the issue.  I also don't know if there are caps on the maximum number of files it will track; if it resets FileIDs in situations other than reboots, that could be much worse.  Turning this off will at least break NFS servers, as the docs state it is "required by the NFS server".

Finally, the FileID numbers given out by DrivePool are incremental and very low.  This means that when they do reset, you will almost certainly get collisions with former numbers.  What is not clear is whether there is also a chance of FileID corruption: when it assigns these IDs in a multi-threaded scenario with many different files at the same time, could the system fail? I have seen no proof this happens, but when incremental IDs are assigned like this for mass quantities of files, it has a higher chance of occurring.
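A small sketch of why low incremental IDs are so collision-prone compared to random 128-bit IDs (the allocation functions here are mine, purely for illustration): a counter that restarts from a low value reuses every previously issued ID immediately, while two independently drawn batches of random 128-bit IDs essentially never overlap.

```python
import secrets

# Incremental allocation restarting from a low value, as described above.
def incremental_ids(n, start=1):
    return [start + i for i in range(n)]

before_reset = set(incremental_ids(10_000))
after_reset = set(incremental_ids(10_000))        # counter restarted
assert len(before_reset & after_reset) == 10_000  # every single ID is reused

# Random 128-bit allocation: collision odds are ~n^2 / 2^128, i.e. ~1e-31
# for two batches of 10,000 IDs.
random_before = {secrets.randbits(128) for _ in range(10_000)}
random_after = {secrets.randbits(128) for _ in range(10_000)}
assert not (random_before & random_after)         # no overlap in practice
```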

Microsoft says this about deleting the USN Journal: "Deleting the change journal impacts the File Replication Service (FRS) and the Indexing Service, because it requires these services to perform a complete (and time-consuming) scan of the volume. This in turn negatively impacts FRS SYSVOL replication and replication between DFS link alternates while the volume is being rescanned."  Now, DrivePool never supported the USN Journal, so it isn't exactly the same situation, but it is clear that several core Windows services use it for normal operations, and I do not know what fallbacks they use when it is unavailable.


Potential Fixes
There are advanced settings for DrivePool at https://wiki.covecube.com/StableBit_DrivePool_2.x_Advanced_Settings ; beware that these changes may break other things.
CoveFs_OpenByFileId - Set to false (by default it is true).  This will disable the OpenByFileID API.  It is clear several applications use this API.  In addition, while DrivePool may disable that function with this setting, it doesn't disable FileIDs themselves; any application using FileIDs as static identifiers for files may still run into problems.

I would avoid using any file backup/synchronization tools with DrivePool drives (if possible).  These likely have the highest chance of lost files, misplaced files, mixed-up file contents, and excess resource usage.  If you can't avoid them, consider taking file hashes of the entire DrivePool directory tree; do this again at a later point and make sure files that shouldn't have changed still have the same hash.

If you have files that rarely change after being created, then hashing each file at some point after creation and alerting if that file disappears or its hash changes would easily act as an early warning that one of these bugs has been hit.
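That early-warning idea can be sketched in a few lines (the function names here are mine, not from any real tool): build a hash manifest of the tree, keep it somewhere safe, and diff it against a later run to catch files that vanished or silently changed.

```python
import hashlib
import os

# Hash every file under root into a manifest: relative path -> SHA-256.
def manifest(root):
    out = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            out[os.path.relpath(path, root)] = digest
    return out

# Compare an old manifest against a new one: report files that vanished
# and files whose contents changed (for rarely-changing files, either is
# a red flag worth investigating).
def compare(old, new):
    missing = sorted(p for p in old if p not in new)
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    return missing, changed
```

In practice you would run `manifest()` on the pool, store the result off-pool, and re-run/`compare()` on a schedule, alerting on any non-empty result for files that should be static.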


Recommended Posts


MitchC, first of all thank you for posting this! My (early a.m.) thoughts:

  • (summarised) "DrivePool does not properly notify the Windows FileSystem Watcher API of changes to files and folders in a Pool."

If so, this is certainly a bug that needs fixing. Indicating "I changed a file" when what actually happened was "I read a file" could be bad or even crippling for any cohabiting software that needs to respond to changes (as per your example of Visual Studio), as could neglecting to say "this folder changed" when a file/folder inside it is changed.

  • (summarised) "DrivePool isn't keeping FileID identifiers persistent across reboots, moves or renames."

Huh. Confirmed, and as I understand it the latter two should be persistent @Christopher (Drashna)? However, attaining persistence across reboots might be tricky given a FileID is only intended to be unique within a volume while a DrivePool file can at any time exist across multiple volumes due to duplication and move between volumes due to balancing and drive replacement. Furthermore as Microsoft itself states "File IDs are not guaranteed to be unique over time, because file systems are free to reuse them". I.e. software should not be relying solely on these over time, especially not backup/sync software! If OneDrive is actually relying on it so much that files are disappearing or swapping content then that would seem to be an own-goal by Microsoft. Digging further, it also appears that FileID identifiers (at least for NTFS) are not actually guaranteed to be collision-free (it's just astronomically improbable in the new 64+64bit format as opposed to the old but apparently still in use 16+48bit format).

  • (quote) "the FileID numbers given out by DrivePool are incremental and very low.  This means when they do reset you almost certainly will get collisions with former numbers."

Ouch. That's a good point. Any suggestions for mitigation until a permanent solution can be found? Perhaps initialising DrivePool's FileID counter using the system clock instead of initialising it to zero, e.g. at 100ns increments (FILETIME) even only an hour's uptime could give us a collision gap of roughly thirty-six billion?
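Checking that arithmetic with a quick sketch (FILETIME ticks are 100 ns, so a clock-seeded counter advances 10 million ticks per second of uptime):

```python
# FILETIME resolution: one tick per 100 ns => 10,000,000 ticks per second.
TICKS_PER_SECOND = 10_000_000

# After one hour of uptime, a clock-seeded counter would sit roughly this
# far above zero, i.e. the gap before colliding with a freshly-reset
# incremental counter.
gap_after_one_hour = 3600 * TICKS_PER_SECOND
assert gap_after_one_hour == 36_000_000_000   # ~ thirty-six billion
```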

  • (quote) "I would avoid any file backup/synchronization tools and DrivePool drives (if possible)."

I disagree; rather, I would opine that any backup/synchronization tool that relies solely on FileID for comparisons should be discarded (if possible); a metric that's not reliable over time should ipso facto not be trusted by software that needs to be reliable over time. EDIT 2024-10-22: However, as MitchC has pointed out, determining whether your tools are using FileID can be difficult and the risk of finding out the hard way is substantial.

Incidentally, on the subject of file hashing I recommend ensuring Manage Pool -> Performance -> Read striping is un-ticked as I've found intermittent hashing errors in a few (not all) third-party tools when this is enabled; I don't know why this happens (maybe low-level disk calls that aren't compatible with non-physical volumes?) but disabling read-striping removes the problem and I've found the performance hit is minor.

Edited by Shane
Quote

Furthermore as Microsoft itself states "File IDs are not guaranteed to be unique over time, because file systems are free to reuse them". 

So this is correct, as the documentation you linked to states.  One thing I mentioned, though, is that even if an ID can be reused, if in practice it isn't, software may make the wrong assumption that it won't be.  That's not good on that software's part, but it may be a practical expectation one might try to meet.  Further, that documentation also states:

"In the NTFS file system, a file keeps the same file ID until it is deleted. "

As DrivePool identifies itself as NTFS it is breaking that expectation.

Quote

Any suggestions for mitigation until a permanent solution can be found?

I am not sure how well things work if you just disable FileIDs; maybe software will fall back to a safer behavior (even if less performant).  In addition, I think the biggest issue is silent file corruption, and I think that can only happen due to FileID collisions (rather than just the FileID changing).  It is a 128-bit number; GUIDs are 128 bits.  Just randomize the sucker the first time you assign a FileID (rather than using the current incremental behavior).  Aside from being more thread-safe, as you don't have a single locked increment counter, it is highly unlikely you would hit a collision.  Could you run into a duplicate? Sure.  Likely? Probably not.  Maybe over many reboots (or whatever else resets the IDs in DrivePool), but as long as whatever app uses the FileID has detected it is gone before it is reused, an eventual collision would likely not have much effect.  Not perfect, but probably an easier solution.  Granted, apps like OneDrive may still think all the files are deleted and re-upload them if the FileIDs change (although that may be more likely due to the notification bug).
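A minimal sketch of the "just randomize it" idea (pure illustration, nothing to do with DrivePool's actual internals): each thread draws random 128-bit IDs independently, so there is no shared increment counter to contend over, and duplicates among even hundreds of thousands of IDs are vanishingly unlikely (~n²/2¹²⁸).

```python
import secrets
import threading

ids = []
lock = threading.Lock()

def assign(n):
    # Random draws need no shared counter, so no contention here...
    local = [secrets.randbits(128) for _ in range(n)]
    with lock:
        ids.extend(local)   # ...only collecting the results is shared

threads = [threading.Thread(target=assign, args=(50_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(ids) == 200_000
assert len(set(ids)) == 200_000   # no collisions, in practice ever
```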

Quote

I disagree; rather, I would opine that any backup/synchronization tool that relies solely on FileID for comparisons should be discarded (if possible); 

Sure.  Except one doesn't always know how tools work.  I am only making a highly educated guess that this is what OneDrive is using, and only after significant file corruption and research.  One would hope you don't need corruption before figuring out that the tool you are using relies on the FileID.  In addition, FileID may not be the primary mechanism a backup/sync tool uses; something like the USN Journal may be a much more common first choice, with the tool only falling back to other options when that is not available.

Is it possible the 5-6 apps I have found that run into issues are the only ones out there that use these things? Sure.  I would just guess I am not that lucky, so there are likely many more that use these features.

 

I did see either you (or someone else) post about the file hashing issue with read striping.  It is a big shame; reporting data corruption (invalid hash values, or rather returning the wrong read data, which is what would lead to that) is another fairly massive problem.  Marking good data bad because of an inconsistent read can lead to someone thinking they lost data and trashing it, or restoring an older version in an attempt to fix things and losing newer data.  I would look into a more consistent read-striping repro test, but at the end of the day these other issues stop me from being able to use DrivePool for most things I would like to.

4 hours ago, Shane said:

If so, this is certainly a bug that needs fixing

Sorry, I should also mention this is confirmed by StableBit and can be easily reproduced.  The attached PowerShell script is a basic example of the file monitoring API.  Run it with "watcher.ps1 my_folder", where my_folder is the folder you want to monitor.  Have a file, say hello.txt, inside it.  Open that file in Notepad: it should instantly generate a file-change monitoring event.  Further, tab away from Notepad and tab back to it; you will again get a changed event for that file.  Run the same thing on a true NTFS system and it will not do the same.

You can also reproduce the lack of notifications for other events by changing the IncludeSubdirectories variable in it and doing some of the tests I mention above.

watcher.ps1

Quote

It depends what the application uses the FileID for but it may assume the FileID should stay the same when a file moves, as it doesn't with DrivePool this might mean it reads or backs up, or syncs the entire file again if it is moved (perf issue).

Very well explained, Mitch. I just discovered this issue as well while trying to work out why my installation of FreeFileSync wasn't behaving as expected.

DrivePool indeed changes the FileID every time a file is renamed or moved, which is not correct NTFS behaviour.

The result is that if I move, say, 100GB of data on my DrivePool from one folder to another (or rename a large group of files), when I run FreeFileSync for backup, instead of mirroring the file moves or renames it needs to delete and recopy every moved or renamed file.  Over the network this can take hours instead of less than a second, so the impact is substantial.


Thanks for the reply Chris.

Note: the beta does not fix the FileID change on rename or copy issue.

I have posted your comment on the FreeFileSync forums and will see if Object ID is an option for consideration there.

Meanwhile, I'd still think it would be better if the FileID behaved more like on regular NTFS volumes and stayed persistent.

From the same document you referenced....

Quote

In the NTFS file system, a file keeps the same file ID until it is deleted. You can replace one file with another file without changing the file ID by using the ReplaceFile function. However, the file ID of the replacement file, not the replaced file, is retained as the file ID of the resulting file.

It mentions that with the FAT file system it is not safe to assume the file ID will not change over time, but with NTFS it appears the file ID is indeed persistent for the life of the file.


 

Quote

 

Zenju, 12 Dec 2023, 14:33

Let's see...
File object IDs are supported only on NTFS volumes

But for NTFS we have:
In the NTFS file system, a file keeps the same file ID until it is deleted

Ergo:
Object ID = useless
File ID = perfect as persistent file identifier (on NTFS).

 

Response from Freefilesync developer. 
 
I read through the Microsoft docs you posted earlier, and others, and I agree with the FreeFileSync developer. 
It appears the best way to track all files on an NTFS volume is to use the FileID, which is expected to stay persistent. This requires no extra overhead or work, as the filesystem maintains FileIDs automatically. 
ObjectID requires extra overhead and is only really intended to track special files, like shortcuts, for link tracking etc.
 
Any software that is emulating an NTFS system should therefore provide FileID’s and guarantee they stay persistent with a file on that volume.
 
I am seeing the direct performance impact from this and agree with Mitch that there can be other adverse side effects, potentially much worse than performance issues, if someone uses software that expects FileIDs to behave as per Microsoft's documentation. 
 
Finally, also note that ObjectID is not supported by the ReFS filesystem, whereas FileID is:
"ReFS doesn't support object IDs. ReFS uses 128-bit file IDs, so can't cleanly distinguish between file ID versus object ID when processing an open by ID."

 


FWIW, digging through Microsoft's documentation, I found these two entries in the file system protocols specification:

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/2d3333fe-fc98-4a6f-98a2-4bb805aff407

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/98860416-1caf-4c80-a9ab-8d61e1ccf5a5

In short, if a file system cannot provide a file ID that is both unique within a given volume and stable until deleted, then it must set the field to either zero (indicating the file system does not support file IDs) or maxint (indicating the file system cannot give a particular file a unique ID), as per the specification.
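That convention can be sketched as a tiny classifier (the function and constant names here are mine, not from the spec): a 64-bit file ID field of zero means "file IDs unsupported", all-ones means "no unique ID for this file", and anything else is a usable ID.

```python
# Sentinel values for a 64-bit file ID field, per the convention above.
FILE_ID_NOT_SUPPORTED = 0
FILE_ID_NOT_UNIQUE = 0xFFFFFFFFFFFFFFFF   # maxint for a 64-bit field

def classify_file_id(file_id):
    if file_id == FILE_ID_NOT_SUPPORTED:
        return "file system does not support file IDs"
    if file_id == FILE_ID_NOT_UNIQUE:
        return "file system cannot give this file a unique ID"
    return "usable file ID"

assert classify_file_id(0) == "file system does not support file IDs"
assert classify_file_id(2**64 - 1) == "file system cannot give this file a unique ID"
assert classify_file_id(12345) == "usable file ID"
```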


Great finds by @Shane. One more fantastic one: 
https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/d4bc551b-7aaf-4b4f-ba0e-3a75e7c528f0#Appendix_A_10

A table of the 7 file systems normally considered shows that all of them support File IDs; however, per https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_objectid_information :

"File object IDs are supported only on NTFS volumes".

In terms of the beta, from the vague description it seems there was another problem where opening a file, even with the current correct FileID, could lead to an invalid memory access error, which may be what caused my BSODs related to this.  It does not sound like they fixed any of the items mentioned here with FileIDs or notifications.


Thanks Christopher for posting here; I would have certainly missed it.  This really has all sorts of implications, including data leakage where a file someone was given access to is overwritten by the contents of another file they should not have access to when the application uses FileIDs.


In terms of Christopher's first post, I think the quoting got a bit screwed up, mixing quotes and responses:

I believe your first point is that apps are wrong to rely on FileIDs because Microsoft says they may change, per the documentation at https://learn.microsoft.com/en-us/windows/win32/api/fileapi/ns-fileapi-by_handle_file_information.  As I mention above, it is correct that it states that; it then goes on to state "In the NTFS file system, a file keeps the same file ID until it is deleted."  DrivePool emulates NTFS, therefore this being true is not only likely an assumed fact by developers but a reasonable assumption, given Microsoft states exactly that.  If DrivePool were not identifying as NTFS it might be a different story (but NTFS is still the de facto standard on Windows, so developers may just assume it holds everywhere, even if incorrectly).

There are 3 file identifier methods commonly used, I believe:
1) GetFileInformationByHandle: nFileIndexLow/nFileIndexHigh
2) the newer GetFileInformationByHandleEx with the FileIdInfo information class, returning the FILE_ID_INFO struct
3) the object ID methods you mentioned, via ZwQueryDirectoryFile/DeviceIoControl


We have discussed one (per above).
The second:
https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_id_info states "The file identifier and the volume serial number uniquely identify a file on a single computer", although in reality it uses the same info as #1, just composited into a single FILE_ID_128 field.  It can be computed from nFileIndexLow/nFileIndexHigh on normal NTFS systems.
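To illustrate that relationship, here is a sketch of the bit math only (my own illustration, with made-up index values; this assumes a plain NTFS volume, where the 128-bit ID is just the 64-bit ID zero-extended):

```python
def file_id_64(n_file_index_low: int, n_file_index_high: int) -> int:
    """Method #1: combine the two DWORDs from GetFileInformationByHandle."""
    return (n_file_index_high << 32) | n_file_index_low

def file_id_128(file_id: int) -> bytes:
    """Method #2: on plain NTFS the FILE_ID_128 in FILE_ID_INFO carries the
    same 64-bit value, zero-extended to 16 bytes (little-endian)."""
    return file_id.to_bytes(16, "little")

fid = file_id_64(0x0001E240, 0x0005)        # hypothetical index values
assert fid == 0x5_0001E240
assert file_id_128(fid)[8:] == bytes(8)     # upper 64 bits stay zero on NTFS
```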

Finally, object ID.  This is a different identifier; the problem is that neither File ID nor Object ID is guaranteed to be supported by a file system.  In NTFS both are possible, and Windows guesses that if the number is bigger than 64 bits it's an object ID and not a file ID.  In ReFS, object IDs do not exist and are not supported, but it does support File IDs.  FAT32 also doesn't support object IDs. 


The other issue with object IDs? They are not really designed to be a primary application API.  They are queryable via two primary methods, I believe: ZwQueryDirectoryFile (a literal kernel call) and the DeviceIoControl route you linked to.

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_objectid_information

DeviceIoControl overall is "Sends a control code directly to a specified device driver, causing the corresponding device to perform the corresponding operation."  A user level application generally should not need to be interacting directly with a device driver.

File IDs can be operated on with file handles; object IDs require getting the device handle and going that route.

In short:

*) Per Shane's great finds, all the file systems that support File IDs should keep them unique and stable
*) The specification for file systems identifying as NTFS is that the File ID is stable, unique, and does not change until the file is deleted
*) Object IDs are not supported on multiple file systems (of the 7 major systems tested, only NTFS fully supports them)
*) File IDs have been the de facto standard on Windows for quite some time

Saying "use object IDs instead" isn't a good solution.  Let's assume StableBit was correct and File IDs for NTFS are not stable/unique: even then, most people are not the application developer, and changing the application is not likely.  Microsoft has teams that work on SharePoint and OneDrive; I don't think messaging them to say "StableBit requires you to change the application to use object IDs" is going to gain much ground.  Still, given the above, we can see that for NTFS it is incorrect anyway to say the File ID is not stable.

This bug causes excessive data transfer, corruption, data loss, and data leakage with an unknown number of applications.  The problem is heavily compounded for continuous-monitoring applications, because DrivePool incorrectly reports file changes to listeners when a file is only read-accessed.  A listener that checks the file it was notified about, sees the ID has changed, and then opens the file by that ID can do god knows what with the wrong information/data.  The final compounding problem is that DrivePool doesn't use remotely unique File IDs, basically guaranteeing collisions with previous File IDs: it essentially starts counting from 1 and issues from there.  It's a 64-bit field; give out a random 64-bit number.  That wouldn't solve the excessive data transfer, but it would have greatly reduced the horrific impacts of this bug.
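To put numbers on "remotely unique": here's a quick birthday-bound sketch (my own illustration, not StableBit code) of why randomly drawn 64-bit IDs essentially never collide, whereas a counter reset to 1 every boot re-issues the same low IDs every time:

```python
import math

def collision_probability(n_files: int, id_bits: int = 64) -> float:
    """Birthday approximation: P(any collision) ~= 1 - exp(-n^2 / 2^(bits+1))."""
    return 1.0 - math.exp(-(n_files ** 2) / float(2 ** (id_bits + 1)))

# Even ten million random 64-bit IDs collide with probability around 3 in a million.
assert collision_probability(10_000_000) < 1e-5
```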

Link to comment
Share on other sites

  • 1

"Am I right in understanding that this entire thread mainly evolves around something that is probably only an issue when using software that monitors and takes action against files based on their FileID? Could it be that Windows apps are doing this?"

This thread is about two problems: DrivePool incorrectly sending file/folder change notifications to the OS and DrivePool incorrectly handling file ID.

From what I can determine the Windows/Xbox App problem is not related to the above; an error code I see is 0x801f000f which - assuming I can trust the (little) Microsoft documentation I can find - is ERROR_FLT_DO_NOT_ATTACH, "Do not attach the filter to the volume at this time." I think that means it's trying to attach a user mode minifilter driver to the volume and failing because it assumes that volume must be on a single, local, physical drive?
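For reference, that error code splits into the standard HRESULT bit fields; a quick sketch (facility 0x1F being, I believe, the filter-manager facility):

```python
def split_hresult(hr: int) -> tuple:
    """Break a Windows HRESULT into (severity, facility, code) bit fields."""
    severity = (hr >> 31) & 0x1      # 1 = failure
    facility = (hr >> 16) & 0x7FF    # originating subsystem
    code = hr & 0xFFFF               # subsystem-specific error number
    return severity, facility, code

# 0x801F000F -> failure, facility 0x1F (filter manager), code 0xF
assert split_hresult(0x801F000F) == (1, 0x1F, 0x000F)
```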

TLDR if you want to use problematic Windows/Xbox Apps with DrivePool, you'd have to use CloudDrive as an intermediary (because that does present itself as a local physical drive, whereas DrivePool only presents itself as a virtual drive).

"But this still worries me slightly; who's to say e.g. Plex won't suddenly start using FileID and expect consistency you get from real NTFS?"

Well nobody except Plex devs can say that, but if they decide to start using IDs one hopes they'll read the documentation as to exactly when it can and can't be relied upon. Aside, it remains my opinion that applications, especially for backup/sync, should not trust file ID as a sole identifier of (persistent) uniqueness on a NTFS volume, it is not specced for that, that's what object ID is for, and while its handling of file ID is terrible DrivePool does appear to be handling object ID persistently (albeit it still has some problems with birthobject ID and birthvolume ID, however it appears to use zero appropriately for the former when that happens).

P.S. "I have had server crashes lately when having heavy sonarr/nzbget activity. No memory dumps or system event logs happening" - that's usually indicative of a hardware problem, though it could be low-level drivers. When you say heavy, do you mean CPU, RAM or HDD? If the first two, make sure you have all over-clocking/volting disabled, run a memtest and see if problem persists? Also a new stable release of DrivePool came out yesterday, you may wish to try it as the changelog indicates it has performance improvements for high load conditions.


  • 1
8 hours ago, Shane said:

This thread is about two problems: DrivePool incorrectly sending file/folder change notifications to the OS and DrivePool incorrectly handling file ID.

As I interpreted it, the first is largely caused by the second.

8 hours ago, Shane said:

From what I can determine the Windows/Xbox App problem is not related to the above; an error code I see is 0x801f000f which - assuming I can trust the (little) Microsoft documentation I can find - is ERROR_FLT_DO_NOT_ATTACH, "Do not attach the filter to the volume at this time." I think that means it's trying to attach a user mode minifilter driver to the volume and failing because it assumes that volume must be on a single, local, physical drive?

TLDR if you want to use problematic Windows/Xbox Apps with DrivePool, you'd have to use CloudDrive as an intermediary (because that does present itself as a local physical drive, whereas DrivePool only presents itself as a virtual drive).

Interesting. I won't consider that critical, for me, as long as it creates a gentle userspace error and won't cause covefs to bsod.

8 hours ago, Shane said:

Well nobody except Plex devs can say that, but if they decide to start using IDs one hopes ... should 

That's kind of my point. Hoping and projecting what should be done, doesn't help anyone or anything. Correctly emulating a physical volume with exact NTFS behavior, would. I strongly want to add I mean no attitude or any kind of condescension here, but don't want to use unclear words either - just aware how it may come across online. As a programmer working with win32 api for a few years (though never virtual drive emulation) I can appreciate how big of a change it can be to change now. I assume DrivePool was originally meant only for reading and writing media files, and when a project has gotten as far as this has, I can respect that it's a major undertaking - in addition to mapping strict NTFS proprietary behavior in the first place - to get to a perfect emulation.

8 hours ago, Shane said:

When you say heavy, do you mean CPU, RAM or HDD? If the first two, make sure you have all over-clocking/volting disabled, run a memtest and see if problem persists? Also a new stable release of DrivePool came out yesterday, you may wish to try it as the changelog indicates it has performance improvements for high load conditions.

It's just a particular hard case of figuring out which hardware is bugging out. I never overclock/volt in BIOS - I'm very aware of its pitfalls and also that some MB may do so by default - it's checked. If it was a kernel space driver problem I'd be getting bsod and minidumps, always. But as the hardware freezes and/or turns off... smells like hardware issue. RAM is perfect, so I'm suspecting MB or PSU. First I'll try to see if I can replicate it at will, at that point I'd be able to push in/outside the pool to see if DP matters at all. But this is my own problem... Sorry for mentioning it here.

 

Thank you for taking the time to reply on a weekend day. It is what it is, I suppose.


  • 1
19 hours ago, Thronic said:

That's kind of my point. Hoping and projecting what should be done, doesn't help anyone or anything. Correctly emulating a physical volume with exact NTFS behavior, would. I strongly want to add I mean no attitude or any kind of condescension here, but don't want to use unclear words either - just aware how it may come across online. As a programmer working with win32 api for a few years (though never virtual drive emulation) I can appreciate how big of a change it can be to change now. I assume DrivePool was originally meant only for reading and writing media files, and when a project has gotten as far as this has, I can respect that it's a major undertaking - in addition to mapping strict NTFS proprietary behavior in the first place - to get to a perfect emulation.

As I understood it the original goal was always to aim for having the OS see DrivePool as much of a "real" NTFS volume as possible. I'm probably not impressed nearly enough that Alex got it to the point where DrivePool became forget-it's-even-there levels of reliable for basic DAS/NAS storage (or at least I personally know three businesses who've been using it without trouble for... huh, over six years now?). But as more applications exploit the fancier capabilities of NTFS (if we can consider File ID to be something fancy) I guess StableBit will have to keep up. I'd love a "DrivePool 3.0" that presents a "real" NTFS-formatted SCSI drive the way CloudDrive does, without giving up that poolpart readability/simplicity.

On that note while I have noticed StableBit has become less active in the "town square" (forums, blogs, etc) they're still prompt re support requests and development certainly hasn't stopped with beta and stable releases of DrivePool, Scanner and CloudDrive all still coming out.

Dragging things back on topic: if there are any beta updates re File ID, I'll certainly be posting my findings.


  • 1

Shane, as always, has done a great job summarizing everything and I certainly agree with most of it.  I do want to provide some clarification, and also differ on a few things:

*) This is not about DrivePool being required to precisely emulate NTFS and all its features; that is probably never going to happen.  At best DrivePool may be able to provide a driver-level drive implementation that could allow it to be formatted the way Shane describes CloudDrive doing.  One of the things that makes this critical bug worse is the fact that DrivePool specifically doesn't implement VSS or similar.

*) The two issues here are not the same, and one is not causing the other.  They are distinct, but the incorrect file-changed bug makes the FileID problem potentially much worse (or, in unlucky situations, maybe possible at all).  Merely browsing a folder can cause file change notifications to fire on the files in it in certain situations.  This means an application listening to the notifications would believe unmodified files have been modified.  It is possible that if this bug did not exist, only written files would have the potential for corruption, rather than all files.


These next two points are not facts but IMO:

*) DrivePool claims to be NTFS; if it cannot support certain NTFS features, it should break them as cleanly as possible (not as compatibly as possible, as it might currently).  FileID support should be as disabled as DrivePool can make it, with open-by-File-ID clearly banned.  I don't know what would happen if FileID returned 0 or claimed to be unavailable even though the volume identifies as NTFS.  There are things DrivePool could potentially do to minimize the fatal damage this FileID bug can cause (i.e. not resetting the counter to zero), but honestly, even then, all FileID support should be as turned off as possible.  If a user wants to enable these features, DrivePool should provide a massive disclaimer about the possible damage this might cause.

*) DrivePool has an ethical responsibility to its users that it is currently violating.  It has a feature that can cause massive performance problems, data loss, and data corruption, and it has other bugs that accelerate these issues.  DrivePool is aware of this; they should warn users of these features that unexpected behavior and possibly irreversible damage could occur.  It annoys me how much effort I had to exert to research this bug.  As a developer, if I had a file system product users were paying for and it could cause silent corruption, I would find this highly disturbing and do what I could to protect my users.  It is critical to remember this can result in corruption of the worst kind: corruption that normal health monitoring tools would not detect (files can still be read and written), but that can corrupt files which are not being 'changed', in the background, at random rates.  It wouldn't matter if you kept daily backups for 6 months; if you didn't detect this for 9 months, you would have archived the corruption into those backups and have no way of recovering that data.  It can happen slowly, and literally only validating file contents against some known-good copy would reveal it.  Now, StableBit may feel they skirt some of the responsibility since they don't do the corruption directly: some other application, relying on DrivePool's drive behaving like the NTFS volume it claims to be, performs the writes that cause the data loss.  The problem is that DrivePool's incorrect implementation is the direct reason this corruption occurs, and the applications that trigger it are not doing anything wrong.


  • 1

Thanks @MitchC

Maybe they could pass through the underlying FileID and changed/unchanged attributes from the drives where the files actually are - they are on a real NTFS volume, after all. Trickier with duplicated files, though...

Just a thought, not gonna pretend I have any solutions to this, but it has certainly caught my attention going forwards. Does using CloudDrive on top of DrivePool have any of these issues? Or does that indeed act as a true NTFS volume?


  • 1
3 hours ago, JC_RFC said:

I just don’t understand why they can’t map the last 64 bits of the object-id to the file-id. This seems like a very simple fix to me? If the 128-bit object-id is unique for every file then the last 64 bits are as well. It just means a limit of 18,446,744,073,709,551,615 unique files for the volume, as per NTFS.

Object-id follows RFC 4122 so expect the last 52 bits of those last 64 bits to be the same for every file created on a given host and the first 4 of those last 64 bits to be the same for every file created on every NTFS volume. You'd want to use the 60 bits corresponding to the object-id's creation time and, hmm, I want to say the 4 least significant bits of the clock sequence section? The risk here would be if the system's clock was ever faulty (e.g. dead CMOS battery and/or faulty NTP sync) during object-id creation but that's a risk you're supposed to monitor if you're using object-id anyway.
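That bit layout is easy to poke at with Python's stdlib uuid module (a sketch of my own, just illustrating which parts of a version-1 RFC 4122 ID actually vary on one host):

```python
import uuid

# Two version-1 UUIDs minted back to back on the same host:
a, b = uuid.uuid1(), uuid.uuid1()
assert a.version == b.version == 1

# The 48-bit node field is the same for every ID this host creates...
assert a.node == b.node
# ...so the per-file variation lives in the 60-bit timestamp (plus clock seq).
assert b.time >= a.time
```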

First catch would be the overhead. Every file in NTFS automatically gets a file-id as part of being created by the OS and it's a computationally simple process; creating an object-id (and associated birthobject-id, birthvolume-id and domain-id) is both optional and computationally more complex. That said, it is used in corporate environments (e.g. for/by Distributed Link Tracking and File Replication Service); I'd be curious as to how significant this overhead actually is.

Second catch would again be overhead. DrivePool would have to ensure that all duplicates have the same birthobject-id and birthvolume-id, with queries to the pool returning the birthobject-id as the pool's object-id, I think... which either way means another subroutine call upon creating each duplicate. Again, I don't know how significant the overhead here would be.

But together they'd certainly involve more overhead than just "hey grab file-id". How much? Dunno, I'm not a virtual file system developer.

... I'd still want to beta test (or even alpha test) it. :)


  • 1
On 10/14/2024 at 10:24 AM, Shane said:

Note that if any application is using File ID to assume "this is the same file I saw last whenever" (rather than "this is probably the same file I saw last whenever") for any volume that has users or other applications independently able to delete and create files, consider whether you need to start looking for a replacement application. While the odds of collision may be extremely low it's still not what File ID is for and in a mission-critical environment it's taunting Murphy.

Sorry, but even in my mission-not-important environment I am not a fan of data loss or leakage.  Also, "extremely low" is an understatement.  NTFS supports 2^32 possible files on a drive.  The MFT file index is actually a 48-bit entry, which means you could max out your MFT records 65K times before it needs to loop around.  The sequence number (how many times that specific MFT record has been reused) is an additional 16 bits on its own, so even if you could delete and reallocate a file to the exact same MFT record, you would still need to do so with that specific record 65K times.  If an application is monitoring for file changes, hopefully it catches one of those :)
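The layout being described is a simple bit split; a sketch (assuming a standard NTFS 64-bit file ID, with a made-up example value):

```python
def split_ntfs_file_id(file_id: int) -> tuple:
    """NTFS packs the 48-bit MFT record index in the low bits and the 16-bit
    sequence number (incremented each time the record is reused) in the high bits."""
    record_index = file_id & 0xFFFF_FFFF_FFFF
    sequence = file_id >> 48
    return record_index, sequence

# Hypothetical ID: MFT record 5, reused twice (sequence 2).
assert split_ntfs_file_id((2 << 48) | 5) == (5, 2)
```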

It is nearly impossible to know how an application may use FileID, especially as it may only be used as a fallback due to other features DrivePool does not implement, or maybe combined with something else.  Say an application knows file 1234 and on startup it checks file 1234: if that file exists, it can be near-positive it's the same file; if it is gone, it simply removes file 1234 from its known files, and by the time 1234 is reused it hasn't known about it in forever.

The problem here is not necessarily when FileIDs change; I'd wager most applications could probably handle file IDs changing even though the file has not (you might get extra transfer, back up extra data, or lose performance temporarily).  It is the FileID reuse that leads to the worst effects of data loss, data leakage, and corruption.  The file ID is 64 bits; the max file count is 32 bits (and realistically most people probably have a good bit fewer than 4 billion files).  DrivePool could randomly assign file IDs willy-nilly every boot and probably cause far fewer disasters.  DrivePool could likely use the underlying FileIDs through some black-magic hackery.  The MFT counter is 48 bits, and I doubt those top 9 bits are touched on most normal systems.  If DrivePool assigned an incremental number to each drive and then overwrote those 9 bits of the underlying system's FileID with the drive ID, you would support 512 hard drives in one pool and still have nearly the same near-zero FileID collision rate, while also having a stable file ID.  It would only change the FileID if a file moved in the background from one drive to another (and not just mirrored).  It could even keep it the same with a zero-byte ID file left behind in a ghost folder if so desired, but the file ID changing is probably far less of a problem.  A backup-restore program that deleted the old file and created it again would also change the FileID, and I doubt that causes issues.
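A rough sketch of that drive-stamping suggestion (my own illustration, not anything DrivePool actually does): keep the underlying drive's file ID but overwrite the rarely used top 9 bits of the 48-bit record index with a per-drive number.

```python
DRIVE_BITS = 9                      # 2**9 = 512 drives per pool
LOW_BITS = 48 - DRIVE_BITS          # 39 bits of MFT record index kept

def pool_file_id(underlying_id: int, drive_index: int) -> int:
    """Stamp drive_index into the top bits of the record index, keeping the
    sequence number and low record bits from the real NTFS file ID."""
    assert 0 <= drive_index < (1 << DRIVE_BITS)
    sequence = underlying_id >> 48
    record_low = underlying_id & ((1 << LOW_BITS) - 1)
    return (sequence << 48) | (drive_index << LOW_BITS) | record_low

# Record 7, sequence 3, on pool drive 1:
assert pool_file_id((3 << 48) | 7, 1) == (3 << 48) | (1 << 39) | 7
```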

That said, it is not really my job to figure out how to solve this problem in a commercial product.

As you mentioned back in December, it is unquestionable that DrivePool is doing the wrong thing:

On 12/12/2023 at 8:01 PM, Shane said:

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/2d3333fe-fc98-4a6f-98a2-4bb805aff407

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/98860416-1caf-4c80-a9ab-8d61e1ccf5a5

In short, if a file system cannot provide a file ID that is both unique within a given volume and stable until deleted, then it must set the field to either zero (indicating the file system does not support file IDs) or maxint

It uses MUST, in caps.

My problem isn't that this bug exists (although that sucks). My problem is that this has been, and continues to be, handled exceptionally poorly by StableBit, even though it can pose significant risk to users without them even knowing it.  I likely spent more of my time investigating their bug than they have.  We are literally looking at nearly two years since my initial notification, and users can make the same mistakes now as back then, despite the fact that they could be warned or prevented from doing so.


  • 1

I agree with you, Mitch, and likely the solution for me will be to dissolve my pool and go back to multiple drives. I recently reduced down to 2 large drives anyway (+1 parity for SnapRAID), so it's no longer worth the hassle of maintaining a DrivePool for me. If this issue were sorted I would leave it set up, but the negatives are outweighing the positives for me now.

For the record, I pursued StableBit for some time last year and attach their outcome here:

https://stablebit.com/Admin/IssueAnalysis/28847

Which, in summary, was: no problem here, nothing wrong with what we are doing, we are following recommended practice.

I don't agree with their assessment, or with StableBit's interpretation of Microsoft's SHOULD, which in my mind only says "should" because support depends on the file system, as per the table below, where some file systems don't have a unique and stable file_id, but NTFS certainly does. In any case, a good developer SHOULD aspire to do better, and if you are trying to emulate something that should do something, then if it were me, I would do it, not treat the wording as an excuse not to.

64 bit file ID | Generate | Stable | Unique
---------------|----------|--------|-------
FAT            | Yes      | No     | No
EXFAT          | Yes      | No     | No
FAT32          | Yes      | No     | No
Cdfs           | No       | n/a    | n/a
UDFS           | Yes      | Yes    | Yes
NTFS           | Yes      | Yes    | Yes
ReFS           | Yes      | Yes    | Yes

Anyway, as you said: to all users of DrivePool who deploy some application that utilizes file_id, be very careful. It will not behave the same as an NTFS volume.


  • 1
2 hours ago, Shane said:

currently the most we can do is to change the Override value for CoveFs_OpenByFileId from null to false (see Advanced Settings). At least as of this post date it doesn't fix the File ID problem, but it does mean affected apps should either safely fall back to alternative methods

Mostly.  As I think you mentioned earlier in this thread, that doesn't disable FileIDs, and applications could still get the FileID of a file.  Depending on how that ID is used, it could still cause issues.  An example below is SnapRAID, which doesn't use OpenByFileID but does trust that the same FileID is the same file.

2 hours ago, JC_RFC said:

What is being warned of here though is if you use any special applications that might expect FileID

For the biggest problems (data loss, corruption, leakage) this is correct.  Of course, one generally can't know whether an application is using FileIDs (especially if it's not open source); it is likely not mentioned in the documentation.  It also doesn't mean your favorite app won't start to do so tomorrow, and then all of a sudden the application that worked perfectly for 4 years starts silently corrupting random data.  By far the most likely apps to do this are backup apps, data sync apps, cloud storage apps, and file sharing apps - things that have some reason to track which files are created/moved/deleted/etc.  

The other issue (and sure, if I could go back in time I would split this thread in two) - the change notification bugs in DrivePool - won't directly lead to data loss (although it can greatly speed up the process above).  It will, however, have the potential for odd errors and performance issues in a wide range of applications.  The file change API is used by many applications: not just the app types listed above (which often use it if they run 24/7) but any app that interfaces with many files at once (i.e. coding IDEs/compilers, file explorers, music or video catalogs, etc.).  This API is common, easy for developers to use, and can greatly increase app performance: instead of manually checking every file, an app can install one event listener on a parent directory, and even if it only cares about notifications for some of the files underneath, it can simply ignore the change events it doesn't care about.  It may be very hard to trace these performance issues or errors to DrivePool due to how they present themselves.  You are far more likely to think the application is buggy or at fault.

Short Example of Disaster

As it is a complex issue to understand I will give a short example of how FileIDs being reused can be devastating. 

Let's say you use Google Drive or some other cloud backup/sharing application, and it relies on the fact that as long as FileID 123 is around it always points to the same file.  This is all but guaranteed with NTFS.

You only use Google Drive to back up your photos from your phone, from your work camera, or what have you.  You have the following layout on your computer:

c:\camera\work\2021\OfficialWiringDiagram.png with file ID 1005

c:\camera\personal\nudes\2024Collection\VeryTasteful.png with file ID 3909

c:\work\govt\ClassifiedSatPhotoNotToPostOnTwitter.png with file ID 6050

You have OfficialWiringDiagram.png shared with the office, as it's an important reference any time someone tries to figure out where the network cables are going.

Enter DrivePool.  You don't change any of these files, but DrivePool generates a file-changed notification for OfficialWiringDiagram.png.  Google Drive says: OK, I know that file, I already have it backed up, and it has file ID 1005.  It then opens file ID 1005 locally, reads the new contents, and uploads them to the cloud, overwriting the old OfficialWiringDiagram.png.  Only problem: you rebooted, so 1005 was OfficialWiringDiagram.png before, but now file 1005 is actually your nude file VeryTasteful.png.  So it has just backed up your nude file to the cloud as "OfficialWiringDiagram.png" - and remember, that file is shared.  Next time someone goes to look at the office wiring diagram, they are in for a surprise.  Worse, depending on the application, suppose 'ClassifiedSatPhotoNotToPostOnTwitter.png' became file ID 1005: even though the change notification was for the path "c:\camera\work\2021\OfficialWiringDiagram.png", which is under the main folder it monitors ("c:\camera"), when the app opens file 1005 it now gets a file completely outside your camera folder, reads the highly sensitive file from c:\work\govt, and a file that should never be uploaded is shared with the entire office.

 

Now, you follow many best practices.  Google Drive is restricted to only the c:\camera folder; it doesn't back up or access files anywhere else.  You have a RAID 6 SSD setup in case of drive failure, and image files from prior years are never changed, so once written to the drive they are not likely to move unless the drive is defragmented, meaning a pretty low chance of conflicts or of some abrupt power failure corrupting them.  You even have a photo scanner that checks for corrupt photos, just to be safe.  None of these things will save you from the above example.  Even if you kept 6 months of backup archives offsite in cold storage (made perfectly and not affected by the bug) and all deleted files were kept for 5 years, if you reference OfficialWiringDiagram.png only once a year, you might not notice it was changed, and the original data overwritten, until after all your backups were corrupted with the nude, and the original file might be lost forever.
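The failure mode above boils down to a few lines; here's a toy simulation of it (hypothetical names and dictionaries standing in for the disk and the cloud service - obviously no real sync API):

```python
# The sync app's cache: FileID -> the path it believes that ID names.
id_cache = {1005: r"c:\camera\work\2021\OfficialWiringDiagram.png"}
cloud = {}  # what the service has uploaded/shared, keyed by path

def on_change_notified(file_id: int, disk: dict) -> None:
    # The app trusts the ID: re-reads "the same file" and re-uploads it.
    cloud[id_cache[file_id]] = disk[file_id]

# After a reboot the pool hands out IDs from 1 again, so 1005 now names
# a completely different file than the one the app cached.
disk_after_reboot = {1005: b"very tasteful private photo bytes"}
on_change_notified(1005, disk_after_reboot)   # spurious change notification

# The shared wiring-diagram link now serves the private photo.
assert cloud[id_cache[1005]] == b"very tasteful private photo bytes"
```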

FileIDs are generally better than relying on file paths.  If an application used file paths, then when you renamed or moved file 123 within the same folder, the link would break for anyone you had previously shared the file with.  If instead, when you rename "BobsChristmasPhoto.png" to "BobsHolidayPhoto.png", the application knows it is the same file being renamed because it still has file ID 123, it can silently update the sharing data on the backend so that when people click the existing link it still loads the photo.  Even an application using moderate de-duplication techniques, like hashing the file to tell if it has just moved, would, without File IDs, think a moved and slightly changed file (say you cleared the photo location metadata your phone put there) is an all-new file.

FileID collisions are not just possible but basically guaranteed with DrivePool.  With the change notification bug, a sync application might think all your files are changing often, as even reading a file or browsing a directory might trigger a notification that it has changed.  This means it backs up all those files again, which might be tens of thousands of photos.  Since every reboot changes the file IDs, if the app syncs a file after a reboot it may upload the wrong contents (having used the file ID), and if you had a second computer it downloaded that file to, you could put yourself in a never-ending loop of backups and downloads that overwrites one file with another at random.  As the FileID a file was last known by might not exist when the app goes to back it up (which I assume would trigger many applications to fall back to path validation), only part of your catalog would get corrupted each iteration.  The application might also validate that a renamed file stayed within the root directory it works with.  That means if your Christmas photo's file ID now pointed to something under "c:\windows", it would fall back to file paths, as it knows that is not under the "c:\camera" directory it works with.

This is not some hypothetical situation; these are actual occurrences and behaviors I have seen happen to files I have hosted on DrivePool.  These are not two-bit applications written by a one-person dev team; they are massively used first-party applications and commercial enterprise applications.

 

2 hours ago, haoma said:

This is very complicated stuff, should i stop using drive pool and wait for an update?

If you can, and you care about your data, I would.  The convenience of DrivePool is great, and there are countless users it works fine for (at least as far as they know), but even with high technical understanding it can be quite difficult to detect which applications are affected by this. 

If you thought you were safe because you use something like SnapRAID, it won't stop this sort of corruption.  As far as SnapRAID is concerned, you just deleted a file and renamed another on top of it.  SnapRAID may even contribute further to the problem, as it (like many tools) uses the Windows FileID as the Windows equivalent of an inode number: https://github.com/amadvance/snapraid/blob/e6b8c4c8a066b184b4fa7e4fdf631c2dee5f5542/cmdline/mingw.c#L512-L518 .  Applications assume inodes and FileIDs that are the same as before are the same file.  That is, unless you use DrivePool - oops.  
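That inode analogy is easy to see cross-platform; a sketch using Python's os.stat (st_ino maps to the file ID on Windows and the inode elsewhere), demonstrating the rename-stability invariant such tools assume of a real file system:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "BobsChristmasPhoto.png")
    new = os.path.join(d, "BobsHolidayPhoto.png")
    with open(old, "wb") as f:
        f.write(b"photo")

    ino_before = os.stat(old).st_ino
    os.rename(old, new)                        # rename in place...
    assert os.stat(new).st_ino == ino_before   # ...same identifier, same file
```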

Apps might use timestamps in addition to FileIDs, although timestamps can overlap; say you downloaded a zip archive and extracted it with the Windows native extractor (which, by design choice, ignores timestamps even when the zip contains them).

SnapRAID can even use some advanced checks when syncing, but in the worst case, where a file's content has actually changed but the FileID in question has the same size and timestamp, SnapRAID assumes it is unmodified and leaves the parity data alone.  This means that if you had two files with the same size and timestamp anywhere on the drive, and one of them ended up with the FileID of the other, incorrect parity data would be associated with that file.  Running a snapraid fix could then actually cause corruption, as SnapRAID would believe the parity data is correct while the on-disk content it thinks goes with it does not match.  Note: I don't use SnapRAID, but I was asked this question, and from reading the manual and the source linked above I believe this is technically correct.  It is great that SnapRAID is open source and has such thorough technical documentation; plenty of backup/sync programs don't, and with those you don't know what checking they do.
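As a toy illustration of that worst case (hypothetical code, not SnapRAID's actual logic), an ID+size+timestamp fast path cannot see the swap:

```python
def looks_unchanged(old, new):
    # Typical fast path: same FileID, size and mtime => skip rehashing
    # and leave the existing parity data alone.
    return (old["file_id"] == new["file_id"]
            and old["size"] == new["size"]
            and old["mtime"] == new["mtime"])

recorded = {"file_id": 7, "size": 4096, "mtime": 1700000000, "content": b"A" * 4096}
# A different file that inherited FileID 7 after a reboot, with a
# coincidentally identical size and timestamp:
on_disk = {"file_id": 7, "size": 4096, "mtime": 1700000000, "content": b"B" * 4096}

print(looks_unchanged(recorded, on_disk))  # True -- the parity is now silently stale
```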


I'll post it here too.  

There is a fix in the latest betas involving memory corruption of file IDs.  

However, ... the issue may also be that the wrong API is being used: 
 

Quote

... incorrectly using File IDs as persistent file identifiers, which they should not be. File IDs in Windows can change from time to time on some filesystems.

Source: https://learn.microsoft.com/en-us/windows/win32/api/fileapi/ns-fileapi-by_handle_file_information

The identifier that is stored in the nFileIndexHigh and nFileIndexLow members is called the file ID. Support for file IDs is file system-specific. File IDs are not guaranteed to be unique over time, because file systems are free to reuse them. In some cases, the file ID for a file can change over time.

If this is the case, then it is expected behavior.

The correct API to use to get a persistent file identifier is FSCTL_CREATE_OR_GET_OBJECT_ID or FSCTL_GET_OBJECT_ID: https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_create_or_get_object_id

Object IDs are persistent and do not change over time.

We support both Object IDs and File IDs.

 


Gonna wake this thread just because I think it may still be really important.

Am I right in understanding that this entire thread mainly revolves around something that is probably only an issue when using software that monitors and takes action on files based on their FileID? Could it be that Windows apps are doing this? A quote from Drashna in the XBOX thread: "The Windows apps (including XBOX apps) do some weird stuff that isn't supported on the pool." It seems weird to me; why take the risk of not doing whatever NTFS would normally do? Or is the exact behavior proprietary and not documented anywhere?

I only have media files on my pool, and have since 2014, without apparent issues. But this still worries me slightly; who's to say e.g. Plex won't suddenly start using FileID and expecting the consistency you get from real NTFS? The only issue I had once was rclone complaining about timestamps, but I think that was read-striping related.

I have had server crashes lately during heavy sonarr/nzbget activity. No memory dumps or system event logs, so it's hard to troubleshoot, especially since I can't easily trigger it on demand. The entire server usually freezes up: no response on USB ports or RDP, but the machine often still spins its fans until I hard reset it. Only a single time has it died entirely. I suspect something hardware related, but these things keep gnawing at my mind... what if it's covefs... I always thought it was exact NTFS emulation. I haven't researched DrivePool in recent years, but now I keep finding threads like this...

Seems to me we may have to be careful about what we use the pool for, and that not following strict NTFS behavior is conceptually dangerous. Who's to say Windows won't one day start using FileID for some kind of attribute or journaling background maintenance and cause mayhem on its own? If FileID is the path of least resistance for monitoring and dealing with files, we can only expect more and more software and low-level APIs to use it. 

This is making me somewhat uneasy about continuing to use DrivePool.

9 hours ago, MitchC said:

I don't know what would happen if FileID returned 0 or claimed not available on the system even though it is an NTFS volume.

I should think good practice would be to respect a zero value regardless (one should always default to failing safely), but the other option would be to return maxint which means "this particular file cannot be given a unique File ID" and just do so for all files (basically a way of saying "in theory yes, in practice no").

DrivePool does have an advanced setting, CoveFs_OpenByFileId; however, currently #1 it defaults to true, and #2 when set to false, querying a file name by file ID fails but querying a file ID by file name still returns the (broken) file ID instead of zero. I've just made a support request asking StableBit to fix that.

Note that if any application is using File ID to assume "this is the same file I saw last whenever" (rather than "this is probably the same file I saw last whenever") for any volume that has users or other applications independently able to delete and create files, consider whether you need to start looking for a replacement application. While the odds of collision may be extremely low it's still not what File ID is for and in a mission-critical environment it's taunting Murphy.

8 hours ago, Thronic said:

Maybe they could pass through the underlying FileID and changed/unchanged attributes from the drives where the files actually are - they are on a real NTFS volume after all. Trickier with duplicated files though...

A direct passthrough has the problem that any given FileID value is only guaranteed to be unique within a single volume while a pool is almost certainly multiple volumes. As the Microsoft 64-bit File ID (for NTFS, two concatenated incrementing DWORDs, basically) isn't that much different from DrivePool's ersatz 64-bit File ID (one incrementing QWORD, I think) in the ways that matter for this it'd still be highly vulnerable to file collisions and that's still bad.

... Hmm. On the other hand, if you used the most significant 8 of the 64 bits to instead identify the source poolpart within a pool, theoretically you could still uniquely represent all, or at least the majority, of files in a pool of up to 255 drives, so long as you returned maxint for any other files ("other files" being any query where the poolpart's File ID already sets any of those 8 bits, where the file resides only on the 256th or higher poolpart, or where every File ID returned by the first 255 poolparts is maxint), and still technically meet the specification for Microsoft's 64-bit File ID? I think it should at least "fail safely", which would be an improvement over the current situation?

Does that look right?
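As a sketch of that bit layout (purely hypothetical, my own illustration and not DrivePool's actual behavior):

```python
MAXINT64 = 0xFFFFFFFFFFFFFFFF  # sentinel: "no unique File ID available"

def pool_file_id(poolpart_index, underlying_id):
    """Fold a poolpart number (1-255) into the high byte of a 64-bit ID.
    Anything that cannot be represented unambiguously collapses to MAXINT64.
    (Note: poolpart 255 with an all-ones 56-bit ID would itself equal
    MAXINT64, so a real implementation would also reserve that one value.)"""
    if not 1 <= poolpart_index <= 255:
        return MAXINT64            # 256th-or-higher poolpart: not representable
    if underlying_id >> 56:        # underlying ID already uses the high byte
        return MAXINT64
    return (poolpart_index << 56) | underlying_id

def split_pool_file_id(pooled_id):
    """Recover (poolpart_index, underlying_id) from a representable ID."""
    return pooled_id >> 56, pooled_id & ((1 << 56) - 1)

print(hex(pool_file_id(3, 0x1234)))          # 0x300000000001234
print(pool_file_id(1, 1 << 60) == MAXINT64)  # True: falls back safely
```

Since NTFS's 64-bit File IDs increment from low values, files on a real poolpart would almost never set those top 8 bits in practice, so the maxint fallback would be rare.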

8 hours ago, Thronic said:

Does using CloudDrive on top of DrivePool have any of these issues? Or does that indeed act as a true NTFS volume?

@Christopher (Drashna) does CloudDrive use File ID to track the underlying files on Storage Providers that are NTFS or ReFS volumes in a way that requires a File ID to remain unique to a file across reboots? I'd guess not, and that CloudDrive on top of DrivePool is fine, but...


Unfortunately it has been over a year and nothing has changed.
I feel it is a case of the developer thinking they are right and the rest of the world is wrong.

To any users of DrivePool out there, the warning is clear: be VERY careful how you interact with the data on a DrivePool volume. Any software you use that interacts with FileID on a DrivePool volume can produce unintended consequences, from minor performance loss through to serious data loss.

I just don’t understand why they can’t map the last 64 bits of the Object ID to the File ID. This seems like a very simple fix to me. If the 128-bit Object ID is unique for every file, then the last 64 bits are as well. It just means a limit of 18,446,744,073,709,551,615 unique files for the volume, as per NTFS.


Fair enough, not the last 64 bits.

If there were no Object ID, I would agree with the overhead point. However, the DrivePool developer has ALREADY gone to all the trouble and overhead of generating unique 128-bit Object IDs for every file on the pool.

This is why I feel it should be trivial to now also populate the File ID with a unique 64-bit value derived from that Object ID. All the hard work has already been done.
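For what it's worth, here is a minimal sketch of one such derivation (my own illustration, not StableBit's code): since simply taking the last 64 bits was ruled out above, hashing the full 128-bit Object ID down to 64 bits keeps the value stable across reboots, at the cost of being only statistically unique rather than guaranteed unique:

```python
import hashlib

def file_id_from_object_id(object_id: bytes) -> int:
    """Fold a 128-bit NTFS Object ID into a 64-bit File ID.
    Stable across reboots (same input, same output), but collisions,
    while astronomically unlikely, are not impossible."""
    assert len(object_id) == 16  # NTFS Object IDs are 128-bit GUIDs
    digest = hashlib.blake2b(object_id, digest_size=8).digest()
    return int.from_bytes(digest, "little")
```

Whether a statistically unique value is acceptable is debatable, but it would at least persist across reboots, which is the property every application in this thread actually depends on.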

No argument with the beta/alpha testing point. I would happily test it that way as well. At present, though, we have a broken File ID system.

 
