Hi,
we're testing DrivePool to determine whether it might be a suitable solution for keeping a second online copy and backup.
We're dealing with huge numbers of small files (500 KB - 25 MB) in several "main" folders, each with 4096 subfolders. We already have quite a few of those main folders, each holding between 10 and 50 million files and growing by 25-100k files per day. Additionally, we have to import data from other systems, which may add 10-30 million files at once. We're planning for 24- or 36-drive systems with DrivePool and can see single systems breaking the 1-billion-file barrier.
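In case it helps to reproduce the layout, here is a minimal sketch of the kind of generator we use to seed test data (drive letter, folder names, and counts are illustrative and scaled way down):

```python
# Sketch of a test-data generator matching our layout: several "main"
# folders, 4096 subfolders each, files of 500 KB - 25 MB random data.
# All paths and counts below are placeholders for a quick test run.
import os
import random

ROOT = "P:\\testdata"      # pool drive letter is an assumption
MAIN_FOLDERS = 2           # real setups have many more
SUBFOLDERS = 4096          # matches our production layout
FILES_PER_SUBFOLDER = 10   # scale this up for real tests
MIN_SIZE, MAX_SIZE = 500 * 1024, 25 * 1024 * 1024  # 500 KB - 25 MB

for m in range(MAIN_FOLDERS):
    for s in range(SUBFOLDERS):
        sub = os.path.join(ROOT, f"main{m:02d}", f"{s:04x}")
        os.makedirs(sub, exist_ok=True)
        for f in range(FILES_PER_SUBFOLDER):
            size = random.randint(MIN_SIZE, MAX_SIZE)
            with open(os.path.join(sub, f"file{f:06d}.bin"), "wb") as fh:
                fh.write(os.urandom(size))
```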
I've set up a test machine (i7, 32 GB RAM, and 6x 4 TB 7.2k drives used exclusively for storage) and gave it a taste of around 50-60 million files with 2x duplication. It did not do too badly until I restarted the system. It is still working, but the dashboard has been calculating for a couple of hours now and claims xxx GB are not duplicated (they are; the number keeps increasing, so I assume it has to check _every_ file on disk?). I'll give it a couple more hours to get organized.
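As a sanity check outside the dashboard, I've been counting file copies directly in the hidden PoolPart.* folders that DrivePool keeps on each member disk. A rough sketch (drive letters are assumptions for our test box, and this walk takes a long time at these file counts):

```python
# Count how many copies of each pooled file exist across the
# PoolPart.* folders on the member disks; with 2x duplication every
# relative path should appear at least twice.
import os
from collections import Counter

MEMBER_DISKS = ["D:\\", "E:\\", "F:\\", "G:\\", "H:\\", "I:\\"]
copies = Counter()

for disk in MEMBER_DISKS:
    for entry in os.listdir(disk):
        if not entry.startswith("PoolPart."):
            continue
        part = os.path.join(disk, entry)
        for dirpath, _dirs, files in os.walk(part):
            rel = os.path.relpath(dirpath, part)
            for name in files:
                copies[os.path.join(rel, name)] += 1

under = sum(1 for n in copies.values() if n < 2)
print(f"{len(copies)} files, {under} with fewer than 2 copies")
```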
Is DrivePool suitable for such large numbers of files? I assume there is some kind of database keeping records of where each file is placed. Is there a maximum database size or an overall file limit per pool? Do file counts of this magnitude have a known impact on performance? Does this recalculation occur after every restart?
Does anyone have experience with similar file counts?
Thanks!