
StableBit DrivePool - Controlling Folder Placement


Alex


Long story short: folder placement should come "off" by default, and it should be completely up to you whether you want the extra administrative responsibility of using it to get that "extra mile" out of your pools.

 

Long story long: folder placement is primarily a tool for (1) pools with a large number of drives and/or (2) pools that are not using full duplication.

 

1. As the number of drives grows, the impact of physically scattered files grows from negligible to significant. This shows up in power consumption (inefficiently waking multiple drives from standby), latency (waiting while those drives come out of standby) and bandwidth (if multiple users are accessing different areas of the pool, physical scattering increases the odds of any two users competing for the same physical disks).

 

2. In dealing with storage, there is a mantra ignored at peril: "RAID Is Not Backup". Since drives are not free, sometimes we have to choose between having enough drives for a fully duplicated pool and enough drives for the backup(s) - and for vital data you should always choose the latter. However, some backup software may lack certain features, such that performing a partial restore of physically scattered files is awkward (or to be more blunt, a right pain in the butt).

 

3. Also, properly implemented, it provides an efficient alternative to the current awkward workaround of splitting files into multiple pools (which has its own drawbacks).


There is also a missing point. The main reason behind the balancers in the first place is so we can be OCD about the file placement. :) 

 

But yeah, it's not for everyone. And there are some that definitely want this feature. Much like the already optional balancers. And yes, they would be able to be "toggled" and the priority rearranged as well.


 

 

And yeah, OCD is a big factor in why I love DrivePool. The very "hands off" approach is great for those that are OCD. :)

 
And I have enough OCD to share with everyone else.  :)
My motherboard has 8 built-in SATA ports (6 SATA III and 2 SATA II). I also have a 2-port SATA II card and a 4-port SATA I card. I'm using 13 of the 14 available ports.
 
For a long, long time I hoped to replace my 2TB drives with 3TB drives, but it never quite happened until now. Fortunately, it has taken me so long to upgrade that, price-wise, I have been able to skip 3TB and go straight to 4TB. Two of them are installed and another three should be on their way to me today or tomorrow.
So I have around 15TB of data on my system and it's all precious to me. I don't have the storage overhead to duplicate everything, and probably 12TB of my data is MKV movies. The vast majority of that is not that important. If a hard drive dies I will have no problem re-downloading Terminator 2, Independence Day, Fight Club etc...
But what about the much rarer movies, such as an early-80s Australian movie in 720p that someone ripped off the TV? I'm pretty sure I will never find that again.
So for me the choice is either to migrate all my data onto a drive that must be duplicated (that should keep me busy for a good few days), or to tell DrivePool to duplicate that movie. It's a no-brainer and, for me at least, THE one thing that I badly need from DrivePool. Then, with some organisation afterwards, I could probably upgrade my 2TB server backup drive to a 4TB one and even include the above-mentioned difficult-to-get stuff in Server Backup as well as having it duplicated. Media and OCD heaven for me. :)
When the other 4TB drives arrive, the two cheapo SATA cards will get ripped from the system and replaced with a 2- or 4-port SATA III card when needed. This will give me scope for another 8-16TB of space that I will probably struggle to fill until I'm 6 feet under.
 
Please, please, please Alex... Bring it on soon!!! :D
 
 
Edit... I'm really getting myself confused now... I do know that I can duplicate any folder already, but the ability to control placement would be optimal for me. I just really want to know that all my duplicated music & movies reside on drives 1 & 4 etc...

Nevertheless, I would think that any OCD person would like to get a clear signal if and when, e.g.,

- Drives 1 & 4 are full (or the smaller of the two is) and files are spilling over onto other drives.

- Drive 1 or 4 is failing and/or is removed which would violate the folder placement directive etc.

 

But yeah, I'm getting the point of folder placement now.



 

 

Again, this depends on organisation and on an individual's system and how they use it. I personally would never let the drives get to the point where they would overflow. In my case Server 2012 R2 (and, from past experience, WHS2011) gives plenty of notice of such issues, and thoughtful initial setup of the existing balancers will simply take care of the rest. The server backup on the operating systems mentioned is a PITA to set up initially. You cannot just select which folders to back up. You have to browse to each individual serverpoolpart folder on each drive and select the relevant folders. With a 2 or 3 drive system that's no big deal, but with 10-15 drives that's pretty time-consuming. If music is scattered over a dozen drives and you miss selecting it on one of the drives, the backup will fail or at least be incomplete. Then it's back to all those drives to see what was missed. Multiply that by the 10 or 15 different server folders and it's a nightmare. :o If my music is on two drives, then that's two drives I need to check to confirm it's being backed up properly.

 

Having replaced the numerous 2TB drives with the 4TB ones I personally will have plenty of scope to add another 4TB drive as and when needed. The folder placement will simply be another option for those who feel they need it. 

 

Don't get me wrong. I am currently using two pools just to make sure all my FLAC files are duplicated, but to be honest, for me at least, it just negates the benefit of having a pool in the first place.

 

Anyway, I'm glad you are starting to realise why limiting folder placement will be a massive benefit to some (if not all) of us. In years to come, as your system grows, you may also come to realise just how much of a benefit. Though hopefully you will never get as OCD as Drashna and myself. :lol:

 

Finally, with regard to the drive failing, I actually see limiting placement as a good thing. My personal plan is to have all my "must not lose because I probably cannot replace it" data duplicated over two drives. If one of those drives fails, my full data is still on the other drive. That is a much better scenario than a random drive failing and losing a couple of tracks off most, if not all, of my albums.


Hahaha, I will _never_ (oops, shouldn't use that word) use folder placement. I understand the backup issue though. In fact, that is _why_ I only have two 2TB drives in my pool: I get the stability of RAID 1 and can still back up everything using the WHS backup software by backing up one of the two drives. A 3-drive pool would no longer allow me to do that, unless I were to use folder placement. I'll simply not let it grow any larger, LOL.

 

I believe WS2012E already allows larger server backups, but this is all a bridge I'll cross when I come to it.


@daveyboy37: Very, very well put!

 

@Umfriend: :)

And to each their own. But that is why we have a list of "extra" balancers, the framework for balancing, API for developing your own (if you're inclined), etc. If you're noticing a theme, "choice" is a big consideration for DrivePool, and we hope that it is appreciated. :)

 

And yes, Server 2012 allows for large backups. Specifically, the backup engine (well, the "Windows Backup" feature, not ntbackup) has always used VHDs for storing the backups. If you look it up, VHDs were limited to 2TB in size until Server 2012, which introduced the VHDX format, which allows up to 64TiB per VHDX file. :)


I think I would find a "Try to keep files in directory X on disk Y" or "Try to keep directories together (In general or below file or directory size Z)" option extremely useful.

 

Hypothetical Situation:  I have a bunch of ripped CDs stored on my home server.  It's a directory (named Music) of many smaller directories (named Album Title) made up of smaller files (Each Song).  DrivePool spreads these songs evenly between my two drives.   If one drive dies, I could quite possibly lose half my songs from every album, meaning I would need to re-rip all of my CDs to restore my collection.  

 

Ideally I would have easily accessible backups, or have DrivePool set to keep multiple copies of everything, but that does not always happen.  I don't mind taking a calculated risk for some low priority files, but I would still prefer to minimize the amount of work it takes to recover from a failure.

 

In my view, for sets of files that depend on each other, my exposure to damage to the set goes up the more dispersed the set is.  Meaning that for non-duplicated sets of files, DrivePool could be riskier than non-pool storage.  In the above example, a single drive failure could damage all of my albums.  If my music storage was clumped by album, worst case a single drive failure would cause the complete loss of half my albums.

 

Both failures involve data loss, but depending on the situation, one can be much more painful than the other to recover from.  Even with backups.
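To put rough numbers on that comparison, here is a minimal Python sketch. It is a toy model only, not how DrivePool actually places files: `scattered` assumes the tracks of each album simply alternate between drives, `clumped` assumes whole albums are kept together, and all of the function names are made up for illustration.

```python
def scattered(num_albums, tracks_per_album, num_drives):
    """Toy model: track N of every album lands on drive N % num_drives."""
    return {
        album: {track % num_drives for track in range(tracks_per_album)}
        for album in range(num_albums)
    }


def clumped(num_albums, num_drives):
    """Toy model: every track of an album lives on a single drive."""
    return {album: {album % num_drives} for album in range(num_albums)}


def albums_damaged(placement, failed_drive):
    """Count albums that lose at least one track when failed_drive dies."""
    return sum(1 for drives in placement.values() if failed_drive in drives)


if __name__ == "__main__":
    # 100 albums of 12 tracks each on a 2-drive pool, drive 0 fails.
    print("scattered:", albums_damaged(scattered(100, 12, 2), failed_drive=0))
    print("clumped:  ", albums_damaged(clumped(100, 2), failed_drive=0))
```

Under these assumptions the same amount of data is lost in both cases, but the scattered layout leaves every album incomplete while the clumped layout leaves half of the albums fully intact.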

 

Related question: 

 

I'm relatively new to using DrivePool.  Does DrivePool have a way of keeping track of what files are on what drive, so in the event of a failure, I could figure out which sets of files (or albums) were affected?  Or do I need to be keeping track of that on my own?  For the most part it's easy to tell if a data set is whole, but some things are harder than others.  Using the music example, it's easy to tell if a track in the middle of an album is missing, but less easy to tell if the last track disappears.


Indeed, the scenario you describe is one of the reasons why the mantra "RAID is not backup" exists.

 

Q. Does DrivePool have a way of keeping track of what files are on what drive, so in the event of a failure, I could figure out which sets of files (or albums) were affected?  Or do I need to be keeping track of that on my own?

 

A. No - it does not maintain such a list. Yes - you would need your own method of keeping track.


Thanks for the clarification.  I realize RAID is not backup, but in the event of a catastrophic RAID failure, I at least know everything in the array is gone and that I would need to restore everything from backup.

 

It seems like with DrivePool during a drive failure I could lose some data and not be sure what data was lost or what folders were affected (unless I keep track of it externally, or compare all of my data against a backup).  The problem would be exacerbated by how DrivePool keeps sets of files spread across multiple drives.

 

The Ordered File Placement plugin is a good start at fixing that, but a bit more control would be greatly appreciated!

 

I'll have to look into what indexing programs are available to keep track of my files.
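In the meantime, a home-grown index is not hard to sketch. The Python below assumes that each pooled drive exposes its share of the pool as an ordinary folder you can browse to (as described earlier in the thread for server backup). The drive labels, the paths in the commented-out usage, and the function names are all hypothetical, so adjust them to your own system.

```python
import json
import os
from pathlib import Path


def build_manifest(pool_parts):
    """Map each drive label to a sorted list of the files under its folder.

    pool_parts is a dict of {label: folder}; paths in the result are
    relative to each folder and use forward slashes.
    """
    manifest = {}
    for label, root in pool_parts.items():
        root = Path(root)
        files = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                files.append(Path(dirpath, name).relative_to(root).as_posix())
        manifest[label] = sorted(files)
    return manifest


def folders_on_drive(manifest, label):
    """List the folders that have at least one file on the given drive."""
    return sorted({Path(f).parent.as_posix() for f in manifest.get(label, [])})


def save_manifest(manifest, out_file):
    """Write the manifest as JSON; keep it somewhere outside the pool."""
    with open(out_file, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2)


# Hypothetical usage (the paths below are placeholders, not real locations):
# manifest = build_manifest({"disk1": r"D:\pool-part", "disk2": r"E:\pool-part"})
# save_manifest(manifest, r"C:\manifests\pool.json")
# print(folders_on_drive(manifest, "disk1"))
```

Run on a schedule, something like this would let you look up which albums or folders had files on a failed drive, as long as the saved manifest is not stored only on the pool itself.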


Hi all, made an account expressly to respond to this thread.

I totally would love the ability to control folder placement.  I've been looking for a feature like this forever!

 

Scenario:

Force Action movie folder to physical Disk 1

Force Scifi movie folder to physical Disk 2

Physical disk 1 and 2 appear as a single disk within explorer

 

- If physical disk 1 fails, I only lose my Action movies.

- If physical disk 2 fails, I only lose my Scifi movies.

- If disk 1 gets full and I try to write to Action movies, I get a "disk full" message.

- If disk 2 gets full and I try to write to Scifi movies, I get a "disk full" message.
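The behaviour in that scenario can be modelled in a few lines of Python. This is purely a toy sketch of the requested semantics, not DrivePool's implementation: `ToyPool`, `Disk`, `DiskFullError`, the rule format and the fallback for folders without a rule (pick the disk with the most free space) are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


class DiskFullError(Exception):
    """Raised when the disk a folder is pinned to cannot hold the file."""


@dataclass
class Disk:
    name: str
    capacity: int
    used: int = 0
    files: list = field(default_factory=list)

    @property
    def free(self):
        return self.capacity - self.used


class ToyPool:
    def __init__(self, disks, rules):
        self.disks = {disk.name: disk for disk in disks}
        self.rules = rules  # top-level folder -> disk name

    def write(self, path, size):
        """Place a file and return the name of the disk it landed on."""
        folder = path.split("/", 1)[0]
        if folder in self.rules:
            # Pinned folder: never spill over onto another disk.
            target = self.disks[self.rules[folder]]
        else:
            # No rule: assume the disk with the most free space is used.
            target = max(self.disks.values(), key=lambda disk: disk.free)
        if target.free < size:
            raise DiskFullError(f"{target.name} is full, cannot write {path}")
        target.used += size
        target.files.append(path)
        return target.name

    def lost_if_fails(self, disk_name):
        """Files that would be lost if the given disk died (no duplication)."""
        return sorted(self.disks[disk_name].files)


if __name__ == "__main__":
    pool = ToyPool(
        [Disk("Disk 1", capacity=10), Disk("Disk 2", capacity=10)],
        rules={"Action": "Disk 1", "Scifi": "Disk 2"},
    )
    print(pool.write("Action/a.mkv", 6))
    print(pool.write("Scifi/b.mkv", 4))
    print(pool.lost_if_fails("Disk 1"))
```

In this model a write to a pinned folder fails with a "disk full" error even when another disk still has room, which is exactly the trade-off the scenario asks for.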


That almost sounds like an (RDBMS) view on two tables through a UNION: separate objects merged only in presentation.

 

Anyway, I wonder whether the folder placement parameters would be stored on each disk. The reason I am asking is that I wonder what would happen if such a "partitioned"(?) pool were to be migrated to another machine. Would/could DP then start balancing and mess things up (in the case where the disks are not equally full)?


  • 1 month later...

I am a new user to DrivePool (great product by the way!) and I just wanted to add that I would also be very interested in this feature. Is there any (rough) timeline for when this feature might be delivered?

 

I was actually hoping to implement something similar to this myself using the plugin system, but I have since realised that the plugins are only for balancing and can't control the real-time file placement (please correct me if I'm wrong about that).



 

 

 

Hopefully it may be in the next update (and soon... as we seem to have been at 2.1.0.432 for a very, very long time) ;).

I should stress that the above comment is purely wishful thinking rather than any sort of insider knowledge. 


There are a lot of internal builds more recent than build 432 (we are up to 483), but they haven't been pushed out as they are mostly bug fixes that haven't been thoroughly tested yet. (well, aside from by me, because I live on the bleeding edge!).

 

Alex is working on getting these updated versions pushed out because it has been a long time. 

 

 

As for this balancer/feature, Alex is hard at work trying to get this finished. But bugs have a priority. And we've hit some serious road bumps (bugs) which have slowed development down more than we would have liked.


Another vote, this would be an amazing feature.  Wish I'd seen this thread sooner, I've been waiting and waiting for a new beta - checking every few days for months now in hopes of something and finally decided to check the forum and see if development was still going on.

 

Rule-based file placement would be a killer feature, taking the "ordered file placement" abilities of the current plugin to the next step in granularity. I just hate it when related files get scattered all over the place.  Ordered file placement has cut down on that problem significantly since it fills up one disk at a time, but it's obviously not perfect and I end up doing way more manual file relocating than I'd like to.


Well guys this is now fully implemented, and very much untested :)

 

Here's the change log for this feature: https://stablebit.com/Admin/IssueAnalysis/2165

 

You can download the latest internal (untested) BETA here: http://dl.covecube.com/DrivePoolWindows/beta/download

 

Latest BETA is 2.1.0.491 as of this writing.

 

I'll have much more to say on how it all works in a future blog post, once I test it a bit more thoroughly and release a public BETA up on stablebit.com. But you can give it a whirl today if you're feeling adventurous.



 

The main reason why we're stuck at 432 is that I was trying to fix all of the important bugs that were reported as of the last release. We have a brand new system for keeping track of reported issues, because frankly I got overwhelmed with the feedback from the last release. Since then I've been going through the bugs and trying to meticulously identify each issue and address it. Not that there are show-stopping bugs in 2.0.0.420, but I want the 2.1.X version to be an improvement in stability as well as add something significantly new.

 

There have been a whole lot of fixes since 432 (http://dl.covecube.com/DrivePoolWindows/beta/download/changes.txt)

 

The next public BETA is my priority now that per-folder (and pattern based) balancing is implemented. Hopefully it won't take more than a few weeks to get everything tested and published.

 

I'm also actively working on "Product 3" which will integrate very nicely with StableBit DrivePool and will add significant value to it. So there are some amazing things in the works for the future.

 

Thank you for your continued support.



 

Thank you to both yourself and Drashna for the explanation. I guess an end user with a very stable system doesn't see all the bugs that need fixing in the background. :P

 

Will test out the new release as soon as I get home from work.

 

Oh and "product 3" !!!   Awesome news! :)

