
Questions: duplication, SSDs, free space, small files and other


cryodream

Question

A couple of days ago, I tried DrivePool and Scanner for the second time. About a year ago, Scanner was not able to read SMART from my HBA and enclosures, so that was a bust, and I had trouble with a bug in a release version of DrivePool with saving settings while having multiple pools. So I decided not to...
I have never used any RAID or parity solution before to protect my data. I tried unRAID from time to time years ago, did not like it. Recently I tried tRAID for the 3rd time and did not like it either, again. Then I looked into SnapRAID + DrivePool and decided, nah.
I was always categorically against backup solutions because of cost, but finally, I decided - enough. DrivePool and duplication, let's try this out.
 
After a couple of days, I have some questions:
 

1. How safe is it to have 2x duplication, with Scanner running for safety? I mean, over the past 10 years I've been reading again and again about all kinds of ways of data protection, from hardware RAID to parity, to ZFS, etc... 99% of the time you always come to the same warning - RAID/parity/etc. is not safety - backup is. Hence the question. Is Scanner + DrivePool's 2x duplication a way to feel sure that I won't lose data the next time I lose a drive?
 
2. Is DrivePool's 2x duplication more secure (because of Scanner) or less secure (bugs, whatever) than running 2 pools and syncing between them?
 
3. I was copying TBs of data to the pool from unpooled drives and noticed problems with small files. While copying big folders with lots of tiny files inside, like images, mp3s or even smaller, the speed was climbing up to an impossible 300MB/s-400MB/s and then suddenly stopping and hanging, sometimes even for a couple of minutes. Then copying resumes, the speed climbs fast to impossible levels (I was copying from a single spinner, so the speed should max out at ~100MB/s) and then it hangs again. Here are the possible reasons that come to mind:
a) I am running StableBit.DrivePool_2.2.0.639_x64_BETA. I looked through the forums extensively to find which beta is considered "stable"; iirc this one was said to be. But still, it's a beta...
b) I was copying from a drive with SMART sector warnings.
c) I am using Directory Opus for all my file management, not Windows Explorer. Dopus is the best thing since sliced bread, imho. I hope there are no problems using Dopus and DrivePool together. If it's between Dopus and DrivePool - Dopus, hands down, sorry. It's too powerful and time-saving for me to go back to using Explorer.
d) Maybe it's some settings I should disable/enable to avoid this behavior?

e) Edit #1. Very important! A similar thing happens with big files as well. I was copying large files from the pool to outside the pool. Again - this time from the pool to outside. The files were ~1.5GB in size. For the first 5-10 minutes Dopus was copying at about the normal rate, ~100MB+/s, but it was stopping and completely hanging every 5-10 seconds or so. I mean, it showed "not responding" when the copying hung, then it resumed again like nothing happened. And again and again, for ~10 minutes. Then it normalized and did not stop/hang anymore, but the copying speed dropped to about half of normal, ~50MB/s. Basically, the copying was done from a WD Red 3TB to a Hitachi 4TB drive, and the only other activity on either of them was Kodi streaming an episode from the WD Red (source) drive, which is insignificant, imho... Both drives are quite new and A-OK.

Is this normal DrivePool behavior? Coz it does not look good and is annoying. Hanging the whole of Dopus is bad enough by itself. But getting only half the speed of the drives while reading from the pool? Not good at all. I hope it's something on my system that I can fix; please let me know.

Edit #2. 15 minutes later. Copying from the same place to the same place (from the pool, which happens to be the same WD 3TB drive, to the same Hitachi 4TB) and everything looks ok. Full speed, ~100MB+/s, and no stops or hangs.

Edit #3. Nah, after a couple of minutes, the speed drops to about half and stays there, again.

WTF is going on? Should I stop using the beta version of DrivePool, or is it Dopus (it shouldn't be), or is it something else?
 
4. I was considering adding an SSD to the pool and using file placement rules to put all the metadata for my movies and TV shows on it. I mean using filters like "*.jpg", "*.nfo", etc. I was thinking, the media files are big and pretty much never change, but the meta is small and gets updated regularly, so that would possibly help with fragmentation. Having the meta files separated onto an SSD, would it be better? Or possibly worse, as it would most definitely create an enormous amount of additional "merging work" for the pooling driver?
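To illustrate what I mean by "meta", here's a rough sketch of the extension matching I'm after (Python just for illustration - the patterns and helper are my own, not DrivePool's actual rule syntax or matching engine):

```python
import fnmatch

# Illustrative metadata patterns - my picks, not DrivePool's rule syntax
meta_patterns = ["*.jpg", "*.nfo", "*.srt"]

def is_metadata(filename):
    """True if the file looks like small, frequently-updated metadata."""
    return any(fnmatch.fnmatch(filename.lower(), p) for p in meta_patterns)

print(is_metadata("poster.jpg"))  # True  -> would go to the SSD
print(is_metadata("movie.mkv"))   # False -> stays on the spinners
```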
 
5. Let's say I put all the meta onto an SSD with file placement filters. If I enable duplication, that probably means I need 2 SSDs to be able to duplicate the meta and keep the SSD benefits while using the meta files in the pool?
 
6. My order of new 6TB drives arrived today. My first 6TBs ever. I was wondering, how much free space should I leave on these so as not to get NTFS in trouble? I have 4TB, 3TB & 2TB drives in the pool atm. The setting I use now is set to leave 200GB of free space. Would 200GB be enough for a 6TB drive? Or, basically, the age-old question: how much free space to leave on the drives? Should I leave the same amount in GBs on all sizes of drives, or should I set the option to leave, e.g., 5%? I would not like to leave 10%-15%, like some people say, because I'd hate to lose that much space. Especially considering how much space is already lost to duplication. Jeez.
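Just to put numbers on the fixed-GB vs percentage question, here's my back-of-the-envelope math in Python (using 1 TB = 1000 GB for simplicity; this is my own arithmetic, not anything DrivePool computes):

```python
# Back-of-the-envelope: fixed 200GB reserve vs a 5% reserve, per drive size.
# Figures in GB, assuming 1 TB = 1000 GB for simplicity.
drive_sizes_gb = [2000, 3000, 4000, 6000]
fixed_reserve_gb = 200   # my current "leave 200GB free" setting
percent_reserve = 0.05   # the 5% alternative

for size in drive_sizes_gb:
    pct_gb = size * percent_reserve
    print(f"{size // 1000}TB drive: fixed = {fixed_reserve_gb}GB, 5% = {pct_gb:.0f}GB")
```

So a flat 200GB is only ~3.3% of a 6TB drive, while 5% would cost me 300GB on each of the new drives - hence the question.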
 
7. There are 2 balancers that have an option to set the free space to be left on the drives:
Drive Usage Limiter, which only has a percent setting. I have this balancer disabled atm.
Prevent Drive Overfill, which I have set to: try not to fill above 90% or 170GB, and empty to 85% or 200GB. From the info describing how it works, I gather my settings ignore the percentages and use the 170GB and 200GB values. Am I correct?
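Here's how I read those thresholds - a sketch of my assumption, not DrivePool's actual code: I'm guessing the effective limit is whichever setting lets the drive fill further, which would explain why the GB values win on my drive sizes:

```python
def effective_free_threshold(capacity_gb, percent_fill, free_gb):
    """Free space (GB) below which the setting kicks in, under my assumption
    that the less restrictive of the two thresholds applies."""
    free_from_percent = capacity_gb * (1 - percent_fill)
    return min(free_from_percent, free_gb)

capacity = 3000  # a 3TB drive, in GB (1 TB = 1000 GB)
trigger = effective_free_threshold(capacity, 0.90, 170)  # "fill above 90% or 170GB"
target  = effective_free_threshold(capacity, 0.85, 200)  # "empty to 85% or 200GB"
print(trigger, target)  # 170 200 -> the GB settings apply on a 3TB drive
```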
 
8. If I want to enable the Drive Usage Limiter, but keep a size-based free space setting, I should move Prevent Drive Overfill to be above the Drive Usage Limiter, right?
 
9. Long story short: I have 4 "groups" of drives:
oldest: 10+ drives, a mix of old 2TB models
older: 8x Seagate 3TB drives
newer: 16x WD Red 3TB drives
brand new: 6x WD Red 6TB drives
What I would like to achieve is to "control" how the duplication works. I would like to set "drive groups":
a) a group that holds non-duplicated files. That's the newest drives (my thinking: the newer the drives, the less chance they fail). That should be easy enough, I guess: just use the Drive Usage Limiter and uncheck Duplicated for those drives?
b) now this second scenario I gather is not possible atm, please confirm. I would like to set 2 groups for duplication: one group of newer drives for the "primary copy" and a second group for "duplicates". This way I could be sure that all files have one copy on newer drives. As it stands atm, I guess DrivePool would duplicate files onto random drives, hence I would have files with both copies on newer drives and files with both copies on the oldies...?
c) I would very much like to ask for this feature of setting drive groups to be added to DrivePool in the future, most probably in the form of a balancer. From what I gather of how DrivePool works, that should not be difficult, methinks. Additionally, you could enable setting groups as "primary" and "copies". This would have an additional positive effect: you could disable real-time duplication and would always use the "primary" group of newer, faster drives in real-time everyday file management, while files duplicate to the slower drives in the background...
 
10. I would like to ask for a feature in DrivePool's UI to display an additional "free space" statistic. What I mean is, maybe it's just me, still not used to pooled drives, but it's kinda confusing how much free space I really have in the pool. My pool shows 17.6 TB of free space, but considering that I have 32 drives in the pool atm, all keeping 200GB of free space, I always need to remember and do the math to realize that actually RealFreeSpace = FreeSpace - NumberOfDrives*200GB. The reserved part alone is 6.4 TB in my case atm. But as I'm gonna constantly add drives and sometimes take some out, this number will constantly change and I'll constantly need to do the math.
I propose an additional pie chart section, something like "Reserved Free Space", and then DrivePool could show Free Space as it does now (full) and, additionally, the calculated real free space left: FreeSpace - Files - ReservedFreeSpace = HowMuchSpaceIhaveLeftOnThePoolForRealz :)
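For the record, here's the math I keep redoing in my head, as plain Python (numbers from my pool above, 1 TB = 1000 GB):

```python
# The "real free space" math I keep redoing by hand (figures in GB).
pool_free_gb   = 17600  # what the pool UI reports as free: 17.6 TB
num_drives     = 32
reserve_per_gb = 200    # free space the balancer keeps on each drive

reserved_gb  = num_drives * reserve_per_gb   # 6400 GB = 6.4 TB tied up
real_free_gb = pool_free_gb - reserved_gb    # 11200 GB = 11.2 TB actually usable
print(f"reserved: {reserved_gb / 1000} TB, really usable: {real_free_gb / 1000} TB")
```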
 
11. I've read someplace on the forums that exporting and importing of DrivePool settings is coming. Very nice to hear, as this feature imho should be in every piece of software, period. Also, why not save the pool settings on the pool drives themselves? Duplicate them for safety. Then, after a Windows reinstall, or after moving to a new machine, just pop in the drives, and the pool spins up with all the settings...
I was really surprised when I first realized that it wasn't this way already. I mean that as a compliment, actually. DrivePool and Scanner just have this quality feel and a whiff of some very good thinking and work put into them. Hence, I'm seriously baffled by the annoying inability to quickly carry the settings over.
 
 
Oh, and CloudDrive? Seriously? One programmer doing this? All I can say is - f**k you, Bitcasa. Anyone who knows what Bitcasa was? is? who cares... knows what I mean. F**k you, Bitcasa, eat your heart out...

Edited by cryodream

3 answers to this question


You sure do have a lot of questions (that's not a bad thing at all, and we'd rather you ask!).  As in the other thread (for Scanner), I'll try to answer everything, but if I miss anything, do let me know!

 

  1. As these other products have mentioned, duplication is a form of redundancy. Redundancy tries to protect you from data loss due to a failed drive.  However, it doesn't protect you against accidents. If you delete a file, it's gone from both drives. 
    Any important data should be backed up to a different location (in fact, a 3-2-1 backup strategy for important files is a very good idea).
    But having duplication enabled and StableBit Scanner installed should help protect you from data loss.  As in your other post, I mentioned that I lost ~12 drives. StableBit Scanner saved most of the data, as it started evacuating the drives. A couple failed right after I pulled them from the pool (as in, they would no longer power on, or caused the system to become unstable when attached).  So, yes, it definitely can prevent data loss. 
     
  2. More secure. The "real time duplication" feature, which is enabled by default, writes to "both" destination disks in parallel. That means the files are ... well, duplicated in real time. There is no delay in protection of the new or modified files. 

    But running a second pool (such as on a USB or SAS enclosure) and syncing to it as an offsite backup is a good idea. 
     
  3. If the files are being written in parallel, they may end up being written to different physical disks, and that could account for the very high speeds. And I think that Directory Opus may be doing that.
    Otherwise, it may be a "glitch" with how the file transfer speeds are measured. 

    It should also report the speed of the slowest drive in the copy/move. But if you're concerned, enable file system logging, copy some files over, and then upload the log files to us:
    http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection
     
  4. There should be no issues with doing that, at all. That's part of what the file placement rules feature was designed for, in fact!  However, you'd want to use "\*.jpg" and the like (with the preceding backslash). 
    Just remember that since you have a large collection (you mentioned 50+ TBs, IIRC), you may want to get a large SSD, as these files add up real quick. 

    Additionally, there is also the SSD Optimizer Balancer. This uses the SSD as a write cache for new files and then moves them off.  May be worth looking into (but you'd want 2x or more SSDs for this ... one drive per copy, so 2x if you have that level of duplication enabled).
     
  5. Yes and no. You can use a different drive, and the Read Striping feature should pull from the significantly faster drive (or the less busy one).  But if you want to use the SSD Optimizer Balancer Plugin, then yes, you would want to do that. 
     
  6. As long as it's not the system drive and doesn't have the page file on it, it's not that important how much free space you have on it. However, I'd recommend 10% or 100GB of free space (which is actually the default setting for the "Prevent Drive Overfill" balancer in StableBit DrivePool).  You can adjust this in the balancer if you want, though. 
     
  7. The Drive Usage Limiter limits what sort of data to allow on the drive: duplicated data and/or unduplicated data. 
    The slider limits how much data to put on the drives before using other drives, for this balancer.  Otherwise, if there isn't a suitable drive, it will allow placement - but only in regards to this setting.

    As for the Prevent Drive Overfill, this one is a bit tricky. The first part is the "limit". If it's exceeded, it will trigger a "purge" from the drive. It will purge data from it until it hits the second setting. 
    So if it gets under 10% free or less than 100GBs free, it will try to balance data off of the drive(s) until it's 15% free or 200GBs free.
    And that sounds like it's exactly what you want. 

     
  8. Well, the order that they're in is the priority. So if you want to use the Drive Usage Limiter, I'd put that up top (under the StableBit Scanner balancer).

    Otherwise, the default order should be fine.

     
  9. Well, first, newer drives aren't necessarily less likely to fail.  Look up "WD RED infant mortality rates". Specifically, (at least the first generation of) these drives have a fairly high failure rate for new drives. As in, failing within the first six months of usage.  So, I wouldn't recommend trusting new drives just because they're new. 
    1. If you want to do that, yeah, use the Drive Usage Limiter and uncheck the "Duplicated" option for those drives. That will cause it to place only unduplicated data on these drives.
    2. Nope, we don't support this yet. It is a feature that we do plan on adding in the near future (but I don't have an ETA, sorry).   You could use the File Placement rules to accomplish this though, at least in part.
      As for "random", it's not. New files are placed on the disk with the most available free space (absolute, not percentage).
    3. Noted.  As I said, it's planned. 
      And the SSD Optimizer Balancer Plugin essentially does that. New files are written to the "SSD" drives and then balanced off to the "Archive" drives. Note, you don't have to use SSDs for this; fast drives, or just "faster" drives, work for this as well.  And since this follows the balancing settings, you can set it to occur once per day, if you want.
       
  10. I'm not entirely sure what you mean here, but I think I do.

    Yes. StableBit DrivePool has two different "free space" measurements reported. One for the entire pool, and one reported for the file system to create files (limited by the free space on the individual disks).  
    The UI can/does show you each disk, but adding info for "free space for new file creation" may be a good idea ... but tricky considering how the drive selection process works.  I'll flag this for Alex (the developer) and let you know.

    As for "tricky": stuff like "real time placement limiters" changes this behavior. For instance, the SSD Optimizer uses these to ensure that data is only placed on the "SSD" drives. And that would be the "usable free space for new files". Also ... using that may be an easy way to remember. :)

     
  11. Well, we do store the duplication settings on the pool (as hidden NTFS metadata, so it's very hard to find, let alone access). And we store info about the reparse points on the pool.
    As for other settings, the only other one that would really be important here (right now) is the File Placement rules. I've mentioned this and flagged it as a feature request for Alex, in addition to the export/import option. 
    However, we are a very small company, so adding features may not be as fast as we'd like.  But it's definitely on the "to do/consider" list. 

 

As for CloudDrive, well thank you for the kind words! (well, not towards BitCasa, at least, but towards us).  



Blah BLAH BLAH BLAH BLAH.....!!!!!!

DUDE!  You write a FREAKING TOME and ask a BILLION questions, which the support guy is patient enough to answer... and you can't even be bothered to drop by and read his replies, much less say THANK YOU?  Jeeze!  Some people!

 

In the future, I'd recommend just ignoring people who write THIS MUCH!  It's pretty obvious they're more interested in hearing THEMSELVES talk than actually hearing your support answers.

