Covecube Inc.

Umfriend

Posts posted by Umfriend

  1. With DP, we don't talk about original and copies. You can have either a single instance of each file, that is x1. If x2, then you have two duplicates. So if you want two copies of each file, x2 is the way to go.

  2. 1 minute ago, Christopher (Drashna) said:

    In theory, that shouldn't be a problem, as the "CoveFs_SynchronizeDirectoryTimes" should be enabled by default, and should keep the directories in sync.  Though, if there are empty folders on one or more disks, it may be possible that it's not syncing the directory time.

    So the thing is, I do not know whether this is relevant to OP. However, the theory is refuted by practice, as can be seen in the thread I linked. It wasn't just me, and not with empty folders either. Has something been done since January 2020 to address this? If so, then I may be wrong now (but wasn't then).

  3. If the software uses/relies on timestamps of folders as well then this might be the problem:

    Basically, with x2 duplication, a folder may have different date modified on the two disks and any software querying the pool will only get one of them.
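
    One way to check whether this applies would be to compare the folder's date modified directly on each underlying disk. A minimal sketch, assuming you pass in the paths of the hidden PoolPart.* folders yourself (the function names and the 2-second tolerance are my own choices, nothing DP provides):

```python
from pathlib import Path


def folder_mtimes(pool_parts, relative_folder):
    """Return {pool_part_path: mtime} for one folder as stored on each disk.

    pool_parts is a list of PoolPart.* folder paths (supplied by the caller);
    disks that do not hold the folder are simply skipped.
    """
    result = {}
    for part in pool_parts:
        candidate = Path(part) / relative_folder
        if candidate.is_dir():
            result[str(part)] = candidate.stat().st_mtime
    return result


def mtimes_differ(mtimes, tolerance_seconds=2.0):
    """True when the copies of the folder disagree by more than the tolerance."""
    if len(mtimes) < 2:
        return False
    values = list(mtimes.values())
    return max(values) - min(values) > tolerance_seconds
```

    If `mtimes_differ` returns True for a folder, software that relies on folder timestamps may see either value when it queries the Pool.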

  4. Another thing I have is that if I copy files from a client to the Server of the network, it matters whether I access the Server through Explorer -> Network or through a mapped network drive. The latter sometimes fails but I am pretty sure it has to do with some sort of permission (SQL Server backup files I can not copy through the mapped network drive) and I get a different message anyway. So, completely OT.

    So basically, I have no clue. I hope someone else here has an idea on how to diagnose and/or fix this. I would have a look at Event Viewer on both the client and the server. Not optimistic but I'd look.

  5. So there is a Pool that consists of only one 4TB drive? If so, then yes, you can shut down, power down, remove the 4TB drive and connect it again when you are done.

    In fact, you could install DP on another machine and connect that 4TB drive and DP should recognize it as a Pool.

  6. Whether it makes sense or not, a higher Pool does not inherit duplication settings from a lower Pool. That might make it harder. I even think a higher Pool is rather restricted in what it can read from a lower Pool, i.e. it may not be able to check the duplication status. But we'll see what Covecube says.

  7. I fear this may be hard to address actually. Whenever the Top Pool (consisting of Pool A and Pool B) starts to rebalance, it would have to take into account:
    (a) The use-gap between Pool A and Pool B
    (b) Then, for any file that is a candidate for move (CFM), check what the Use Of Space (UOS) is, i.e., what the duplication status is (If, for instance, you use folder level duplication, that file may have 1 or a gazillion duplicates),
    (c) Then, for any such CFM, determine what the UOS would be after the move. Again, the relevant duplication may be rather arbitrary.

    The real issue here may be that when checking the UOS, the Top Pool would actually have to read file parameters somehow. Either it would (1) have to read on a look-through basis, so an x2 duplicated file is returned to the balancer process twice or, (2) for each file, interpret the duplication settings as per Pool A/B, including folder level duplication settings. However, I suspect that the balancing process is only able to read the relevant data for itself w.r.t. duplication settings and through the Pools A and B, meaning that querying a file would only return one result (just like you only get one result when you look at Pool A or B through Explorer). I suspect that this is how they end up at "other" currently: The Top Pool queries Pools A and B but receives only one record/data item/hit per duplicated file. It also receives total space. So the difference between total space and space used [by single instances of the file because that is all the Top Pool receives when querying Pool A or B], by definition, is other.
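
    To make that suspicion concrete, here is a toy model of the arithmetic. It is only a sketch of my guess above, not how DP is actually implemented; `PoolFile` and all function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class PoolFile:
    size_gb: int
    duplication: int  # physical copies kept inside the sub-Pool (1 = x1, 2 = x2)


def space_seen_by_top_pool(files):
    """The Top Pool gets one record per file, whatever its duplication."""
    return sum(f.size_gb for f in files)


def physical_space_used(files):
    """What the sub-Pool really occupies on disk, duplicates included."""
    return sum(f.size_gb * f.duplication for f in files)


def reported_other(files):
    """Space the Top Pool cannot attribute to files it knows about."""
    return physical_space_used(files) - space_seen_by_top_pool(files)
```

    With one 100GB file at x2 and one 50GB file at x1, the Top Pool would count 150GB of files against 250GB actually used, so 100GB lands in "other".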

    I am sure it can all be done but it does not seem simple to me and it may have an impact on performance as building the list of files to move will require some additional intelligence.

     

  8. @zeroibis Whatever works for you. To me, it seems like a lot of administrative hassle for a remote, really remote, probability of a loss that, assuming you have backups, can be recovered from one way or the other.

    As I said, my setup never requires a 20TB write operation to recover 10TB of data. It basically resembles your 3rd example although I would have Drives 1 to 4 as Pool 1, Drives 5 to 8 as Pool 2 and then have Pool 3 consist of Pools 1 and 2. But I agree that in your setup, if two drives fail, you need no restore unless the two drives are one whole 2-drive Pool. In my case, I would need to restore some data if the two failing drives are split between Pool 1 and Pool 2. I wonder whether a triple duplication with one duplicate through CloudDrive would be helpful but I don't use CloudDrive and it may be expensive at those sizes.
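
    For what it is worth, the two cases can be counted out. This is a rough sketch that only counts which pairs of failed drives would need a restore; it says nothing about how likely a double failure is, and the layout with four 2-drive Pools is my assumption about the other setup.

```python
from itertools import combinations


def pairs_needing_restore(layout, loses_data):
    """Count two-drive failures that need a restore.

    layout maps a Pool name to its number of drives; loses_data(pool_1, pool_2)
    says whether losing one drive in each of those Pools loses any file.
    Returns (pairs needing a restore, all possible pairs).
    """
    drives = [(pool, i) for pool, count in layout.items() for i in range(count)]
    pairs = list(combinations(drives, 2))
    bad = [pair for pair in pairs if loses_data(pair[0][0], pair[1][0])]
    return len(bad), len(pairs)


# My setup: duplicates are split between Pool 1 and Pool 2, so only a
# failure with one drive in each Pool can lose (some) files.
hierarchical = pairs_needing_restore(
    {"Pool 1": 4, "Pool 2": 4}, lambda a, b: a != b
)

# Assumed alternative: four 2-drive Pools, each duplicating internally, so
# only losing both drives of the same Pool loses files.
two_drive_pools = pairs_needing_restore(
    {"A": 2, "B": 2, "C": 2, "D": 2}, lambda a, b: a == b
)
```

    So more pairs hurt in my layout, but in each of them only the files that happened to sit on both failed drives need restoring.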

    I am wondering however, what do you use for backup?

  9. First of all, DP does not require drive letters. Some here have over 40 drives in a Pool. For individual access, you can map the drives to a folder.

    On the backup/restore thing. Yes. However, my setup does not require backing up duplicated data. I have two x1 Pools and then one x2 Pool consisting of the two x1 Pools (it is called Hierarchical Pooling). Each x1 Pool has three HDDs. I only need to back up one set of three drives. Now with 4 (or 6 in my case) drives, the probability of losing two drives at the same time (or at least loss of the 2nd prior to recovery of the first failure) is really small. I also wonder what the loss of two drives in one of your Pools would do if ever your Pool is larger than 2 drives. I don't know what kind of backup solution you use but I could simply recover the whole shebang to my x2 Pool and tell the recovery not to overwrite existing files in that x2 Pool. DP would have recovered files where one duplicate still exists, and the other files will again be present on both sub-Pools.

  10. I don't know the ins and outs but some of this stands to reason. So you have Pool X (x2, consisting of X1 and X2), Pool Y (x2) and Pool Z (x1) consisting of Pool X and Y. I think the issue is that when DP finds it needs to move 100MB from X to Y, it will select 100MB, measured at x1, of files to move but it must then move both duplicates. DP will not be able to move just one duplicate from X to Y because then the duplication requirement will not be met on either X or Y.

    Why you get a substantial "Other" I am not sure but it may be because from the perspective of Pool Z, the files are in fact unduplicated. On X1 and X2 both you'll have a PoolPart.* folder. Files written there are part of Pool X. Within these folders, you should have another PoolPart.* folder. Files written here are part of Pool Z. My guess is that when you write a file to Pool Z, it will feed through to, say, the inner PoolPart.* folder on X1. Then DP will try to duplicate this file in Pool X (so write to X2) and I think this duplicate may be considered "Other" from the perspective of Pool Z. Not sure where that one ends up physically (inner or outer PoolPart.* folder) on X2 but it can not be part of Pool Z (with respect to measurement) because if it were, it would be a duplicated file. That would be in violation of the duplication setting of Pool Z.

    Generally, I think, it is better to have Pool X and Y both at x1 (so no duplication) and then tell Pool Z to have x2 duplication. One big advantage is that if you ever need to recover, you only need either the drives from Pool X or Pool Y. I am pretty sure that would avoid the balancer overshooting and the measurement being unclear.

     

  11. No worries, it was new to me too (and I still don't know that much about it). So, a cheap SAS card would be something like the IBM 1015. There are many that are called differently but are the same; I have a Dell Perc H310 SAS controller which, I am pretty sure, is the exact same card.

    Anyway, the cheaper SAS cards have two ports indeed. However, those are SAS ports. There are cables that allow you to connect up to four SATA drives to just one SAS port. For example: https://www.amazon.com/Cable-Matters-Internal-SFF-8087-Breakout/dp/B012BPLYJC. If you look at the picture you see one weird connector, that's an SFF-8087, which splits into 4 SATA. Take two of these and you get 8 SATA connectors.

    A few things to be aware of:

    1. You want these SAS cards to have a BIOS for IT or HBA mode, not RAID. It is something you can do yourself if you get a 2nd hand card in RAID mode. I found a guy who flashed it for me before shipping.
    2. Whatever card you get, check whether it is indeed using an SFF-8087 connector. No worries if it does not, it just means you need another type of breakout cable.
    3. In my case, the breakout cables included power delivery. That was a pain because the cables became less flexible than I wanted. The picture above is more to my liking, just data cables. Power to the drives separately.
    4. Which of course means you need a way to power 8, 12 or 14 HDDs... There are splitters for this as well. I think you do want to share the load a bit over the various cables that come out of the PSU.
    5. Finally, these SAS cards are typically meant for servers that have plenty of airflow. The chip can run hot and most likely only has a heatsink. You can attach a, I think, 40mm fan to it using Phillips screws that attach between the ribs of the heatsink.

  12. Hi Newbie! ;D,

    DP needs very little, I had it running on an Intel Celeron G530 and could stream 1080p to at least one device. So a cheap build with, say, a Ryzen 3 3200G, 8GB of RAM, a decent 350W PSU and W10 would work like a charm as a file/stream server. The things you'd probably look for are SATA connectors (cheap boards often have only 4). Although you could get a cheap SAS card (IBM 1015 or somesuch, used.) which would provide plenty of expandability. The other thing is the network connection. 1Gb Ethernet, I think, should be "enough for anybody".

    It is a bit of a bad time as CPUs are in high demand relative to production capacity. There was a time, which will come again, when you could have a satisfactory CPU for like US$60.

    Edit and PS: Your English is fine. Just use capitals to start and periods to end a sentence and it'll be great.

  13. My storage needs don't really grow so I am just sticking with what I have, anything between 1.7 and 5.5 years old (6 drives). I bought them spread out over time. Mostly Toshiba and HGST as I had a few issues with Seagate and WD years ago. Only 2 HDDs are of the same type and purchase date so those are in separate Pools.

    As long as they don't fail, I'll keep running them.

  14. You can have two PoolPart.* folders on one drive at one folder level (e.g. root) but that is a symptom of something gone a bit wrong. Long story. Only one of them would be the real currently active PoolPart.* folder.

    A PoolPart.* folder can in fact contain yet another PoolPart.* folder in the case where you use Hierarchical Pools.

    But mostly, no, normally a drive has one PoolPart.* folder in the root.
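
    If you want to check a drive quickly, something like the sketch below would list what is there. It only looks at folder names matching PoolPart.* and cannot tell which one is the active folder; the function names are mine.

```python
from pathlib import Path


def pool_part_folders(root):
    """Names of the PoolPart.* folders directly under root (e.g. a drive root)."""
    return sorted(p.name for p in Path(root).glob("PoolPart.*") if p.is_dir())


def nested_pool_parts(root):
    """PoolPart.* folders found inside a PoolPart.* folder (Hierarchical Pools)."""
    found = []
    for outer in Path(root).glob("PoolPart.*"):
        if outer.is_dir():
            found.extend(p for p in outer.glob("PoolPart.*") if p.is_dir())
    return sorted(found)
```

    More than one name from `pool_part_folders` on a single drive is the "something gone a bit wrong" case; anything from `nested_pool_parts` just means the drive is part of a Hierarchical Pool.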

    You're most welcome.

  15. Just note that with a 3x10TB + 3x8TB setup, Hierarchical Pools will suffer a deadweight loss. Why? Because you can not create two non-duplicated Pools of the same size. The closest split is 10+10+8 = 28TB and 10+8+8 = 26TB, so when you then create a Hierarchical Pool for duplication, it will see one 28TB and one 26TB drive and 2TB will be unusable.

    You don't happen to have a 2TB HDD lying around somewhere, do you?

    Edit: Or, should you have, say, a 6TB HDD, you could do 3x10TB and 3x8TB+1x6TB.
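
    If you want to try other drive combinations, a small brute-force sketch like this finds the most even split into two sub-Pools. It works on nominal drive sizes only and ignores formatting overhead; the function name is made up.

```python
from itertools import combinations


def best_two_pool_split(drive_sizes_tb):
    """Try every way to split the drives into two Pools.

    Returns (pool_a_tb, pool_b_tb, unusable_tb) for the most even split,
    where unusable_tb is the size difference that x2 duplication across
    the two Pools cannot use.
    """
    total = sum(drive_sizes_tb)
    indices = range(len(drive_sizes_tb))
    best = None
    for k in range(1, len(drive_sizes_tb)):
        for chosen in combinations(indices, k):
            pool_a = sum(drive_sizes_tb[i] for i in chosen)
            pool_b = total - pool_a
            unusable = abs(pool_a - pool_b)
            if best is None or unusable < best[2]:
                best = (pool_a, pool_b, unusable)
    return best


current = best_two_pool_split([10, 10, 10, 8, 8, 8])
with_2tb = best_two_pool_split([10, 10, 10, 8, 8, 8, 2])
with_6tb = best_two_pool_split([10, 10, 10, 8, 8, 8, 6])
```

    For the six drives as they are, this lands on the 28TB/26TB split with 2TB unusable; adding either the 2TB or the 6TB drive brings the loss down to zero.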

  16. I would just consider splitting the 2x2 Pools so that each Pool has one drive on each adapter, and perhaps splitting the 1x5 Pool into two, using Hierarchical Pools for duplication with each underlying Pool's drives on a separate adapter. Why? Should one adapter fail then you can still access all files without having to connect all HDDs to the other adapter. Sure, you will do so anyway but at least you can now read before you do that.

  17. 3 hours ago, gtaus said:

    As far as data protection, there would only be one file copy on the SSD cache until it flushed out to the archive HDDs. I can live with that, but my system is mainly a home media server for Plex/Kodi. Most of my data is not that essential, so I do not even have duplication set on my pool. With DrivePool, you can set duplication on the folder level, not just on the entire pool. I have a couple folders set for 2X duplication and one folder set for 3X duplication. Almost all of my DrivePool is set to store media files at 1X duplication, because I have my original files backed up on HDDs stored in the closet. If I lost a HDD in the pool, I could rebuild the lost files from my backups.

    Unlike RAID systems where a loss of a HDD can mess up everything on the pool because the files are striped out to various drives, DrivePool writes a complete file to a single HDD. I had a HDD start to fail, DrivePool noticed it, and I was able to offload all but 2 corrupt files on a 5TB HDD. With my RAID systems, a failing HDD easily meant all data loss. So far, DrivePool is just more reliable and easier to recover from a HDD failure than other systems I have used.

    So gtaus did a good, elaborate job but there is one thing that is not entirely correct: If you use the SSD Optimizer plugin, where you assign an SSD as a cache / first landing zone, then that will only work for files that are not to be duplicated. If you write a file to a folder that has x2 duplication, DP will write one copy to a HDD right away. If you have little duplicated data then it won't matter much though.

    If you do have duplication and insist that one SSD suffices then there is a way to trick this using Hierarchical Pools. Two copies would be written to the SSD initially. I consider this inadvisable (and probably a bug).

    Overall, I have been using DP since early 2013 (or 2012??). It's a small Pool, more for redundancy/uptime than backup (which is dealt with otherwise; Duplication != backup). Never an issue and the fact that it writes files in plain NTFS format is really great for transferring Pools to other PCs, recovery etc. Just not the RAID 10 performance.
