Covecube Inc.

Alex

Administrators
  • Content Count

    249
  • Joined

  • Last visited

  • Days Won

    44

Reputation Activity

  1. Like
    Alex got a reaction from JesterEE in Surface scan and SSD   
    Hi Saiyan, I'm the developer.
     
    The Scanner never writes to SSDs while performing a surface scan and therefore does not in any way impact the lifespan of the SSD.
     
    However, SSDs do benefit from full disk surface scans, just like spinning hard drives, in that the surface scan will bring the drive's attention to any latent sectors that may become unreadable in the future. The Scanner's disk surface scan will force your SSD to remap the damaged sectors before the data becomes unreadable.
     
    In short, there is no negative side effect to running the Scanner on SSDs, but there is a positive one.
     
    Please let me know if you need more information.
  2. Thanks
    Alex got a reaction from Spider99 in M.2 Drives - No Smart data - NVME and Sata   
    As per your issue, I've obtained a similar WD M.2 drive and did some testing with it. Starting with build 3193 StableBit Scanner should be able to get SMART data from your M.2 WD SATA drive. I've also added SMART interpretation rules to BitFlock for these drives as well.
    You can get the latest development BETAs here: http://dl.covecube.com/ScannerWindows/beta/download/
    As for Windows Server 2012 R2 and NVMe, currently, NVMe support in the StableBit Scanner requires Windows 10 or Windows Server 2016.
  3. Like
    Alex got a reaction from Jaga in M.2 Drives - No Smart data - NVME and Sata   
  4. Thanks
    Alex got a reaction from Penryn in Ridiculous questions on CloudDrive   
    1c. If a cloud drive fails the checksum / HMAC verification, the error is returned to the OS in the kernel (and eventually to the calling app). It's treated exactly the same as a physical drive reporting a checksum error. StableBit DrivePool won't automatically pull the data from a duplicate, but you have a few options here.
    First, you will see error notifications in the StableBit CloudDrive UI, and at this point you can either:
      • Ignore the verification failure and pull in the corrupted data to continue using your drive. In StableBit CloudDrive, see Options -> Troubleshooting -> Chunks -> Ignore chunk verification. This may be a good idea if you don't care about that particular file. You can simply delete it and continue normally.
      • Remove the affected drive from the pool and StableBit DrivePool will automatically reduplicate the data onto another known good drive.
    And thank you for your support, we really appreciate it.
    As far as Issue #27859, I've got a potential fix that looks good. It's currently being qualified for the Microsoft Server 2016 certification. Normally this takes about 12 hours, so a download should be ready by Monday (could even be as soon as tomorrow if all goes well).
  5. Like
    Alex got a reaction from Jaga in Forum Downtime   
    My apologies. We've been having issues with the forum's database for the past couple of days, and as a result the forum has been up and down.
    I believe that the problem is resolved and there should be no further downtime.
  6. Like
    Alex reacted to msq in Yay - B2 provider added :)   
    In the very latest beta, 1.1.0.991, the B2 provider has been added.
    Downloaded, installed, set up and using already. Thank you guys!
  7. Thanks
    Alex got a reaction from Jaga in check-pool-fileparts   
    If you're not familiar with dpcmd.exe, it's the command line interface to StableBit DrivePool's low level file system and was originally designed for troubleshooting the pool. It's a standalone EXE that's included with every installation of StableBit DrivePool 2.X and is available from the command line.
     
    If you have StableBit DrivePool 2.X installed, go ahead and open up the Command Prompt with administrative access (hold Ctrl + Shift from the Start menu), and type in dpcmd to get some usage information.
     
    Previously, I didn't recommend that people mess with this command because it wasn't really meant for public consumption. But the latest internal build of StableBit DrivePool, 2.2.0.659, includes a completely rewritten dpcmd.exe which now has some more useful functions for more advanced users of StableBit DrivePool, and I'd like to talk about some of these here.
     
    Let's start with the new check-pool-fileparts command.
     
    This command can be used to:
      • Check the duplication consistency of every file on the pool and show you any inconsistencies.
      • Report any inconsistencies found to StableBit DrivePool for corrective actions.
      • Generate detailed audit logs, including the exact location of each file part of every file on the pool.
     
    Now let's see how this all works. The new dpcmd.exe includes detailed usage notes and examples for some of the more complicated commands like this one.
     
    To get help on this command type: dpcmd check-pool-fileparts
     
    Here's what you will get:
     
    dpcmd - StableBit DrivePool command line interface
    Version 2.2.0.659

    The command 'check-pool-fileparts' requires at least 1 parameters.

    Usage:
      dpcmd check-pool-fileparts [parameter1 [parameter2 ...]]

    Command:
      check-pool-fileparts - Checks the file parts stored on the pool for consistency.

    Parameters:
      poolPath - A path to a directory or a file on the pool.
      detailLevel - Detail level to output (0 to 4). (optional)
      isRecursive - Is this a recursive listing? (TRUE / false) (optional)

    Detail levels:
      0 - Summary
      1 - Also show directory duplication status
      2 - Also show inconsistent file duplication details, if any (default)
      3 - Also show all file duplication details
      4 - Also show all file part details

    Examples:
      - Perform a duplication check over the entire pool, show any inconsistencies, and inform StableBit DrivePool
        >dpcmd check-pool-fileparts P:\
      - Perform a full duplication check and output all file details to a log file
        >dpcmd check-pool-fileparts P:\ 3 > Check-Pool-FileParts.log
      - Perform a full duplication check and just show a summary
        >dpcmd check-pool-fileparts P:\ 0
      - Perform a check on a specific directory and its sub-directories
        >dpcmd check-pool-fileparts P:\MyFolder
      - Perform a check on a specific directory and NOT its sub-directories
        >dpcmd check-pool-fileparts "P:\MyFolder\Specific Folder To Check" 2 false
      - Perform a check on one specific file
        >dpcmd check-pool-fileparts "P:\MyFolder\File To Check.exe"

    The above help text includes some concrete examples of how to use this command in various scenarios. To perform a basic check of an entire pool and get a summary back, you would simply type:
    dpcmd check-pool-fileparts P:\
     
    This will scan your entire pool and make sure that the correct number of file parts exist for each file. At the end of the scan you will get a summary:
    Scanning...
    ! Error: Can't get duplication information for '\\?\p:\System Volume Information\storageconfiguration.xml'. Access is denied

    Summary:
      Directories: 3,758
      Files: 47,507 3.71 TB (4,077,933,565,417
      File parts: 48,240 3.83 TB (4,214,331,221,046

      * Inconsistent directories: 0
      * Inconsistent files: 0
      * Missing file parts: 0 0 B (0

      ! Error reading directories: 0
      ! Error reading files: 1

    Any inconsistent files will be reported here, and any scan errors will be as well. For example, in this case I can't scan the System Volume Information folder because as an Administrator, I don't have the proper access to do that (LOCAL SYSTEM does).
     
    Another great use for this command is actually something that has been requested often, and that is the ability to generate audit logs. People want to be absolutely sure that each file on their pool is properly duplicated, and they want to know exactly where it's stored. This is where the maximum detail level of this command comes in handy:
    dpcmd check-pool-fileparts P:\ 4
     
    This will show you how many copies are stored of each file on your pool, and where they're stored.
     
    The output looks something like this:
    Detail level: File Parts

    Listing types:
      + Directory
      - File
      -> File part
      * Inconsistent duplication
      ! Error

    Listing format:
      [{0}/{1} IM] {2}
        {0} - The number of file parts that were found for this file / directory.
        {1} - The expected duplication count for this file / directory.
        I   - This directory is inheriting its duplication count from its parent.
        M   - At least one sub-directory may have a different duplication count.
        {2} - The name and size of this file / directory.

    ...
    + [3x/2x] p:\Media
      -> \Device\HarddiskVolume2\PoolPart.5823dcd3-485d-47bf-8cfa-4bc09ffca40e\Media [Device 0]
      -> \Device\HarddiskVolume3\PoolPart.6a76681a-3600-4af1-b877-a31815b868c8\Media [Device 0]
      -> \Device\HarddiskVolume8\PoolPart.d1033a47-69ef-453a-9fb4-337ec00b1451\Media [Device 2]
    - [2x/2x] p:\Media\commandN Episode 123.mov (80.3 MB - 84,178,119
      -> \Device\HarddiskVolume2\PoolPart.5823dcd3-485d-47bf-8cfa-4bc09ffca40e\Media\commandN Episode 123.mov [Device 0]
      -> \Device\HarddiskVolume8\PoolPart.d1033a47-69ef-453a-9fb4-337ec00b1451\Media\commandN Episode 123.mov [Device 2]
    - [2x/2x] p:\Media\commandN Episode 124.mov (80.3 MB - 84,178,119
      -> \Device\HarddiskVolume2\PoolPart.5823dcd3-485d-47bf-8cfa-4bc09ffca40e\Media\commandN Episode 124.mov [Device 0]
      -> \Device\HarddiskVolume8\PoolPart.d1033a47-69ef-453a-9fb4-337ec00b1451\Media\commandN Episode 124.mov [Device 2]
    ...

    The listing format and listing types are explained at the top, and then for each folder and file on the pool, a record like the above one is generated.
     
    Of course like any command output, it could always be piped into a log file like so:
    dpcmd check-pool-fileparts P:\ 4 > check-pool-fileparts.log
     
    I'm sure with a bit of scripting, people will be able to generate daily audit logs of their pool.
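As a rough illustration of the scripting idea, here is a minimal Python sketch that wraps the command shown above and writes one log file per day. The function names, the log file naming scheme, and the log folder are all assumptions made for this example and are not part of dpcmd itself. The sketch also assumes that dpcmd is on the PATH and does not interpret its exit code, since the post does not document one.

```python
import datetime
import subprocess
from pathlib import Path


def build_command(pool_path: str, detail_level: int = 4) -> list[str]:
    """Build the dpcmd invocation from the post (detail levels run from 0 to 4)."""
    if not 0 <= detail_level <= 4:
        raise ValueError("detail level must be between 0 and 4")
    return ["dpcmd", "check-pool-fileparts", pool_path, str(detail_level)]


def log_path_for(day: datetime.date, log_dir: Path) -> Path:
    """Hypothetical naming scheme: one log file per calendar day."""
    return log_dir / f"check-pool-fileparts-{day.isoformat()}.log"


def run_audit(pool_path: str, log_dir: Path) -> Path:
    """Run the check and redirect its output into today's log file."""
    log_dir.mkdir(parents=True, exist_ok=True)
    target = log_path_for(datetime.date.today(), log_dir)
    with target.open("w", encoding="utf-8") as log:
        # check=False: the exit code is not interpreted, only the output is kept.
        subprocess.run(build_command(pool_path), stdout=log, check=False)
    return target
```

Calling run_audit("P:\\", Path(r"C:\PoolAudit")) from an elevated prompt, for example from a daily scheduled task, would then produce the equivalent of the redirect shown above with a dated file name. Both paths are placeholders.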
     
    Now this is essentially the first version of this command, so if you have an idea on how to improve it, please let us know.
     
    Also, check out set-duplication-recursive. It lets you set the duplication count on multiple folders at once using a file pattern rule (or a regular expression). It's pretty cool.
     
    That's all for now.
  8. Like
    Alex got a reaction from Antoineki in Amazon Cloud Drive - Why is it not supported?   
    Some people have mentioned that we're not transparent enough with regards to what's going on with Amazon Cloud Drive support. That was definitely not the intention, and if that's how you guys feel, then I'm sorry about that and I want to change that. We really have nothing to gain by keeping any of this secret.
     
    Here, you will find a timeline of exactly what's happening with our ongoing communications with the Amazon Cloud Drive team, and I will keep this thread up to date as things develop.
     
    Timeline:
      • On May 28th the first public BETA of StableBit CloudDrive was released without Amazon Cloud Drive production access enabled. At the time, I thought that the semi-automated whitelisting process that we went through was "Production Access". While this is similar to other providers, like Dropbox, it became apparent that for Amazon Cloud Drive, it's not. Upon closer reading of their documentation, it appears that the whitelisting process actually imposed "Developer Access" on us. To get upgraded to "Production Access", Amazon Cloud Drive requires direct email communications with the Amazon Cloud Drive team.
      • We originally submitted our application for production access on Aug. 15th over email.
      • On Aug. 24th I wrote in requesting a status update, because no one had replied to me, so I had no idea whether the email was read or not.
      • On Aug. 27th I finally got an email back letting me know that our application was approved for production use. I was pleasantly surprised.
      • On Sept. 1st, after some testing, I wrote another email to Amazon letting them know that we are having issues with Amazon not respecting the Content-Type of uploads, and that we are also having issues with the new permission scopes that have been changed since we initially submitted our application for approval. No one answered this particular email until...
      • On Sept. 7th I received a panicked email from Amazon addressed to me (with CCs to other Amazon addresses) letting me know that Amazon is seeing unusual call patterns from one of our users.
      • On Sept. 11th I replied explaining that we do not keep track of what our customers are doing, and that our software can scale very well, as long as the server permits it and the network bandwidth is sufficient. Our software does respect 429 throttling responses from the server and it does perform exponential backoff, as is standard practice in such cases. Nevertheless, I offered to limit the number of threads that we use, or to apply any other limits that Amazon deems necessary on the client side. I highlighted on this page the key question that really needs to be answered. Also, please note that obviously Amazon knows who this user is, since they have to be their customer in order to be logged into Amazon Cloud Drive. Also note that instead of banning or throttling that particular customer, Amazon has chosen to block the entire user base of our application.
      • On Sept. 16th I received a response from Amazon.
      • By Sept. 21st I hadn't heard anything back yet, so I sent them this.
      • I waited until Oct. 29th and no one answered. At that point I informed them that we're going ahead with the next public BETA regardless.
      • Some time in the beginning of November an Amazon employee, on the Amazon forums, started claiming that we're not answering our emails. This was amidst their users asking why Amazon Cloud Drive is not supported with StableBit CloudDrive. So I did post a reply to that, listing a similar timeline.
      • On Nov. 11th I sent another email to them.
      • On Nov. 13th Amazon finally replied. I redacted the limits; I don't know if they want those public.
      • On Nov. 19th I sent them another email asking to clarify what the limits mean exactly.
      • On Nov. 25th Amazon replied with some questions and concerns regarding the number of API calls per second.
      • On Dec. 2nd I replied with the answers and a new build that implemented large chunk support. This was a fairly complicated change to our code designed to minimize the number of calls per second for large file uploads. You can read my Nuts & Bolts post on large chunk support here: http://community.covecube.com/index.php?/topic/1622-large-chunks-and-the-io-manager/
      • On Dec. 10th 2015, Amazon replied. I have no idea what they mean regarding the encrypted data size. AES-CBC is a 1:1 encryption scheme. Some number of bytes go in and the same exact number of bytes come out, encrypted. We do have some minor overhead for the checksum / authentication signature at the end of every 1 MB unit of data, but that's at most 64 bytes per 1 MB when using HMAC-SHA512 (which is 0.006 % overhead). You can easily verify this by creating an encrypted cloud drive of some size, filling it up to capacity, and then checking how much data is used on the server.

    Here's the data for a 5 GB encrypted drive:



    Yes, it's 5 GBs.  
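
The overhead figure quoted above is easy to check by hand. The following short Python sketch does the arithmetic, assuming that "1 MB" means 1,048,576 bytes and "5 GB" means 5 x 1024^3 bytes (the post does not say which convention is used), and that exactly one 64-byte HMAC-SHA512 signature is stored per unit. The function names are made up for this example.

```python
MIB = 1024 * 1024        # assumed size of one unit of data, in bytes
HMAC_SHA512_BYTES = 64   # an HMAC-SHA512 signature is 64 bytes long


def overhead_percent(unit_bytes: int = MIB, tag_bytes: int = HMAC_SHA512_BYTES) -> float:
    """Signature overhead as a percentage of the data it protects."""
    return 100.0 * tag_bytes / unit_bytes


def total_overhead_bytes(drive_bytes: int, unit_bytes: int = MIB,
                         tag_bytes: int = HMAC_SHA512_BYTES) -> int:
    """Total signature bytes for a drive, one signature per (possibly partial) unit."""
    units = -(-drive_bytes // unit_bytes)  # ceiling division
    return units * tag_bytes


print(f"{overhead_percent():.4f} %")             # 0.0061 %
print(total_overhead_bytes(5 * 1024**3))         # 327680
```

Under these assumptions a full 5 GB drive carries about 320 KB of signatures, which is consistent with the drive still showing up as 5 GB on the server.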
     
    To clarify the error issue here (I'm not 100% certain about this), Amazon doesn't provide a good way to ID these files. We have to try to upload them again, and then grab the error ID to get the actual file ID so we can update the file. This is inefficient and would be solved with a more robust API that included a search functionality, or a better file list call. So, this is basically "by design" and required currently.
    -Christopher

    Unfortunately, we haven't pursued this with Amazon recently. This is due to a number of big bugs that we have been following up on. However, these bugs have led to a lot of performance, stability and reliability fixes. And a lot of users have reported that these fixes have significantly improved the Amazon Cloud Drive provider. That is something that is great to hear, as it may help to get the provider to a stable/reliable state.
    That said, once we get to a more stable state (after the next public beta build (after 1.0.0.463) or a stable/RC release), we do plan on pursuing this again.
    But in the meanwhile, we have held off on this as we want to focus on the entire product rather than a single, problematic provider.
    -Christopher

    Amazon has "snuck in" additional guidelines that don't bode well for us.
    https://developer.amazon.com/public/apis/experience/cloud-drive/content/developer-guide
    Don’t build apps that encrypt customer data
    What does this mean for us? We have no idea right now.  Hopefully, this is a guideline and not a hard rule (other apps allow encryption, so that's hopeful, at least). 
     
    But if we don't get re-approved, we'll deal with that when the time comes (though, we will push hard to get approval).
     
    - Christopher (Jan. 15, 2017)
     
    If you haven't seen already, we've released a "gold" version of StableBit CloudDrive. Meaning that we have an official release! 
    Unfortunately, because of increasing issues with Amazon Cloud Drive that appear to be ENTIRELY server side (drive issues due to "429" errors, odd outages, etc), and because we are STILL not approved for production status (despite sending off additional emails a month ago, requesting approval or at least an update), we have dropped support for Amazon Cloud Drive.

    This does not impact existing users, as you will still be able to mount and use your existing drives. However, we have blocked the ability to create new drives for Amazon Cloud Drive.   
     
    This was not a decision that we made lightly, and while we don't regret this decision, we are saddened by it. We would have loved to come to some sort of outcome that included keeping full support for Amazon Cloud Drive. 
    -Christopher (May 17, 2017)
  9. Like
    Alex got a reaction from KiaraEvirm in How to Contribute Test Data   
    I've started putting up disk and controller test data in this forum, as it relates to the StableBit Scanner's ability to gather SMART and Identify data using various disk controllers.
     
    Direct I/O Test
     
    "Direct I/O" is a set of technologies that the StableBit Scanner uses to read data directly from the disk. I collect my test results using an internal Direct I/O testing tool.
     

     
    You can get the latest version here: Download
     
    The tool will probe your disk and controller for various forms of data that the StableBit Scanner uses and will display either a green check mark or a red X to indicate whether the probe was successful. At the bottom, it will list the probing "methods" that were successfully used to probe the controller / disk.
     
    If you're interested in contributing your test data to this forum, then just run the tool and select a disk that is connected to the controller that you want to probe.
     
    Make sure that your computer is not doing anything else while probing. There is a small chance that the probing process will crash your system.
  10. Like
    Alex got a reaction from Minaleque in Large Chunks and the I/O Manager   
    In this post I'm going to talk about the new large storage chunk support in StableBit CloudDrive 1.0.0.421 BETA, why that's important, and how StableBit CloudDrive manages provider I/O overall.
     
    Want to Download it?
     
    Currently, StableBit CloudDrive 1.0.0.421 BETA is an internal development build and like any of our internal builds, if you'd like, you can download it (or most likely a newer build) here:
    http://wiki.covecube.com/Downloads
     
    The I/O Manager
     
    Before I start talking about chunks and what the current change actually means, let's talk a bit about how StableBit CloudDrive handles provider I/O. Well, first let's define what provider I/O actually is. Provider I/O is the combination of all of the read and write requests (or download and upload requests) that are serviced by your provider of choice. For example, if your cloud drive is storing data in Amazon S3, provider I/O consists of all of the download and upload requests from and to Amazon S3.
     
    Now it's important to differentiate provider I/O from cloud drive I/O because provider I/O is not really the same thing as cloud drive I/O. That's because all I/O to and from the drive itself is performed directly in the kernel by our cloud drive driver (cloudfs_disk.sys). But as a result of some cloud drive I/O, provider I/O can be generated. For example, this happens when there is an incoming read request to the drive for some data that is not stored in the local cache. In this case, the kernel driver coordinates with the StableBit CloudDrive system service in order to generate provider I/O and to complete the incoming read request in a timely manner.
     
    All provider I/O is serviced by the I/O Manager, which lives in the StableBit CloudDrive system service.
     
    Particularly, the I/O Manager is responsible for:
      • As an optimization, coalescing incoming provider read and write I/O requests into larger requests.
      • Parallelizing all provider read and write I/O requests using multiple threads.
      • Retrying failed provider I/O operations.
      • Error handling and error reporting logic.

    Chunks
     
    Now that I've described a little bit about the I/O manager in StableBit CloudDrive, let's talk chunks. StableBit CloudDrive doesn't inherently work with any types of chunks. They are simply the format in which data is stored by your provider of choice. They are an implementation that exists completely outside of the I/O manager, and provide some convenient functions that all chunked providers can use.
     
    How do Chunks Work?
     
    When a chunked cloud provider is first initialized, it is asked about its capabilities, such as whether it can perform partial reads, whether the remote server is performing proper multi-threaded transactional synchronization, etc... In other words, the chunking system needs to know how advanced the provider is, and based on those capabilities it will construct a custom chunk I/O processing pipeline for that particular provider.
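
To make the idea of a capability-driven pipeline a little more concrete, here is a small Python sketch. It is purely illustrative: the capability flags, the stage names, and the build_pipeline function are invented for this example and do not reflect the actual StableBit CloudDrive code, which is not shown in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderCapabilities:
    # Hypothetical flags; the real capability set is not documented here.
    supports_partial_reads: bool
    supports_partial_writes: bool
    server_synchronizes_transactions: bool


def build_pipeline(caps: ProviderCapabilities, checksums_enabled: bool = True) -> list[str]:
    """Return the names of the stages that a provider with these capabilities would need."""
    stages = ["chunk-cache"]
    if checksums_enabled:
        stages.append("checksum")
    if not caps.server_synchronizes_transactions:
        # The server does not serialize access, so the client has to lock.
        stages.append("transactional-locking")
    if not caps.supports_partial_reads:
        stages.append("partial-read-to-whole-read")
    if not caps.supports_partial_writes:
        stages.append("partial-write-to-whole-write")
    return stages


basic = ProviderCapabilities(False, False, False)
print(build_pipeline(basic))
```

The point of the sketch is only that a less capable provider ends up with more stages in front of it, while a fully capable one needs little more than caching and checksumming.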
     
    The chunk I/O pipeline provides automatic services for the provider such as:
      • Whole and partial caching of chunks for performance reasons.
      • Performing checksum insertion on write, and checksum verification on read.
      • Read or write (or both) transactional locking for cloud providers that require it (for example, never try to read chunk 458 when chunk 458 is being written to).
      • Translation of I/O that would end up being a partial chunk read / write request into a whole chunk read / write request for providers that require this. This is actually very complicated.
          • If a partial chunk needs to be read, and the provider doesn't support partial reads, the whole chunk is read (and possibly cached) and only the part needed is returned.
          • If a partial chunk needs to be written, and the provider doesn't support partial writes, then the whole chunk is downloaded (or retrieved from the cache), only the part that needs to be written to is updated, and the whole chunk is written back.
          • If while this is happening another partial write request comes in for the same chunk (in parallel, on a different thread), and we're still in the process of reading that whole chunk, then coalesce the [whole read -> partial write -> whole write] into [whole read -> multiple partial writes -> whole write]. This is purely done as an optimization and is also very complicated.

    And in the future the chunk I/O processing pipeline can be extended to support other services as the need arises.

    Large Chunk Support
     
    Speaking of extending the chunk I/O pipeline, that's exactly what happened recently with the addition of large chunk support (> 1 MB) for most cloud providers.
     
    Previously, most cloud providers were limited to a maximum chunk size of 1 MB. This limit was in place because:
      • Cloud drive read I/O requests, which can't be satisfied by the local cache, would generate provider read I/O that needed to be satisfied fairly quickly. For providers that didn't support partial reads, this meant that the entire chunk needed to be downloaded at all times, no matter how much data was being read.
      • Additionally, if checksumming was enabled (which would be typical), then by necessity, only whole chunks could be read and written.

    This had some disadvantages, mostly for users with fast broadband connections:
      • Writing a lot of data to the provider would generate a lot of upload requests very quickly (one request per 1 MB uploaded). This wasn't optimal because each request would add some overhead.
      • Generating a lot of upload requests very quickly was also an issue for some cloud providers that were limiting their users based on the number of requests per second, rather than the total bandwidth used. Using smaller chunks with a fast broadband connection and a lot of threads would generate a lot of requests per second.

    Now, with large chunk support (up to 100 MB per chunk in most cases), we don't have those disadvantages.
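
A quick back-of-the-envelope calculation shows why the chunk size matters for request counts. The Python sketch below assumes that every chunk is uploaded whole with exactly one request and that 1 MB is 1,048,576 bytes. Both are simplifications made for this example, and the function names are made up.

```python
import math

MB = 1024 * 1024  # assumed meaning of "1 MB" in this example


def upload_requests(total_bytes: int, chunk_bytes: int) -> int:
    """Number of whole-chunk uploads needed to write total_bytes of new data."""
    return math.ceil(total_bytes / chunk_bytes)


def requests_per_second(upload_bytes_per_second: float, chunk_bytes: int) -> float:
    """Request rate needed to keep an uplink of the given speed saturated."""
    return upload_bytes_per_second / chunk_bytes


ten_gb = 10 * 1024 * MB
print(upload_requests(ten_gb, 1 * MB))          # 10240
print(upload_requests(ten_gb, 100 * MB))        # 103
print(requests_per_second(12.5 * MB, 1 * MB))   # 12.5
```

With 1 MB chunks, uploading 10 GB takes 10,240 requests, while 100 MB chunks bring that down to 103. That is the kind of reduction that helps with providers that limit requests per second rather than bandwidth.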
     
    What was changed to allow for large chunks?
      • In order to support the new large chunks a provider has to support partial reads. That's because it's still very necessary to ensure that all cloud drive read I/O is serviced quickly.
      • Support for a new block based checksumming algorithm was introduced into the chunk I/O pipeline. With this new algorithm it's no longer necessary to read or write whole chunks in order to get checksumming support. This was crucial because it is very important to verify that your data in the cloud is not corrupted, and turning off checksumming wasn't a very good option.

    Are there any disadvantages to large chunks?
      • If you're concerned about using the least possible amount of bandwidth (as opposed to using fewer provider calls per second), it may be advantageous to use smaller chunks.
      • If you know for sure that you will be storing relatively small files (1-2 MB per file or less) and you will only be updating a few files at a time, there may be less overhead when you use smaller chunks.
      • For providers that don't support partial writes (most cloud providers), larger chunks are more optimal if most of your files are > 1 MB, or if you will be updating a lot of smaller files all at the same time.
      • As far as checksumming, the new algorithm completely supersedes the old one and is enabled by default on all new cloud drives. It really has no disadvantages over the old algorithm. Older cloud drives will continue to use the old checksumming algorithm, which really only has the disadvantage of not supporting large chunks.

    Which providers currently support large chunks?
    I don't want to post a list here because it would get obsolete as we add more providers. When you create a new cloud drive, look under Advanced Settings. If you see a storage chunk size > 1 MB available, then that provider supports large chunks.  
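
Returning to the block based checksumming described earlier in this section, the following Python sketch shows the general idea of verifying only the blocks that a partial read touches instead of the whole chunk. It is a simplified illustration and not the actual StableBit CloudDrive algorithm: the block size, the use of SHA-256, and the function names are all assumptions, and the chunk is held in memory where a real provider would fetch only the needed byte range.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow; an assumption


def block_digests(chunk: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Compute one digest per fixed-size block of the chunk."""
    return [
        hashlib.sha256(chunk[i:i + block_size]).digest()
        for i in range(0, len(chunk), block_size)
    ]


def read_verified(chunk: bytes, digests: list[bytes], offset: int, length: int,
                  block_size: int = BLOCK_SIZE) -> bytes:
    """Return chunk[offset:offset + length], verifying only the blocks it touches."""
    if offset < 0 or length < 0 or offset + length > len(chunk):
        raise ValueError("requested range is outside the chunk")
    if length == 0:
        return b""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    for index in range(first, last + 1):
        block = chunk[index * block_size:(index + 1) * block_size]
        if hashlib.sha256(block).digest() != digests[index]:
            raise ValueError(f"checksum mismatch in block {index}")
    return chunk[offset:offset + length]


data = b"abcdefghijkl"
digests = block_digests(data)
print(read_verified(data, digests, 5, 2))  # b'fg'
```

Because each block carries its own checksum, a small read from a 100 MB chunk only needs the few blocks it overlaps, which is what makes checksumming compatible with partial reads.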
    Going Chunk-less in the Future?
     
    I should mention that StableBit CloudDrive doesn't actually require chunks. If it at all makes sense for a particular provider, it really doesn't have to store its data in chunks. For example, it's entirely possible that StableBit CloudDrive will have a provider that stores its data in a VHDX file. Sure, why not. For that matter, StableBit CloudDrive providers don't even need to store their data in files at all. I can imagine writing providers that can store their data on an email server or an NNTP server (a bit of a stretch, yes, but theoretically possible).
     
    In fact, the only thing that StableBit CloudDrive really needs is some system that can save data and later retrieve it (in a timely manner). In that sense, StableBit CloudDrive can be a general purpose drive virtualization solution.
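
The "save data and later retrieve it" requirement can be pictured as a very small interface. The Python sketch below is hypothetical: StorageBackend, InMemoryBackend, and round_trip are names invented for this example and are not the real StableBit CloudDrive provider API.

```python
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical minimal contract: save some bytes and get them back later."""

    def save(self, key: str, data: bytes) -> None: ...

    def load(self, key: str) -> bytes: ...


class InMemoryBackend:
    """Toy backend that keeps everything in a dictionary."""

    def __init__(self) -> None:
        self._items: dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        self._items[key] = bytes(data)

    def load(self, key: str) -> bytes:
        return self._items[key]


def round_trip(backend: StorageBackend, key: str, data: bytes) -> bytes:
    """Save the data and immediately read it back through the same backend."""
    backend.save(key, data)
    return backend.load(key)


print(round_trip(InMemoryBackend(), "unit-0", b"hello"))  # b'hello'
```

Anything that can satisfy a contract like this in a timely manner, whether it is a file store, a VHDX file, or something more exotic, could in principle sit behind a virtual drive.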
  11. Like
    Alex got a reaction from Minaleque in How to Contribute Test Data   
  12. Like
    Alex got a reaction from Minaleque in SSD Optimizer Balancing Plugin   
    I've just finished coding a new balancing plugin for StableBit DrivePool called the SSD Optimizer. This was actually a feature request, so here you go.
     
    I know that a lot of people use the Archive Optimizer plugin with SSD drives and I would like this plugin to replace the Archive Optimizer for that kind of balancing scenario.
     
    The idea behind the SSD Optimizer (as it was with the Archive Optimizer) is that if you have one or more SSDs in your pool, they can serve as "landing zones" for new files, and those files will later be automatically migrated to your slower spinning drives. Thus your SSDs would serve as a kind of super fast write buffer for the pool.
     
    The new functionality of the SSD Optimizer is that now it's able to fill your Archive disks one at a time, like the Ordered File Placement plugin, but with support for SSD "Feeder" disks.
     
    Check out the attached screenshot of what it looks like.
     
    Notes: http://dl.covecube.com/DrivePoolBalancingPlugins/SsdOptimizer/Notes.txt
    Download: http://stablebit.com/DrivePool/Plugins
     
    Edit: Now up on stablebit.com, link updated.

  13. Like
    Alex got a reaction from Minaleque in Forum Downtime   
    Our forum, wiki and blog web sites experienced an issue with the database server that caused those sites to be down for the past 34 hours. The issue has been resolved and everything is back up and running. I'm sorry for the inconvenience.
     
    StableBit.com, the download server, and software activation services were not affected.
  14. Like
    Alex got a reaction from Antoineki in How to Contribute Test Data   
  15. Like
    Alex got a reaction from jaynew in Large Chunks and the I/O Manager   
    In this post I'm going to talk about the new large storage chunk support in StableBit CloudDrive 1.0.0.421 BETA, why that's important, and how StableBit CloudDrive manages provider I/O overall.
     
    Want to Download it?
     
    Currently, StableBit CloudDrive 1.0.0.421 BETA is an internal development build and like any of our internal builds, if you'd like, you can download it (or most likely a newer build) here:
    http://wiki.covecube.com/Downloads
     
    The I/O Manager
     
    Before I start talking about chunks and what the current change actually means, let's talk a bit about how StableBit CloudDrive handles provider I/O. Well, first let's define what provider I/O actually is. Provider I/O is the combination of all of the read and write requests (or download and upload requests) that are serviced by your provider of choice. For example, if your cloud drive is storing data in Amazon S3, provider I/O consists of all of the download and upload requests from and to Amazon S3.
     
    Now it's important to differentiate provider I/O from cloud drive I/O, because the two are not really the same thing. That's because all I/O to and from the drive itself is performed directly in the kernel by our cloud drive driver (cloudfs_disk.sys). But as a result of some cloud drive I/O, provider I/O can be generated. For example, this happens when there is an incoming read request to the drive for some data that is not stored in the local cache. In this case, the kernel driver cooperatively coordinates with the StableBit CloudDrive system service in order to generate provider I/O and to complete the incoming read request in a timely manner.
     
    All provider I/O is serviced by the I/O Manager, which lives in the StableBit CloudDrive system service.
     
    Particularly, the I/O Manager is responsible for:
    - As an optimization, coalescing incoming provider read and write I/O requests into larger requests.
    - Parallelizing all provider read and write I/O requests using multiple threads.
    - Retrying failed provider I/O operations.
    - Error handling and error reporting logic.
     
    Chunks
     
    Now that I've described a little bit about the I/O manager in StableBit CloudDrive, let's talk chunks. StableBit CloudDrive doesn't inherently work with any types of chunks. They are simply the format in which data is stored by your provider of choice. They are an implementation that exists completely outside of the I/O manager, and provide some convenient functions that all chunked providers can use.
     
    How do Chunks Work?
     
    When a chunked cloud provider is first initialized, it is asked about its capabilities, such as whether it can perform partial reads, whether the remote server is performing proper multi-threaded transactional synchronization, etc... In other words, the chunking system needs to know how advanced the provider is, and based on those capabilities it will construct a custom chunk I/O processing pipeline for that particular provider.
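     
    As a rough illustration of how a pipeline could be assembled from reported capabilities, here is a minimal Python sketch. The capability names (partial_reads, partial_writes, server_side_locking), the stage names and the build_pipeline function are all hypothetical and are not taken from the StableBit CloudDrive code; they only mirror the idea described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderCapabilities:
    """Hypothetical capability flags that a chunked provider might report."""
    partial_reads: bool
    partial_writes: bool
    server_side_locking: bool


def build_pipeline(caps, checksumming=True):
    """Return the ordered list of (made-up) stage names for this provider."""
    stages = ["cache"]  # whole and partial chunk caching is always useful
    if checksumming:
        stages.append("checksum")
    if not caps.server_side_locking:
        # The remote side does not serialize access, so lock on the client.
        stages.append("chunk-lock")
    if not caps.partial_reads:
        # Partial reads must be turned into whole chunk reads.
        stages.append("whole-chunk-read")
    if not caps.partial_writes:
        # Partial writes become: read whole chunk, patch it, write it back.
        stages.append("read-modify-write")
    return stages


if __name__ == "__main__":
    simple_provider = ProviderCapabilities(
        partial_reads=False, partial_writes=False, server_side_locking=False
    )
    print(build_pipeline(simple_provider))
```

    The point of the sketch is only that a less capable provider ends up with more client-side stages, while a more capable one gets a shorter pipeline.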
     
    The chunk I/O pipeline provides automatic services for the provider such as:
    - Whole and partial caching of chunks for performance reasons.
    - Performing checksum insertion on write, and checksum verification on read.
    - Read or write (or both) transactional locking for cloud providers that require it (for example, never try to read chunk 458 when chunk 458 is being written to).
    - Translation of I/O that would end up being a partial chunk read / write request into a whole chunk read / write request for providers that require this. This is actually very complicated.
        - If a partial chunk needs to be read, and the provider doesn't support partial reads, the whole chunk is read (and possibly cached) and only the part needed is returned.
        - If a partial chunk needs to be written, and the provider doesn't support partial writes, then the whole chunk is downloaded (or retrieved from the cache), only the part that needs to be written to is updated, and the whole chunk is written back.
        - If while this is happening another partial write request comes in for the same chunk (in parallel, on a different thread), and we're still in the process of reading that whole chunk, then coalesce the [whole read -> partial write -> whole write] into [whole read -> multiple partial writes -> whole write]. This is purely done as an optimization and is also very complicated.
     
    And in the future the chunk I/O processing pipeline can be extended to support other services as the need arises.
     
    Large Chunk Support
     
    Speaking of extending the chunk I/O pipeline, that's exactly what happened recently with the addition of large chunk support (> 1 MB) for most cloud providers.
     
    Previously, most cloud providers were limited to a maximum chunk size of 1 MB. This limit was in place because:
    - Cloud drive read I/O requests, which can't be satisfied by the local cache, would generate provider read I/O that needed to be satisfied fairly quickly.
    - For providers that didn't support partial reads, this meant that the entire chunk needed to be downloaded at all times, no matter how much data was being read.
    - Additionally, if checksumming was enabled (which would be typical), then by necessity, only whole chunks could be read and written.
     
    This had some disadvantages, mostly for users with fast broadband connections:
    - Writing a lot of data to the provider would generate a lot of upload requests very quickly (one request per 1 MB uploaded). This wasn't optimal because each request would add some overhead.
    - Generating a lot of upload requests very quickly was also an issue for some cloud providers that were limiting their users based on the number of requests per second, rather than the total bandwidth used. Using smaller chunks with a fast broadband connection and a lot of threads would generate a lot of requests per second.
     
    Now, with large chunk support (up to 100 MB per chunk in most cases), we don't have those disadvantages.
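     
    To put rough numbers on the request counts discussed above, here is a small Python sketch. It assumes one upload request per whole chunk, treats 1 MB as 1 MiB, and ignores retries and partial writes; the function names and the example figures (a 10 GiB upload, 12.5 MiB/s of upload bandwidth) are illustrative assumptions only.

```python
MIB = 1024 * 1024


def upload_requests(total_bytes, chunk_bytes):
    """Number of whole-chunk upload requests needed (ceiling division)."""
    return -(-total_bytes // chunk_bytes)


def requests_per_second(upload_bytes_per_second, chunk_bytes):
    """Request rate when the upload link is kept fully saturated."""
    return upload_bytes_per_second / chunk_bytes


if __name__ == "__main__":
    total = 10 * 1024 * MIB   # hypothetical 10 GiB upload
    link = 12.5 * MIB         # hypothetical upload bandwidth in bytes/second
    for chunk_mib in (1, 100):
        chunk = chunk_mib * MIB
        print(chunk_mib, upload_requests(total, chunk), requests_per_second(link, chunk))
```

    Under those assumptions, 10 GiB takes 10240 requests with 1 MiB chunks and 103 requests with 100 MiB chunks, and the request rate on a saturated link drops by the same factor of 100.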
     
    What was changed to allow for large chunks?
    - In order to support the new large chunks a provider has to support partial reads. That's because it's still very necessary to ensure that all cloud drive read I/O is serviced quickly.
    - Support for a new block based checksumming algorithm was introduced into the chunk I/O pipeline. With this new algorithm it's no longer necessary to read or write whole chunks in order to get checksumming support. This was crucial because it is very important to verify that your data in the cloud is not corrupted, and turning off checksumming wasn't a very good option.
     
    Are there any disadvantages to large chunks?
    - If you're concerned about using the least possible amount of bandwidth (as opposed to using fewer provider calls per second), it may be advantageous to use smaller chunks. If you know for sure that you will be storing relatively small files (1-2 MB per file or less) and you will only be updating a few files at a time, there may be less overhead when you use smaller chunks.
    - For providers that don't support partial writes (most cloud providers), larger chunks are a better fit if most of your files are > 1 MB, or if you will be updating a lot of smaller files all at the same time.
    - As far as checksumming, the new algorithm completely supersedes the old one and is enabled by default on all new cloud drives. It really has no disadvantages over the old algorithm. Older cloud drives will continue to use the old checksumming algorithm, which really only has the disadvantage of not supporting large chunks.
     
    Which providers currently support large chunks?
    I don't want to post a list here because it would get obsolete as we add more providers. When you create a new cloud drive, look under Advanced Settings. If you see a storage chunk size > 1 MB available, then that provider supports large chunks.  
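     
    The post does not describe the actual block based checksumming algorithm, so the following Python sketch is only a generic illustration of why per-block checksums remove the need to read whole chunks. The 64 KiB block size, the use of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # assumed block size, not taken from CloudDrive


def block_checksums(chunk, block_size=BLOCK_SIZE):
    """Compute one digest per fixed-size block of a chunk."""
    return [
        hashlib.sha256(chunk[i:i + block_size]).digest()
        for i in range(0, len(chunk), block_size)
    ]


def verified_read(read_range, checksums, offset, length, block_size=BLOCK_SIZE):
    """Read [offset, offset + length) by fetching only the covering blocks.

    read_range(start, size) stands in for a provider partial read.
    length must be greater than zero.
    """
    first = offset // block_size
    last = (offset + length - 1) // block_size
    start = first * block_size
    data = read_range(start, (last - first + 1) * block_size)
    for n in range(first, last + 1):
        block = data[(n - first) * block_size:(n - first + 1) * block_size]
        if hashlib.sha256(block).digest() != checksums[n]:
            raise ValueError(f"checksum mismatch in block {n}")
    return data[offset - start:offset - start + length]
```

    With whole-chunk checksums, verifying even a 10 byte read means downloading the entire chunk. With per-block checksums, only the blocks that cover the requested range have to be fetched and verified, which is what makes large chunks practical together with partial reads.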
    Going Chunk-less in the Future?
     
    I should mention that StableBit CloudDrive doesn't actually require chunks. If it at all makes sense for a particular provider, it really doesn't have to store its data in chunks. For example, it's entirely possible that StableBit CloudDrive will have a provider that stores its data in a VHDX file. Sure, why not. For that matter, StableBit CloudDrive providers don't even need to store their data in files at all. I can imagine writing providers that can store their data on an email server or an NNTP server (a bit of a stretch, yes, but theoretically possible).
     
    In fact, the only thing that StableBit CloudDrive really needs is some system that can save data and later retrieve it (in a timely manner). In that sense, StableBit CloudDrive can be a general purpose drive virtualization solution.
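     
    As a loose sketch of that "save data and later retrieve it" contract, here is what a minimal storage interface could look like in Python. The BlobStore protocol, the InMemoryStore class and the sector helper functions are hypothetical names invented for this example; they are not the StableBit CloudDrive provider API.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical minimal contract: store bytes under a key, get them back."""

    def save(self, key: str, data: bytes) -> None: ...

    def load(self, key: str) -> bytes: ...


class InMemoryStore:
    """Toy backend; a real one could sit on files, a VHDX, or a mail server."""

    def __init__(self):
        self._blobs = {}

    def save(self, key, data):
        self._blobs[key] = bytes(data)

    def load(self, key):
        return self._blobs[key]


def write_sector(store: BlobStore, sector: int, data: bytes) -> None:
    """Persist one sector's worth of data under a derived key."""
    store.save(f"sector-{sector}", data)


def read_sector(store: BlobStore, sector: int) -> bytes:
    """Fetch a previously written sector."""
    return store.load(f"sector-{sector}")
```

    Anything that can satisfy those two calls quickly enough could, in principle, sit behind a virtual drive.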
  16. Like
    Alex got a reaction from Antoineki in SSD Optimizer Balancing Plugin   
  17. Like
    Alex got a reaction from Antoineki in Forum Downtime   
  18. Like
    Alex got a reaction from Antoineki in Large Chunks and the I/O Manager   
  19. Like
    Alex got a reaction from Ginoliggime in Amazon Cloud Drive - Why is it not supported?   
    Some people have mentioned that we're not transparent enough with regards to what's going on with Amazon Cloud Drive support. That was definitely not the intention, and if that's how you guys feel, then I'm sorry about that and I want to change that. We really have nothing to gain by keeping any of this secret.
     
    Here, you will find a timeline of exactly what's happening with our ongoing communications with the Amazon Cloud Drive team, and I will keep this thread up to date as things develop.
     
    Timeline:
    - On May 28th the first public BETA of StableBit CloudDrive was released without Amazon Cloud Drive production access enabled. At the time, I thought that the semi-automated whitelisting process that we went through was "Production Access". While this is similar to other providers, like Dropbox, it became apparent that for Amazon Cloud Drive, it's not. Upon closer reading of their documentation, it appears that the whitelisting process actually imposed "Developer Access" on us. To get upgraded to "Production Access", Amazon Cloud Drive requires direct email communications with the Amazon Cloud Drive team.
    - We originally submitted our application for production access approval on Aug. 15th over email.
    - On Aug. 24th I wrote in requesting a status update, because no one had replied to me, so I had no idea whether the email was read or not.
    - On Aug. 27th I finally got an email back letting me know that our application was approved for production use. I was pleasantly surprised.
    - On Sept. 1st, after some testing, I wrote another email to Amazon letting them know that we are having issues with Amazon not respecting the Content-Type of uploads, and that we are also having issues with the new permission scopes that have been changed since we initially submitted our application for approval. No one answered this particular email until...
    - On Sept. 7th I received a panicked email from Amazon addressed to me (with CCs to other Amazon addresses) letting me know that Amazon is seeing unusual call patterns from one of our users.
    - On Sept. 11th I replied explaining that we do not keep track of what our customers are doing, and that our software can scale very well, as long as the server permits it and the network bandwidth is sufficient. Our software does respect 429 throttling responses from the server and it does perform exponential backoff, as is standard practice in such cases.
      Nevertheless, I offered to limit the number of threads that we use, or to apply any other limits that Amazon deems necessary on the client side. I highlighted on this page the key question that really needs to be answered. Also, please note that obviously Amazon knows who this user is, since they have to be their customer in order to be logged into Amazon Cloud Drive. Also note that instead of banning or throttling that particular customer, Amazon has chosen to block the entire user base of our application.
    - On Sept. 16th I received a response from Amazon.
    - On Sept. 21st I hadn't heard anything back yet, so I sent them a follow-up.
    - I waited until Oct. 29th and no one answered. At that point I informed them that we're going ahead with the next public BETA regardless.
    - Some time in the beginning of November an Amazon employee, on the Amazon forums, started claiming that we're not answering our emails. This was amidst their users asking why Amazon Cloud Drive is not supported with StableBit CloudDrive. So I did post a reply to that, listing a similar timeline.
    - On Nov. 11th I sent another email to them.
    - On Nov. 13th Amazon finally replied. I redacted the limits, as I don't know if they want those public.
    - On Nov. 19th I sent them another email asking to clarify what the limits mean exactly.
    - On Nov. 25th Amazon replied with some questions and concerns regarding the number of API calls per second.
    - On Dec. 2nd I replied with the answers and a new build that implemented large chunk support. This was a fairly complicated change to our code designed to minimize the number of calls per second for large file uploads.
      You can read my Nuts & Bolts post on large chunk support here: http://community.covecube.com/index.php?/topic/1622-large-chunks-and-the-io-manager/
    - On Dec. 10th 2015, Amazon replied.
      I have no idea what they mean regarding the encrypted data size. AES-CBC is a 1:1 encryption scheme. Some number of bytes go in and the same exact number of bytes come out, encrypted. We do have some minor overhead for the checksum / authentication signature at the end of every 1 MB unit of data, but that's at most 64 bytes per 1 MB when using HMAC-SHA512 (which is 0.006 % overhead). You can easily verify this by creating an encrypted cloud drive of some size, filling it up to capacity, and then checking how much data is used on the server.

    Here's the data for a 5 GB encrypted drive:



    Yes, it's 5 GBs.  
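     
    The overhead figure quoted above can be checked with a few lines of Python. This sketch assumes that "1 MB" means 1 MiB (1,048,576 bytes) and that exactly one 64 byte HMAC-SHA512 signature is stored per unit; the function names are made up for the example.

```python
import hashlib

UNIT_BYTES = 1024 * 1024                          # assumed: 1 MB unit = 1 MiB
SIGNATURE_BYTES = hashlib.sha512().digest_size    # SHA-512 digests are 64 bytes


def overhead_percent(unit_bytes=UNIT_BYTES, signature_bytes=SIGNATURE_BYTES):
    """Signature overhead as a percentage of the data it protects."""
    return 100.0 * signature_bytes / unit_bytes


def total_overhead_bytes(drive_bytes, unit_bytes=UNIT_BYTES,
                         signature_bytes=SIGNATURE_BYTES):
    """Total signature bytes for a drive of the given size."""
    units = -(-drive_bytes // unit_bytes)  # ceiling division
    return units * signature_bytes


if __name__ == "__main__":
    print(overhead_percent())
    print(total_overhead_bytes(5 * 1024 ** 3))
```

    Under those assumptions the overhead works out to roughly 0.0061 %, and a full 5 GiB drive carries 327,680 bytes (320 KiB) of signatures, which is far too small to show up in a size reported in GB.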
     
    To clarify the error issue here (I'm not 100% certain about this), Amazon doesn't provide a good way to ID these files. We have to try to upload them again, and then grab the error ID to get the actual file ID so we can update the file. This is inefficient and would be solved with a more robust API that included a search functionality, or a better file list call. So, this is basically "by design" and required currently.
    -Christopher
     
    Unfortunately, we haven't pursued this with Amazon recently. This is due to a number of big bugs that we have been following up on. However, these bugs have led to a lot of performance, stability and reliability fixes. And a lot of users have reported that these fixes have significantly improved the Amazon Cloud Drive provider. That is something that is great to hear, as it may help to get the provider to a stable/reliable state.
     
    That said, once we get to a more stable state (after the next public beta build (after 1.0.0.463) or a stable/RC release), we do plan on pursuing this again.
     
    But in the meanwhile, we have held off on this as we want to focus on the entire product rather than a single, problematic provider.
    -Christopher
     
    Amazon has "snuck in" additional guidelines that don't bode well for us.
    https://developer.amazon.com/public/apis/experience/cloud-drive/content/developer-guide
    Don’t build apps that encrypt customer data
    What does this mean for us? We have no idea right now.  Hopefully, this is a guideline and not a hard rule (other apps allow encryption, so that's hopeful, at least). 
     
    But if we don't get re-approved, we'll deal with that when the time comes (though, we will push hard to get approval).
     
    - Christopher (Jan. 15, 2017)
     
    If you haven't seen already, we've released a "gold" version of StableBit CloudDrive. Meaning that we have an official release! 
    Unfortunately, because of increasing issues with Amazon Cloud Drive that appear to be ENTIRELY server side (drive issues due to "429" errors, odd outages, etc.), and because we are STILL not approved for production status (despite sending off additional emails a month ago, requesting approval or at least an update), we have dropped support for Amazon Cloud Drive.

    This does not impact existing users, as you will still be able to mount and use your existing drives. However, we have blocked the ability to create new drives for Amazon Cloud Drive.   
     
    This was not a decision that we made lightly, and while we don't regret this decision, we are saddened by it. We would have loved to come to some sort of outcome that included keeping full support for Amazon Cloud Drive. 
    -Christopher (May 17, 2017)
  20. Like
    Alex got a reaction from Ginoliggime in Forum Downtime   
  21. Like
    Alex got a reaction from KiaraEvirm in Forum Downtime   
  22. Like
    Alex got a reaction from KiaraEvirm in Large Chunks and the I/O Manager   
    In this post I'm going to talk about the new large storage chunk support in StableBit CloudDrive 1.0.0.421 BETA, why that's important, and how StableBit CloudDrive manages provider I/O overall.
     
    Want to Download it?
     
    Currently, StableBit CloudDrive 1.0.0.421 BETA is an internal development build and like any of our internal builds, if you'd like, you can download it (or most likely a newer build) here:
    http://wiki.covecube.com/Downloads
     
    The I/O Manager
     
    Before I start talking about chunks and what the current change actually means, let's talk a bit about how StableBit CloudDrive handles provider I/O. Well, first let's define what provider I/O actually is. Provider I/O is the combination of all of the read and write requests (or download and upload requests) that are serviced by your provider of choice. For example, if your cloud drive is storing data in Amazon S3, provider I/O consists of all of the download and upload requests from and to Amazon S3.
     
    Now it's important to differentiate provider I/O from cloud drive I/O, because the two are not really the same thing. That's because all I/O to and from the drive itself is performed directly in the kernel by our cloud drive driver (cloudfs_disk.sys). But as a result of some cloud drive I/O, provider I/O can be generated. For example, this happens when there is an incoming read request to the drive for some data that is not stored in the local cache. In this case, the kernel driver cooperatively coordinates with the StableBit CloudDrive system service in order to generate provider I/O and to complete the incoming read request in a timely manner.
     
    All provider I/O is serviced by the I/O Manager, which lives in the StableBit CloudDrive system service.
     
    Particularly, the I/O Manager is responsible for:
    - As an optimization, coalescing incoming provider read and write I/O requests into larger requests.
    - Parallelizing all provider read and write I/O requests using multiple threads.
    - Retrying failed provider I/O operations.
    - Error handling and error reporting logic.
     
    Chunks
     
    Now that I've described a little bit about the I/O manager in StableBit CloudDrive, let's talk chunks. StableBit CloudDrive doesn't inherently work with any types of chunks. They are simply the format in which data is stored by your provider of choice. They are an implementation that exists completely outside of the I/O manager, and provide some convenient functions that all chunked providers can use.
     
    How do Chunks Work?
     
    When a chunked cloud provider is first initialized, it is asked about its capabilities, such as whether it can perform partial reads, whether the remote server is performing proper multi-threaded transactional synchronization, etc... In other words, the chunking system needs to know how advanced the provider is, and based on those capabilities it will construct a custom chunk I/O processing pipeline for that particular provider.
     
    The chunk I/O pipeline provides automatic services for the provider such as:
    - Whole and partial caching of chunks for performance reasons.
    - Performing checksum insertion on write, and checksum verification on read.
    - Read or write (or both) transactional locking for cloud providers that require it (for example, never try to read chunk 458 when chunk 458 is being written to).
    - Translation of I/O that would end up being a partial chunk read / write request into a whole chunk read / write request for providers that require this. This is actually very complicated.
        - If a partial chunk needs to be read, and the provider doesn't support partial reads, the whole chunk is read (and possibly cached) and only the part needed is returned.
        - If a partial chunk needs to be written, and the provider doesn't support partial writes, then the whole chunk is downloaded (or retrieved from the cache), only the part that needs to be written to is updated, and the whole chunk is written back.
        - If while this is happening another partial write request comes in for the same chunk (in parallel, on a different thread), and we're still in the process of reading that whole chunk, then coalesce the [whole read -> partial write -> whole write] into [whole read -> multiple partial writes -> whole write]. This is purely done as an optimization and is also very complicated.
     
    And in the future the chunk I/O processing pipeline can be extended to support other services as the need arises.
     
    Large Chunk Support
     
    Speaking of extending the chunk I/O pipeline, that's exactly what happened recently with the addition of large chunk support (> 1 MB) for most cloud providers.
     
    Previously, most cloud providers were limited to a maximum chunk size of 1 MB. This limit was in place because:
    • Cloud drive read I/O requests, which can't be satisfied by the local cache, would generate provider read I/O that needed to be satisfied fairly quickly. For providers that didn't support partial reads, this meant that the entire chunk needed to be downloaded at all times, no matter how much data was being read.
    • Additionally, if checksumming was enabled (which would be typical), then by necessity, only whole chunks could be read and written.
    This had some disadvantages, mostly for users with fast broadband connections:
    • Writing a lot of data to the provider would generate a lot of upload requests very quickly (one request per 1 MB uploaded). This wasn't optimal because each request would add some overhead.
    • Generating a lot of upload requests very quickly was also an issue for some cloud providers that were limiting their users based on the number of requests per second, rather than the total bandwidth used. Using smaller chunks with a fast broadband connection and a lot of threads would generate a lot of requests per second.
    Now, with large chunk support (up to 100 MB per chunk in most cases), we don't have those disadvantages.
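
    To put rough numbers on that, here is a small Python sketch of the arithmetic. It assumes one upload request per whole chunk (as described above), treats 1 MB as 1024 * 1024 bytes, and ignores retries and per-request overhead. The function names and the 50 MB/s figure are made up for the example.

```python
MB = 1024 * 1024
GB = 1024 * MB


def upload_requests(total_bytes: int, chunk_bytes: int) -> int:
    """Whole-chunk upload requests needed to store total_bytes (rounded up)."""
    return -(-total_bytes // chunk_bytes)


def requests_per_second(upload_bytes_per_s: float, chunk_bytes: int) -> float:
    """Average request rate when uploads saturate the given bandwidth."""
    return upload_bytes_per_s / chunk_bytes


# Uploading 10 GB:
small = upload_requests(10 * GB, 1 * MB)    # 10240 requests with 1 MB chunks
large = upload_requests(10 * GB, 100 * MB)  # 103 requests with 100 MB chunks

# Sustaining 50 MB/s of upload:
rate_small = requests_per_second(50 * MB, 1 * MB)    # 50.0 requests per second
rate_large = requests_per_second(50 * MB, 100 * MB)  # 0.5 requests per second
```

    The request rate drops by the same factor that the chunk size grows, which is what matters for providers that throttle on calls per second rather than on bandwidth.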
     
    What was changed to allow for large chunks?
    • In order to support the new large chunks a provider has to support partial reads. That's because it's still very necessary to ensure that all cloud drive read I/O is serviced quickly.
    • Support for a new block based checksumming algorithm was introduced into the chunk I/O pipeline. With this new algorithm it's no longer necessary to read or write whole chunks in order to get checksumming support. This was crucial because it is very important to verify that your data in the cloud is not corrupted, and turning off checksumming wasn't a very good option.
    Are there any disadvantages to large chunks?
    • If you're concerned about using the least possible amount of bandwidth (as opposed to using fewer provider calls per second), it may be advantageous to use smaller chunks.
    • If you know for sure that you will be storing relatively small files (1-2 MB per file or less) and you will only be updating a few files at a time, there may be less overhead when you use smaller chunks.
    • For providers that don't support partial writes (most cloud providers), larger chunks are more optimal if most of your files are > 1 MB, or if you will be updating a lot of smaller files all at the same time.
    As for checksumming, the new algorithm completely supersedes the old one and is enabled by default on all new cloud drives. It really has no disadvantages over the old algorithm. Older cloud drives will continue to use the old checksumming algorithm, which really only has the disadvantage of not supporting large chunks.
    Which providers currently support large chunks?
    I don't want to post a list here because it would get obsolete as we add more providers. When you create a new cloud drive, look under Advanced Settings. If you see a storage chunk size > 1 MB available, then that provider supports large chunks.  
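
    To show why block based checksumming removes the need to read whole chunks, here is a minimal Python sketch. It is not the actual on-disk format: the block size, the use of SHA-256, and the seal / read_verified helpers are all assumptions made for the example. The point is only that each block carries its own checksum, so a partial read needs to fetch and verify just the blocks that overlap the requested range.

```python
import hashlib


def seal(chunk: bytes, block_size: int) -> list[tuple[bytes, bytes]]:
    """Split a chunk into blocks and pair each block with its own digest."""
    blocks = []
    for start in range(0, len(chunk), block_size):
        block = chunk[start:start + block_size]
        blocks.append((block, hashlib.sha256(block).digest()))
    return blocks


def read_verified(blocks, offset: int, length: int, block_size: int):
    """Return (data, blocks_touched) for chunk[offset:offset + length].

    Only the blocks overlapping the requested range are verified.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first = offset // block_size
    last = (offset + length - 1) // block_size
    buffer = bytearray()
    for block, digest in blocks[first:last + 1]:
        if hashlib.sha256(block).digest() != digest:
            raise ValueError("checksum mismatch")
        buffer += block
    start = offset - first * block_size
    return bytes(buffer[start:start + length]), last - first + 1


# A 32-byte "chunk" split into 8-byte blocks, purely for demonstration.
chunk = bytes(range(32))
blocks = seal(chunk, 8)
data, touched = read_verified(blocks, 10, 4, 8)  # touches a single block
```

    With a single whole-chunk checksum, that same 4-byte read would have required all 32 bytes in order to verify anything.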
    Going Chunk-less in the Future?
     
    I should mention that StableBit CloudDrive doesn't actually require chunks. If it at all makes sense for a particular provider, it really doesn't have to store its data in chunks. For example, it's entirely possible that StableBit CloudDrive will have a provider that stores its data in a VHDX file. Sure, why not. For that matter, StableBit CloudDrive providers don't even need to store their data in files at all. I can imagine writing providers that can store their data on an email server or an NNTP server (a bit of a stretch, yes, but theoretically possible).
     
    In fact, the only thing that StableBit CloudDrive really needs is some system that can save data and later retrieve it (in a timely manner). In that sense, StableBit CloudDrive can be a general purpose drive virtualization solution.
  23. Like
    Alex got a reaction from KiaraEvirm in SSD Optimizer Balancing Plugin   
    I've just finished coding a new balancing plugin for StableBit DrivePool, called the SSD Optimizer. This was actually a feature request, so here you go.
     
    I know that a lot of people use the Archive Optimizer plugin with SSD drives and I would like this plugin to replace the Archive Optimizer for that kind of balancing scenario.
     
    The idea behind the SSD Optimizer (as it was with the Archive Optimizer) is that if you have one or more SSDs in your pool, they can serve as "landing zones" for new files, and those files will later be automatically migrated to your slower spinning drives. Thus your SSDs would serve as a kind of super fast write buffer for the pool.
     
    The new functionality of the SSD Optimizer is that now it's able to fill your Archive disks one at a time, like the Ordered File Placement plugin, but with support for SSD "Feeder" disks.
     
    Check out the attached screenshot of what it looks like
     
    Notes: http://dl.covecube.com/DrivePoolBalancingPlugins/SsdOptimizer/Notes.txt
    Download: http://stablebit.com/DrivePool/Plugins
     
    Edit: Now up on stablebit.com, link updated.

  24. Like
    Alex got a reaction from KiaraEvirm in Amazon Cloud Drive - Why is it not supported?   
    Some people have mentioned that we're not transparent enough with regards to what's going on with Amazon Cloud Drive support. That was definitely not the intention, and if that's how you guys feel, then I'm sorry about that and I want to change that. We really have nothing to gain by keeping any of this secret.
     
    Here, you will find a timeline of exactly what's happening with our ongoing communications with the Amazon Cloud Drive team, and I will keep this thread up to date as things develop.
     
    Timeline:
    • On May 28th the first public BETA of StableBit CloudDrive was released without Amazon Cloud Drive production access enabled. At the time, I thought that the semi-automated whitelisting process that we went through was "Production Access". While this is similar to other providers, like Dropbox, it became apparent that for Amazon Cloud Drive, it's not. Upon closer reading of their documentation, it appears that the whitelisting process actually imposed "Developer Access" on us. To get upgraded to "Production Access" Amazon Cloud Drive requires direct email communications with the Amazon Cloud Drive team.
    • We submitted our application for approval for production access originally on Aug. 15th over email:
    • On Aug. 24th I wrote in requesting a status update, because no one had replied to me, so I had no idea whether the email was read or not.
    • On Aug. 27th I finally got an email back letting me know that our application was approved for production use. I was pleasantly surprised.
    • On Sept. 1st, after some testing, I wrote another email to Amazon letting them know that we are having issues with Amazon not respecting the Content-Type of uploads, and that we are also having issues with the new permission scopes that have been changed since we initially submitted our application for approval. No one answered this particular email until...
    • On Sept. 7th I received a panicked email from Amazon addressed to me (with CCs to other Amazon addresses) letting me know that Amazon is seeing unusual call patterns from one of our users.
    • On Sept. 11th I replied explaining that we do not keep track of what our customers are doing, and that our software can scale very well, as long as the server permits it and the network bandwidth is sufficient. Our software does respect 429 throttling responses from the server and it does perform exponential backoff, as is standard practice in such cases.
      Nevertheless, I offered to limit the number of threads that we use, or to apply any other limits that Amazon deems necessary on the client side. I highlighted on this page the key question that really needs to be answered. Also, please note that obviously Amazon knows who this user is, since they have to be their customer in order to be logged into Amazon Cloud Drive. Also note that instead of banning or throttling that particular customer, Amazon has chosen to block the entire user base of our application.
    • On Sept. 16th I received a response from Amazon.
    • On Sept. 21st I hadn't heard anything back yet, so I sent them this.
    • I waited until Oct. 29th and no one answered. At that point I informed them that we're going ahead with the next public BETA regardless.
    • Some time in the beginning of November an Amazon employee, on the Amazon forums, started claiming that we're not answering our emails. This was amidst their users asking why Amazon Cloud Drive is not supported with StableBit CloudDrive. So I did post a reply to that, listing a similar timeline.
    • On Nov. 11th I sent another email to them.
    • On Nov. 13th Amazon finally replied. I redacted the limits, as I don't know if they want those public.
    • On Nov. 19th I sent them another email asking to clarify what the limits mean exactly.
    • On Nov. 25th Amazon replied with some questions and concerns regarding the number of API calls per second:
    • On Dec. 2nd I replied with the answers and a new build that implemented large chunk support. This was a fairly complicated change to our code designed to minimize the number of calls per second for large file uploads.
      You can read my Nuts & Bolts post on large chunk support here: http://community.covecube.com/index.php?/topic/1622-large-chunks-and-the-io-manager/
    • On Dec. 10th 2015, Amazon replied:
      I have no idea what they mean regarding the encrypted data size. AES-CBC is a 1:1 encryption scheme. Some number of bytes go in and the same exact number of bytes come out, encrypted. We do have some minor overhead for the checksum / authentication signature at the end of every 1 MB unit of data, but that's at most 64 bytes per 1 MB when using HMAC-SHA512 (which is 0.006 % overhead). You can easily verify this by creating an encrypted cloud drive of some size, filling it up to capacity, and then checking how much data is used on the server.

    Here's the data for a 5 GB encrypted drive:



    Yes, it's 5 GBs.  
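
    For anyone who wants to check the overhead figure, here is the arithmetic as a short Python sketch. It assumes that 1 MB means 1024 * 1024 bytes and that exactly one 64-byte HMAC-SHA512 signature is stored per 1 MB unit. The function name is made up for the example.

```python
MB = 1024 * 1024
GB = 1024 * MB
SIGNATURE_BYTES = 64  # size of an HMAC-SHA512 output


def signature_overhead(payload_bytes: int, unit_bytes: int = MB) -> int:
    """Total signature bytes for a payload split into unit_bytes-sized units."""
    units = -(-payload_bytes // unit_bytes)  # round up to whole units
    return units * SIGNATURE_BYTES


overhead = signature_overhead(5 * GB)   # 327680 bytes, i.e. 320 KB
percent = 100 * overhead / (5 * GB)     # about 0.0061 %
```

    So a full 5 GB encrypted drive carries roughly 320 KB of signatures, which is far too little to explain any meaningful difference in the stored data size.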
     
    To clarify the error issue here (I'm not 100% certain about this), Amazon doesn't provide a good way to ID these files. We have to try to upload them again, and then grab the error ID to get the actual file ID so we can update the file. This is inefficient and would be solved with a more robust API that included search functionality, or a better file list call. So, this is basically "by design" and required currently.
    -Christopher

    Unfortunately, we haven't pursued this with Amazon recently. This is due to a number of big bugs that we have been following up on. However, these bugs have led to a lot of performance, stability and reliability fixes. And a lot of users have reported that these fixes have significantly improved the Amazon Cloud Drive provider. That is something that is great to hear, as it may help to get the provider to a stable/reliable state.

    That said, once we get to a more stable state (after the next public beta build (after 1.0.0.463) or a stable/RC release), we do plan on pursuing this again.

    But in the meanwhile, we have held off on this as we want to focus on the entire product rather than a single, problematic provider.
    -Christopher

    Amazon has "snuck in" additional guidelines that don't bode well for us.
    https://developer.amazon.com/public/apis/experience/cloud-drive/content/developer-guide
    Don’t build apps that encrypt customer data
    What does this mean for us? We have no idea right now.  Hopefully, this is a guideline and not a hard rule (other apps allow encryption, so that's hopeful, at least). 
     
    But if we don't get re-approved, we'll deal with that when the time comes (though, we will push hard to get approval).
     
    - Christopher (Jan. 15, 2017)
     
    If you haven't seen already, we've released a "gold" version of StableBit CloudDrive. Meaning that we have an official release! 
    Unfortunately, because of increasing issues with Amazon Cloud Drive that appear to be ENTIRELY server side (drive issues due to "429" errors, odd outages, etc.), and because we are STILL not approved for production status (despite sending off additional emails a month ago, requesting approval or at least an update), we have dropped support for Amazon Cloud Drive.

    This does not impact existing users, as you will still be able to mount and use your existing drives. However, we have blocked the ability to create new drives for Amazon Cloud Drive.   
     
    This was not a decision that we made lightly, and while we don't regret this decision, we are saddened by it. We would have loved to come to some sort of outcome that included keeping full support for Amazon Cloud Drive. 
    -Christopher (May 17, 2017)
  25. Like
    Alex got a reaction from Ginoliggime in SSD Optimizer Balancing Plugin   