Sonicmojo

Posts posted by Sonicmojo

  1. Yesterday Scanner told me we had an impending drive failure, so I evacuated the drive and swapped a new one into the pool. But even after forcing a rebalance (using the Disk Space EQ plugin) - Drivepool has been sitting here for over an hour "building bucket lists" and not moving anything.

    What is the deal with this plugin - does it actually do anything, or is it conflicting with the other 5 plugins I have installed - which never seem to balance anything either?

    I remember a few years back - I would swap in a new drive, add it to the pool and almost immediately DrivePool would start balancing and moving files to ensure each drive was nicely filled - now it seems no matter what I throw at the problem - it does not want to do anything unless forced?

    Sonic.

  2. 13 hours ago, Christopher (Drashna) said:

    It sounds like these two issues are unrelated.  

    However, if the disk in question also seems to be causing weird behavior in general, it may be simplest to replace it. 

    As for the evacuation, I'm not sure.  It is possible that StableBit Scanner detected unreadable sectors, as that would trigger this behavior. 

     

    As for the "building bucket lists", this is a pre-execution task for balancing.  It can take a while, as it's basically determining where to move all of the files. 

     

    Also, do you have any balancer plugins installed on the system? 

    Chris

    The disk checks out perfectly in Scanner. The only issue presented was this massively excessive "head parking", which looks like it was being caused BY Scanner. All other SMART data is fine and I cannot see any evidence of anything else wrong with it.

    Regarding the "evacuation" by Drivepool/Scanner - I did have a number of the built in balancers activated for a long time but have since cut it back to just two:

    Disk Space EQ

    image.png.94ec96df271ae787b6225115dc58cdcb.png

     

    And StableBit Scanner - with a specific focus on ensuring my "unduplicated" files are being targeted - if there's a problem with a drive.

    image.png.5a77f29c5c469e914846ef703dac540f.png

     

    And here are my "balancing" settings:

    image.png.65319373d99ea390bde3bc2eb0ac96a3.png

    What is most concerning to me is that Scanner is NOT honoring the logic of moving "unduplicated" files away from "possibly" problematic drives. In this case - here is the drive that exhibited the excessive head parking (but nothing more) and now Scanner/Drivepool have filled this drive to the max with unduped files - leaving me in a nasty state if something does happen here:

    image.png.76426bfead4f87026b2aab6224f2aa53.png

    Ideally - I need to see this drive looking more like the rest of the drives with a balance leaning more to "duplicated" files rather than not duplicated.

    Any ideas with the balancers or balance settings that might make this even out a bit better? Also - this drive is less than a year old - and while I am certainly not saying that problems are not possible - Scanner is not indicating any obvious issues. I have also checked the drive extensively with Seagate tools and nothing negative is coming back from that toolset either. As of right now - this drive appears to be as normal as the other three - except for the ridiculous head parking count - which as of now looks to be driven by Scanner itself.

    I have another server - running a pair of smaller 2TB Seagate drives (from the same series as the server above)- which have been up and running for a year and 4 months - and each drive has a load cycle count of 109 and 110 respectively. The 4x4TB drives in the server above have been running for just over a year and each drive has load cycle counts in 490000 range. The one drive that Scanner started reporting SMART issues with had a load cycle count of over 600000 before I shut it off.

    Even Seagate themselves were shocked to see that load cycle number so high - indicating that all of these NAS drives should really never exceed 1000-2000 for the life of the drive. 
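To put those load cycle counts in perspective, here is a rough back-of-the-envelope sketch. It assumes the drives were powered on continuously, treats "just over a year" as exactly 1.0 years and "a year and 4 months" as 16/12 years, and uses the approximate counts quoted above. The function name `cycles_per_hour` is made up for this illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumes continuous uptime, no leap-day correction


def cycles_per_hour(load_cycles, years_powered_on):
    """Average head-park (load cycle) events per powered-on hour."""
    return load_cycles / (years_powered_on * HOURS_PER_YEAR)


# Approximate figures from the post; the durations are rounded assumptions.
file_server_rate = cycles_per_hour(490_000, 1.0)       # 4x4TB file server
other_server_rate = cycles_per_hour(110, 16 / 12)      # 2x2TB server

# Average gap between parks on the file server, in seconds.
seconds_between_parks = 3600 / file_server_rate

print(f"File server:  {file_server_rate:.1f} parks per hour")
print(f"Other server: {other_server_rate:.4f} parks per hour")
print(f"File server parks about every {seconds_between_parks:.0f} seconds")
```

Under those assumptions the file server drives park roughly 56 times an hour, or about once a minute, while the drives in the other server park less than once every 100 hours.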

    So I am not sure exactly what Scanner is doing to one set of my drives and not the other - but it's a problem just the same. As I mentioned - my File Server (4x4TB) does not see a lot of action daily and anything it does see should be handled with ease by the OS. It is unfortunate that there is some sort of issue with the Advanced Power Management settings in Scanner for the File Server, but even with APM enabled on the other server - there appear to be no issues with the Load Cycle count or anything else.

    Appreciate any additional tips.

    Cheers

    Sonic.

     

     

     

     

  3. 39 minutes ago, bob19 said:

    I was wondering the same thing. It looks like development is still running, you can download new Betas.

    http://dl.covecube.com/DrivePoolWindows/beta/download/

    http://dl.covecube.com/ScannerWindows/beta/download/

     

    That may be true - but it looks like the beta action stopped around Mar 12-15. Would not surprise me if this is COVID related but it would be nice to at least get a one liner status update.

    Sonic

  4. Just wondering what is happening in here lately?

    I have asked a couple of questions in the last week to 10 days. Usually I get an answer after a day or two, but not this time.

    Is Covecube support no longer coming round or is the actual company not active?

    I realize the world is in a crap state right now but it would be nice to know if anyone is home?

    Sonic.

  5. Updated Data

    I did some reading on the forum here and found this tidbit from a few years back:

    In theory, the "Disk Control" option in StableBit Scanner is capable of doing this as well, and persistent after reboot. 

    To do so:

    1. Right click on the disk in question
    2. Select "Disk Control"
    3. Uncheck "Advanced Power Management" 
    4. Hit "Set". 

    I have enabled this on all my drives to see if the "parking" will settle down

    But then I noticed that DrivePool was acting strange last night as well. It had begun an "Evacuation" of all the files from this specific drive - acting as if there was some sort of data emergency. When I checked into the server around dinner time last night - DP was displaying a message saying "63.5% Building data buckets" (or something similar) - and it seemed like it was hanging there forever.

    So I hit Reset in Drivepool and let it run the inspection routines etc to confirm the pools and ensure the data was sound (it was). 

    But when I went into the server again this morning - I saw that my Balancers had been altered. The "disk space" balancer (at the top of the stack) had been disabled - leaving the Stablebit Scanner balancer next in line. I believe its parameters took priority, which somehow led to the emergency evacuation action. I re-enabled the Disk Space Balancer and have now ended up with this:

    Any ideas what is going on?

    Sonic

     

     

    Disk Balance Off.png

  6. For the first time ever - I received this message from Scanner on a SEAGATE Ironwolf NAS drive that is about a year old:

    ST4000VN008-2DR166 - 1 warnings

    • The head of this hard drive has parked 600,001 times. After 600,000 parking cycles the drive may be in danger of developing problems. Drives normally park their head when they are powered down and activate their head when they are powered back up. Excessive head parking can be caused by overzealous power management settings either in the Operating System or in the hard drive's firmware.

    This drive (and 3 other identical ones) is part of my Windows Server 2016 file server, which really sees very little action on a daily basis. What is the story with this message and why would this drive be "parking" itself so frequently - considering the server is never powered down (only rebooted once per month following standard maintenance)?

    I have not had a chance to look into the other 3 drives yet - but this is concerning just the same. Is there a possibility that DrivePool or Scanner are the cause of this message?

    Appreciate any info on this.

    Cheers

    Sonic.

  7. 1 hour ago, Christopher (Drashna) said:

    the "Duplicate data later" option will leave data on the drive. It will leave any duplicated data on that drive, specifically.  you can then wipe the drive, or even copy it back into the pool, if you are so inclined (but it should run a duplication pass and reduplicate data as needed).

     

    This makes complete sense - but how come the actual "Remove drive" process never completed?

    Is it not supposed to go to 100% (while leaving the dupe files on the drive) and conclude in a correct fashion?

    In my case - DP simply stopped doing anything at 94% and sat there for hours and hours and hours. This feels very uncomfortable on many levels - yet I had no choice but to kill the process.

    Luckily - there was no residual damage from my hard stop, as all that was left on the drive were dupes. But this experience does not make me feel very trusting toward this app if a process cannot wrap up gracefully and correctly - especially when user data is being manipulated.

    S

  8. 2 hours ago, Christopher (Drashna) said:

    Could you enable logging, and then remove the drive? 
    http://wiki.covecube.com/StableBit_DrivePool_2.x_Log_Collection

    Also, as for "stalled out", any unduplicated data will cause it to take much longer. And it still has to check the data.

     

    Well - the drive is gone now so enabling logging will not help.

    And as far as unduplicated data - I used the same process for 4 consecutive drive removals. The first three went like clockwork...the "remove" drive process (whatever that entails) went to 100% and then DP did a consistency check and duplication AFTER the drive was removed completely.

    This last drive "removal" did not go to 100% - it simply sat at 94% for like 18 hours. For me there is long and then there is REALLY long. So I eventually got fed up - cancelled the removal and pulled the drive.

    My concern is that this "remove" process did not go to 100%. There was zero file activity on the pools for hours and hours - so if DP was doing something - it should have been communicated.

    Oddly - the only files left on this drive (after I killed it at 94% after 18 hours) were just the duplicates. So I do not understand what the right conclusion to this process should be. I am assuming that if I choose to "process duplicates later" the removal process should be successful and go to 100%. Yes? No? In this case it seems like it was set up to sit at 94% forever.

    Something was not right with this removal - the seemingly non-existent communication of the software (telling me exactly nothing for 18 straight hours) - should be looked at.

    S

     

  9. Update.

    After I returned home from work - the drive targeted for removal STILL said 94.1% so I killed the process and restarted the machine. Once it came back up - I assigned this drive its own drive letter and examined the files in the Poolpart folder - seems all the files in there were the files I had set to duplicate.

    So while that makes a bit of sense (I did tell DP to duplicate later) - I still do not understand why the removal stalled out and would not complete. I could have left this drive in this state for days with no change. Seems like a bug or something here.

    After the drive was forcefully removed - DP went ahead and did a consistency check and then started a duplication run that took another couple of hours. Looks like everything is good now.

    All my drives have been replaced so I will not need to do this exercise again for a while but I would still like to know why this drive was not removed cleanly.

    S

     

  10. Over the past few months I have been slowly upgrading the drives in my Pool. It's a 16TB pool with 4x4TB drives.

    Last night around 7:00pm - I added the last new drive to the pool and targeted the last drive I wanted to replace. I checked "Duplicate later" and the option to close all open files, and let the process begin - as per usual - things moved along nicely and around 9:30 - I closed my RDP session to the server and let DP do its thing.

    When I checked back in this morning - I see the following on screen (see attached):

    1. I see a ton of open files

    2. The gauge has been at 94.0% like for hours

    3. There is no file activity that I can see.

    The only thing that DP is telling me is that - if I hover my cursor over the "Removing drive ...(94.0%)" indicator near the bottom of the screen (where the progress bar is pulsating) - I see a tooltip that says "removing pool part".

    What the heck is going on and why is this taking hours upon hours? When I replaced my third drive a few days ago - the removal process completed cleanly, the drive I wanted to remove was 100% empty in about 5 hours and the UI was silent.

    The drive I am removing here shows 949GB of "other" on it and this is causing me concern. This drive should have nothing on it if the removal process is working correctly.

    Would love to know what is going on here and if I should just let it go, stop it or what? I do hear drive activity and drive LED's are flashing - but what is it doing?

    S

    Additional: Here are the last 20 lines of the Service log - anything here that looks suspect?

     

    0:00:33.1: Information: 0 : [FsControl] Set overall pool mode: PoolModeNormal (lastKey=CoveFsPool, pool=c2cea73a-6516-4ed7-906e-864291ed7d8f)
    0:03:22.7: Information: 0 : [Disks] Got MountPoint_Change (volume ID: dd88cf22-51f8-4ba9-a676-d0bb8b20430b)...
    0:03:23.7: Information: 0 : [Disks] Updating disks / volumes...
    0:03:30.1: Information: 0 : [Disks] Got MountPoint_Change (volume ID: dd88cf22-51f8-4ba9-a676-d0bb8b20430b)...
    0:03:31.1: Information: 0 : [Disks] Updating disks / volumes...
    0:03:43.1: Information: 0 : [Disks] Updating disks / volumes...
    0:04:20.6: Information: 0 : [Disks] Got Pack_Arrive (pack ID: 0db29038-5dbc-4cdd-a784-5748d8b2f063)...
    0:04:22.2: Information: 0 : [Disks] Updating disks / volumes...
    0:04:50.5: Information: 0 : [Disks] Got Volume_Arrive (volume ID: f28ef3b5-4c91-44cb-9f35-7c76d4f97ef5, plex ID: 00000000-0000-0000-0000-000000000000, %: 0)...
    0:04:51.6: Information: 0 : [Disks] Updating disks / volumes...
    0:04:57.0: Information: 0 : [Disks] Got MountPoint_Change (volume ID: f28ef3b5-4c91-44cb-9f35-7c76d4f97ef5)...
    0:04:58.0: Information: 0 : [Disks] Updating disks / volumes...
    0:05:16.8: Information: 0 : [PoolPartUpdates] Found new pool part C6C614F2-7102-411D-B9C9-4D2F05B68ABB (isCloudDrive=False, isOtherPool=False)
    0:05:27.3: Information: 0 : [FsControl] Set overall pool mode: PoolModeOverrideAllowCreateDirectories (lastKey=DrivePoolService.Pool.Tasks.RemoveDriveFromPool, pool=c2cea73a-6516-4ed7-906e-864291ed7d8f)
    0:05:27.3: Information: 0 : [FsControl] Set overall pool mode: PoolModeNoReportIncomplete, PoolModeOverrideAllowCreateDirectories (lastKey=DrivePoolService.Pool.Tasks.RemoveDriveFromPool, pool=c2cea73a-6516-4ed7-906e-864291ed7d8f)
    0:05:27.3: Information: 0 : [FsControl] Set overall pool mode: PoolModeNoReportIncomplete, PoolModeOverrideAllowCreateDirectories, PoolModeNoMeasure (lastKey=DrivePoolService.Pool.Tasks.RemoveDriveFromPool, pool=c2cea73a-6516-4ed7-906e-864291ed7d8f)
    0:05:27.3: Information: 0 : [FsControl] Set overall pool mode: PoolModeNoReportIncomplete, PoolModeOverrideAllowCreateDirectories, PoolModeNoMeasure, PoolModeNoReparse (lastKey=DrivePoolService.Pool.Tasks.RemoveDriveFromPool, pool=c2cea73a-6516-4ed7-906e-864291ed7d8f)
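For anyone wanting to sift through a longer Service log, the lines above follow a regular enough shape to parse. This is only a sketch: the pattern is inferred from the 17 lines pasted here, not from any documented log format, and the leading timestamp is assumed to be elapsed time in h:mm:ss.f. The names `LINE_RE` and `parse_line` are made up for the example.

```python
import re
from collections import Counter

# Pattern inferred from the pasted lines only; not an official log specification.
LINE_RE = re.compile(
    r"^(?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}(?:\.\d+)?): "
    r"(?P<level>\w+): (?P<code>\d+) : \[(?P<source>\w+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match the pattern."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    seconds = int(match["h"]) * 3600 + int(match["m"]) * 60 + float(match["s"])
    return {
        "seconds": seconds,  # assumed to be elapsed time, not wall-clock time
        "level": match["level"],
        "source": match["source"],
        "message": match["message"],
    }


# Three lines copied from the log above.
SAMPLE = """\
0:00:33.1: Information: 0 : [FsControl] Set overall pool mode: PoolModeNormal (lastKey=CoveFsPool, pool=c2cea73a-6516-4ed7-906e-864291ed7d8f)
0:03:23.7: Information: 0 : [Disks] Updating disks / volumes...
0:05:16.8: Information: 0 : [PoolPartUpdates] Found new pool part C6C614F2-7102-411D-B9C9-4D2F05B68ABB (isCloudDrive=False, isOtherPool=False)
"""

entries = [entry for entry in map(parse_line, SAMPLE.splitlines()) if entry]
by_source = Counter(entry["source"] for entry in entries)

for entry in entries:
    print(f'{entry["seconds"]:8.1f}s  {entry["source"]:<16} {entry["message"][:50]}')
print(dict(by_source))
```

Grouping by the bracketed source and looking at the last timestamp makes it easier to see that nothing was logged after the `RemoveDriveFromPool` mode changes.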

     

     

     

    Drivepool Removal.jpg

  11. All,

    After an instant drive failure on a secondary server last week - I am now being proactive with my primary server and replacing old disks in biweekly stages.

    This server has a 4x4TB lineup and I have identified disk 3 as being the first one to be removed. The new disk arrived yesterday but - I do not have all folders in this pool duplicated. What are the correct steps to get this drive out of the pool, out of the box and then swap in its replacement without jeopardizing any files that could be on drive 3 as single files?

    I am assuming the Remove function in the UI should empty the drive (?) but I want to be sure before I start making any moves.

    And I assume Balancing would be my next step after the new drive is in place?
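One way to see which files on drive 3 exist as single copies is to compare the contents of each drive's pool folder. The sketch below rests on an assumption that is not confirmed anywhere in this thread: that a duplicated file sits under the same relative path inside the pool folder on more than one drive. The function names and the folder paths in the usage comment are hypothetical.

```python
import os


def relative_files(root_dir):
    """Collect every file under root_dir as a path relative to root_dir."""
    found = set()
    for folder, _subdirs, files in os.walk(root_dir):
        for name in files:
            found.add(os.path.relpath(os.path.join(folder, name), root_dir))
    return found


def files_only_on(target_root, other_roots):
    """Return files under target_root that appear under none of other_roots.

    Assumption: a duplicated file has the same relative path on another drive.
    """
    elsewhere = set()
    for root in other_roots:
        elsewhere |= relative_files(root)
    return sorted(relative_files(target_root) - elsewhere)


# Hypothetical usage; substitute the real pool folder on each drive:
# singles = files_only_on(r"F:\PoolPart.xxxx", [r"D:\PoolPart.xxxx", r"E:\PoolPart.xxxx"])
# print(f"{len(singles)} files exist only on the drive being removed")
```

This only reads directory listings, so it should be safe to run before touching the Remove function, but it compares names only and does not verify file contents.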

    Appreciate any tips from the field.

    Cheers

    Sonic.

     

  12. Update - I was able to get the new drives in place and DP did exactly as described - it moved everything over to the new drives without a hitch.

    I pulled all the old drives and will move forward with the two new 2TB drives. Duplicates are now turned on for everything. Taking no chances.

    However - we really need the ability to at least create a text file or something of the files on each drive. I never really gave it much thought until this situation happened - but a guy is truly flailing in the dark here without any ability to see what's on each drive.

    I know I lost some files and have replaced most everything that I believe was there - but it would be nice to have some basic ability to create a file list.
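In the meantime, a basic per-drive file list can be produced outside the app. This is a minimal sketch, assuming each pooled drive keeps its share of the pool in a folder at the drive root (the "Poolpart" folder mentioned in another post) and that the script is run while the drive is still readable. The function name and the paths in the usage comment are hypothetical.

```python
import os


def write_file_list(root_dir, list_path):
    """Write one 'size<TAB>relative path' line per file under root_dir.

    Returns the number of files listed. Keep list_path outside root_dir so the
    listing does not end up including itself.
    """
    count = 0
    with open(list_path, "w", encoding="utf-8") as out:
        for folder, _subdirs, files in os.walk(root_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                rel_path = os.path.relpath(full_path, root_dir)
                out.write(f"{os.path.getsize(full_path)}\t{rel_path}\n")
                count += 1
    return count


# Hypothetical usage, run once per drive on a schedule:
# write_file_list(r"E:\PoolPart.xxxx", r"C:\PoolLists\drive-E.txt")
```

Running this on a schedule and keeping the output on the system drive would leave a record of what each disk held, even after a disk drops out of the pool.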

    Would also like to shout out that Scanner did not recognize or alert me to anything regarding my drive failure either. While I was not expecting a 24 hour countdown to the death of a drive - I was hoping Scanner would at least tell me that a drive that was there a minute ago - is now not there. I believe I have everything set that I can for alerts?

    What am I missing or does Scanner not offer this most basic of alerts?

    Sonic.

     

  13. 3 hours ago, Umfriend said:

    I assume you have no duplication. I would, provided I have enough ports:

    1. Physically remove the faulty HDD (you have done this already)
    2. Remove it through the DP UI -> This should stop DP complaining about a missing disk and unlock the Pool.
    3. Add the two new HDD to the Pool
    4. Remove the two old HDDs from the Pool through the UI -> This will move all files to the new HDDs
    5. Remove the two old HDDs physically from the Server
    6. Then see what you can recover from the faulty HDD and copy that back to the Pool.

    I would consider to keep the two performing old HDDs in the Pool and use x2 duplication.

    Update: 

    I pulled the faulty drive and I believe it is completely screwed. I messed with everything I could think of last night including opening the drive case - the heads are in the "park" position and while the drive will power up and the platters spin - the heads simply rock back and forth to the disc edge exactly 11 times (beeping each time) and then the drive powers down.

    I do not think there is anything else I can do outside of sending it to the pros - but given what is on this server - that is not really cost effective.

    Next question - given the locked pool state - is there any way to get a file listing of what is on this missing drive - before I start adding new discs etc? 

    It would be ideal to know what I am chasing here before I get too far into it. The pool on this server was primarily just housing WSUS, WDS and a few other minor things that I should be able to repopulate without too much trouble.

    Cheers!

    Sonic.

     

     

     

     

  14. I awoke today to a weird beeping sound coming from one of my servers. I also heard a disk thrashing away in the case, so I immediately remoted into the box and checked Drivepool. Sure enough - a disk is reported missing. Disk Management confirms something is up as well. 

    This server has 3x2TB using Drivepool. I have identified the disk with the problem and judging by the sound the disk is making - it's not long for this world so I have to move fast.

    I have new disks on hand but want to make sure I deal with the data in the most efficient manner and hopefully get it back.

    First thing I did was power down the server. My plan is to remove the troubled drive and plug it into my workstation and see if I can at least work with it and hopefully copy data off it.

    Assuming I can get the data off - what are the steps to get this data back into the pool AND reduce the pool to just 2x2TB? If I can do that successfully - I want to move all data off the two remaining drives, delete the pool, ditch these old drives and start a new pool with the new 2x2TB discs.

    But I want to proceed carefully and not compromise any data if possible. I know that Drivepool locked the pool as soon as it saw the disc missing.

    What are my next steps to handle this properly and hopefully be up and running with the new discs?

    Sonic.

     

     

  15. 1 hour ago, hakank22 said:

    Reading this makes me assume scanner and pool licenses are compatible, mine are currently on WHS 2011, with 2016 essentials, right?

    My upgrade worked perfectly - with the disclaimer that "upgrade" in this case meant a complete clean install of Server 2016 to a new system drive. I did however have a critical need to maintain my existing data drives.

    Both Drivepool and Scanner installed flawlessly - and especially for Drivepool - it picked up the pool (once I reattached my existing drives) like nothing had happened - saving me days of time backing up and restoring up to 10TB of data.

    Regarding Essentials 2016 - I would assume you should be good as well.

    Sonic.

  16. Just now, Jaga said:

    How's your controller flash going Sonic?

    For the past few days I have been dealing with some frustrating SMBv1 issues with one of my key apps (Desktop Central) which is a critical part of the existing server (and the new one). I had to get some very odd issues straightened around first as I am planning on moving this DC install from this old box to the new one (which is really the same box). 

    Now that I have solved the issue - I can get back to the hardware fun. But as noted - I am treading very slowly here to ensure I do not compromise my existing pool or render this box "unstartable" by switching the SATA modes etc. I need to bench this box first, remove the existing pool drives and then pop in a spare to get this SAS controller going. That will most likely happen on Saturday.

    I will report back with either success or failure!

    Sonic.

     

  17. 2 hours ago, Umfriend said:

    So I just got my SAS controller (Dell H310 flashed to IT mode). It works but it does not pass SMART to scanner. I have looked but could not find it. Isn't there a setting I should change in Scanner?

    Edit: Never mind, found it. Scanner -> Settings -> Advanced Settings & Troubleshooting -> Configuration Properties -> DirectIO - check Unsafe (which does not sound scary at all!)

    Thanks for this - I am certain I would have been right back here asking about this!

    Cheers

    Sonic.

  18. 1 minute ago, Jaga said:

    This post goes over how to flash that controller in pretty good detail, it may be helpful.  The person who did it used the UEFI shell to accomplish the flash (I just did the same with my 9201-16e - it works well).

    The P20 firmware is stable - the CRC issues they had with the early 20.00.00.00 versions have been fixed, so go ahead and use it if you like.  20.00.07.00 is the latest.  Be sure to use the files in the IT/UEFI folder from the package when inside the UEFI shell.

    I do -not- think you'll have to worry at all about changing to RAID mode to flash from the UEFI shell, so just skip that entirely.  After flashing you should be able to use CTRL-C to get into the BIOS and check/configure it if necessary.  Definitely disconnect the pool drives for the whole process, just to be sure.

    I read that post - but since I cannot see the actual controller before trying the update (CTRL-C does nothing) I cannot get the last 9 digits of the SAS address. I may have to open the box and see if there is a sticker somewhere with the address.

    Sonic

  19. Jaga,

    Much appreciated! Sounds like it should be relatively painless. And thanks for the "configuration" tip - I knew I forgot something.

    I tried to take a look at the SAS controller about an hour ago - and now I have a different problem. I cannot even get into the SAS ROM/utility since it does not display when I boot the box. I found out I need to switch the SATA config in the BIOS to "RAID" instead of AHCI - which now makes me super nervous that I will not be able to get back into the server with the drives connected via normal SATA.

    Is it even possible to somehow switch into "RAID" mode just long enough to examine/flash the firmware and then switch back to AHCI so I can complete a few tasks on the existing server before I ditch it?

    Thoughts?

    Sonic.

     

     

  20. All,

    This week I am finally considering activating the LSI 2308 SAS controller that I have on both of my file servers and wanted to understand how to get this working with Drivepool (and scanner).

    Ever since I have owned my two Supermicro X10SL7-F server boards - I have only used the standard SATA ports for drive usage. Seems to have worked just fine but I would like to move to the SAS controller on these boards for (hopefully) more stability and (hopefully) even a performance boost if possible.

    My understanding is that this controller should ideally be in "IT Mode" in order for it to allow the connected disks to act like JBOD - so I plan to flash the firmware on it later today. Assuming that works fine - I would then install the latest LSI driver during my server install and then move to reconnecting my three drives for Drivepool.

    Can anyone walk me through the basics of getting Drivepool to play nice with the LSI? My big concern lies with my existing pool. In a perfect world - I want to simply disconnect the drives - get the LSI up and running - reconnect the drives to the LSI, install Drivepool on the server and have it "see" the old pool and have it work as if nothing had changed.

    That would be far better than backing up all the data to another server, building a whole new pool, transferring the data back in etc and waiting who knows how long for the pool to be back to normal.

    Appreciate any info from the field on how to do this right.

    Cheers

    Sonic 

  21. Chris,

    Bumping up this old thread as I have finally reached the point of upgrading my server OS. :)

    However - this week I am also considering activating the LSI 2308 SAS controller that I have on both of my file servers and wanted to understand how to get this working with Drivepool (and scanner).

    Ever since I have owned my two Supermicro X10SL7-F server boards - I have only used the standard SATA ports for drive usage. Seems to have worked just fine but I would like to move to the SAS controller on these boards for (hopefully) more stability and (hopefully) even a performance boost if possible.

    My understanding is that this controller should ideally be in "IT Mode" in order for it to allow the connected disks to act like JBOD - so I plan to flash the firmware on it later today. Assuming that works fine - I would then install the latest LSI driver during my server install and then move to reconnecting my three drives for Drivepool.

    Before I tackle anything however - I need to understand the risks of any possible data loss etc with respect to my existing "pool". As you mention above - Drivepool should "recreate" the pool automatically once I reconnect the drives but what if it does not work? My fear is that any monkeying around with the drives may suddenly corrupt the pool and leave me hanging with major problems.

    Are there any real dangers I need to be aware of when switching from the standard ports over to the LSI SAS? 

    I see my new order of things going something like this:

    1. Deactivate the StableBit DrivePool license on existing server build
    2. Shut down the system
    3. Disconnect data drives (but leave them in the box)
    4. Upgrade firmware on LSI and confirm it is in IT mode
    5. Power up
    6. Install Server 2016 (wiping C during install)
    7. Ensure that LSI controller driver is working in Server 2016
    8. Do basic config of new server install and shut down
    9. Reconnect existing data drives
    10. Power up and ensure data drives show in Disk Management
    11. Create new mountpoints for data drives
    12. Install/Activate the StableBit DrivePool (And Scanner) software for new build
    13. Let it "automagically" recreate the pool on this new build
    14. Continue with rest of server config.

    Thoughts?

    Cheers!

    Sonic

     
