Question
johnnj
Hi,
For the past couple of months since migrating from Drive Bender to Drive Pool, my storage pool has been EXTREMELY stable. Until now.
On the morning of the 4th of July I discovered that my server had crashed, with nothing in the event logs except the complaint that the machine had shut down unexpectedly.
After bringing it back up, I noticed that all of the custom names I had put into Scanner for each drive had reverted to the drive hardware names.
I noticed that one drive had some unmovable sector/bad read SMART messages, so I forced its removal from the pool and removed it from the JBOD. Things seemed OK after that, but then the JBOD that disk had been in (a non-RAID 8 bay Mediasonic, connected at that time via USB3) kept dropping off. I took the drives out of it and put them into an old Sans Digital 5 bay RAID array (running in JBOD) that I attached to a Mediasonic eSATA card, and the other 3 drives went into a new 4 bay Mediasonic on the same card.
Since doing that it's been OK until, all of a sudden, it's not. It'll run happily for hours and then the pool will lock up. If I try to access the mount point, Explorer hangs and the Drive Pool UI won't open. I can't restart the DP service. I reboot and then it's fine.
Really the only thing I'm seeing in the event log is:
The IO operation at logical block address 0x21 for Disk 12 (PDO name: \Device\0000004c) was retried.
Disk 12 is the disk number of the pool mount. In Scanner the only errors I see are SMART warnings about exceeding the Load Cycle Count. However, when the drive names changed in Scanner I also lost the surface scan/fs scan results for every drive.
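In case it helps with tallying these warnings, here is a rough Python sketch that counts "was retried" messages per disk number. It assumes the event messages have already been exported to plain text, one message per line; that export format, the function name `count_retries`, and the sample list are my own assumptions for illustration, not anything Drive Pool or Scanner provides.

```python
import re
from collections import Counter

# Matches the wording of the disk retry warning quoted above and captures
# the logical block address, the disk number, and the PDO name.
RETRY_RE = re.compile(
    r"logical block address (0x[0-9a-fA-F]+) for Disk (\d+) "
    r"\(PDO name: ([^)]+)\) was retried"
)


def count_retries(lines):
    """Return a Counter mapping disk number -> count of retried IO messages."""
    counts = Counter()
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


# Hypothetical sample input: one exported event message per line.
sample = [
    r"The IO operation at logical block address 0x21 for Disk 12 "
    r"(PDO name: \Device\0000004c) was retried.",
    "Some unrelated event message.",
]

if __name__ == "__main__":
    for disk, count in sorted(count_retries(sample).items()):
        print(f"Disk {disk}: {count} retried IO operation(s)")
```

If the retries turn out to land only on the pool's disk number and never on a physical disk, that would at least suggest looking at the controller or enclosure rather than a single drive, though I wouldn't read more into it than that.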
I've added a new drive to the pool to replace the one with the sector errors, but duplication hasn't finished yet because the pool keeps hanging and I have to reboot to get it back.
When the pool is hung the system seems mostly normal, except that if I open Disk Management it hangs during drive discovery. I can still access any other non-pool drive.
In terms of server specs:
MB: MSI Krayt Z370
OS: Server 2016 Essentials
JBOD1: Mediasonic 8 bay via MB USB3
JBOD2: Mediasonic 4 bay via MB USB3
JBOD3: Sans Digital 5 bay TR5UT via eSATA to Mediasonic card
JBOD4: Mediasonic 4 bay via eSATA to same card as above
Any ideas? Please let me know what other info I can provide to help get to the bottom of this.
Thanks,
John