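Yesterday one disk in a RAID 1 mirror attached to an Adaptec 5405 died and was replaced with a new one, but about 2 hours into the array rebuild the disk went offline. The hosting provider's staff updated the firmware on the controller and forced the array back online. The system boots normally, and arcconf getconfig 1 shows the following output: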
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
   Controller Status : Optimal
   Channel description : SAS/SATA
   Controller Model : Adaptec 5405
   Controller Serial Number : 1D091194BC6
   Physical Slot : 18
   Temperature : 85 C/ 185 F (Normal)
   Installed memory : 256 MB
   Copyback : Disabled
   Background consistency check : Disabled
   Automatic Failover : Enabled
   Global task priority : High
   Performance Mode : Default/Dynamic
   Stayawake period : Disabled
   Spinup limit internal drives : 0
   Spinup limit external drives : 0
   Defunct disk drive count : 0
   Logical devices/Failed/Degraded : 1/0/1
   SSDs assigned to MaxCache pool : 0
   Maximum SSDs allowed in MaxCache pool : 8
   MaxCache Read Cache Pool Size : 0.000 GB
   MaxCache flush and fetch rate : 0
   MaxCache Read, Write Balance Factor : 3,1
   NCQ status : Enabled
   Statistics data collection mode : Enabled
   --------------------------------------------------------
   Controller Version Information
   --------------------------------------------------------
   BIOS : 5.2-0 (18948)
   Firmware : 5.2-0 (18948)
   Driver : 1.1-7 (28000)
   Boot Flash : 5.2-0 (18948)
   --------------------------------------------------------
   Controller Battery Information
   --------------------------------------------------------
   Status : Not Installed

----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
   Logical device name :
   RAID level : 1
   Status of logical device : Degraded
   Size : 953334 MB
   Read-cache mode : Enabled
   MaxCache preferred read cache setting : Enabled
   Write-cache mode : Disabled (write-through)
   Write-cache setting : Disabled (write-through)
   Partitioned : Yes
   Protected by Hot-Spare : No
   Bootable : Yes
   Failed stripes : No
   Power settings : Disabled
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0 : Present (Controller:1,Connector:0,Device:1) 9VP3AGB1
   Segment 1 : Rebuilding (Controller:1,Connector:0,Device:0) Z1D49LKS

----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
   Device #0
      Device is a Hard drive
      State : Rebuilding
      Supported : Yes
      Transfer Speed : SATA 3.0 Gb/s
      Reported Channel,Device(T:L) : 0,0(0:0)
      Reported Location : Connector 0, Device 0
      Vendor :
      Model : ST1000DM003-9YN1
      Firmware : CC4H
      Serial number : Z1D49LKS
      Size : 953869 MB
      Write Cache : Enabled (write-back)
      FRU : None
      SMART : No
      SMART warnings : 0
      Power State : Full rpm
      Supported Power States : Full rpm,Powered off,Reduced rpm
      SSD : No
      MaxCache Capable : No
      MaxCache Assigned : No
      NCQ status : Enabled
   Device #1
      Device is a Hard drive
      State : Online
      Supported : Yes
      Transfer Speed : SATA 3.0 Gb/s
      Reported Channel,Device(T:L) : 0,1(1:0)
      Reported Location : Connector 0, Device 1
      Vendor :
      Model : ST31000528AS
      Firmware : CC38
      Serial number : 9VP3AGB1
      Size : 953869 MB
      Write Cache : Enabled (write-back)
      FRU : None
      SMART : No
      SMART warnings : 0
      Power State : Full rpm
      Supported Power States : Full rpm,Powered off
      SSD : No
      MaxCache Capable : No
      MaxCache Assigned : No
      NCQ status : Enabled
and aacraid-status shows:
-- Controller informations --
-- ID | Model | Status
c0 | Adaptec 5405 | Optimal

-- Arrays informations --
-- ID | Type | Size | Status | Task | Progress
c0u0 | RAID1 | 953G | Degraded | Rebuild | 44%

-- Disks informations
-- ID | Model | Status

There is at least one disk/array in a NOT OPTIMAL state.
smartctl for drive #1 shows that it is in a pre-fail state:
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       69962026
  3 Spin_Up_Time            0x0003   095   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       93
  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 4015
  7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       222073391
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       24485
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       72
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   006   006   000    Old_age   Always       -       94
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       1507556589923
189 High_Fly_Writes         0x003a   095   095   000    Old_age   Always       -       5
190 Airflow_Temperature_Cel 0x0022   059   051   045    Old_age   Always       -       41 (Min/Max 40/41)
194 Temperature_Celsius     0x0022   041   049   000    Old_age   Always       -       41 (0 19 0 0)
195 Hardware_ECC_Recovered  0x001a   045   028   000    Old_age   Always       -       69962026
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       202082506249469
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1020514404
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       590957029
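For anyone reading a table like this by hand, here is a minimal Python sketch of the usual check: an attribute counts as failing when its normalized VALUE has dropped to or below THRESH. The sample rows are copied from the output above; the helper names `parse_attributes` and `failing_attributes` are made up for illustration, and treating a threshold of 000 as "never fails" is an assumption of this sketch.

```python
import re

# A few rows copied from the smartctl attribute table above.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       69962026
  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 4015
187 Reported_Uncorrect      0x0032   006   006   000    Old_age   Always       -       94
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

# One attribute row: ID, name, flag, VALUE, WORST, THRESH, type, updated,
# WHEN_FAILED, then the leading integer of RAW_VALUE.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+\S+\s+(?P<when_failed>\S+)\s+(?P<raw>\d+)"
)


def parse_attributes(text):
    """Return one dict per attribute row found in the text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header lines and anything else are skipped
        rows.append({
            "id": int(match.group("id")),
            "name": match.group("name"),
            "value": int(match.group("value")),
            "worst": int(match.group("worst")),
            "thresh": int(match.group("thresh")),
            "type": match.group("type"),
            "when_failed": match.group("when_failed"),
            "raw": int(match.group("raw")),
        })
    return rows


def failing_attributes(rows):
    """Names of attributes whose normalized value is at or below a non-zero threshold."""
    return [r["name"] for r in rows
            if r["thresh"] > 0 and r["value"] <= r["thresh"]]


if __name__ == "__main__":
    print(failing_attributes(parse_attributes(SAMPLE)))
```

On these rows only Reallocated_Sector_Ct is flagged (value 002 against threshold 036), which matches the FAILING_NOW marker smartctl prints.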
The array failed to rebuild again and went offline.
What is the best way to bring this system back online now?
According to SMART, the drive may be failing. Try the rebuild one more time; if the reallocated sector count grows, that is definitely bad.
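A rough Python sketch of that "did it grow?" comparison, assuming you save the attribute table from `smartctl -A` before and after the rebuild attempt. The BEFORE row is taken from the question; the AFTER row and its raw value of 4020 are purely hypothetical, as are the helper names `raw_value` and `attribute_grew`.

```python
# Row for attribute 5 as posted in the question.
BEFORE = "  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 4015"
# Hypothetical later reading, invented only to exercise the comparison.
AFTER = "  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 4020"


def raw_value(smart_text, attribute_id):
    """Return the integer RAW_VALUE of the given attribute ID, or None if absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        # An attribute row has at least 10 whitespace-separated columns.
        if len(fields) >= 10 and fields[0] == str(attribute_id):
            return int(fields[9])
    return None


def attribute_grew(before_text, after_text, attribute_id=5):
    """True if the raw value of the attribute increased between two snapshots."""
    before = raw_value(before_text, attribute_id)
    after = raw_value(after_text, attribute_id)
    if before is None or after is None:
        raise ValueError("attribute %d not found in one of the snapshots" % attribute_id)
    return after > before


if __name__ == "__main__":
    print(attribute_grew(BEFORE, AFTER))
```

Comparing raw values rather than the normalized VALUE column is a deliberate choice here: the normalized value is already pinned near its floor (002), so only the raw count will show further reallocation.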
The ST1000DM003 is not a supported drive (see the compatibility report), and in my experience these drives also have some firmware/compatibility issues.
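More generally, the Adaptec 5 series has a lot of problems from a compatibility point of view. In some cases the workaround is to connect the drives directly, without a backplane; sometimes the problems stop once the drive is switched to 1.5 Gbps (drive jumper).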
Use drives from the compatibility list, and don't forget to upgrade the drive firmware.
P.S. You have write caching enabled on the drives but disabled on the controller.
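If you want to spot that mismatch without reading the whole dump, something like the following Python sketch works on the arcconf getconfig text. It assumes the output has one "Key : Value" field per line, as arcconf normally prints it; the sample is a trimmed excerpt of the fields posted in the question, and the function names are made up for illustration.

```python
# Trimmed excerpt of the arcconf getconfig fields from the question.
SAMPLE = """\
Logical device number 0
   Write-cache mode : Disabled (write-through)
   Device #0
      Write Cache : Enabled (write-back)
   Device #1
      Write Cache : Enabled (write-back)
"""


def write_cache_settings(text):
    """Return (logical device write-cache mode, {device label: drive write cache})."""
    controller_mode = None
    drive_modes = {}
    current_device = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Device #"):
            current_device = stripped  # remember which drive the next fields belong to
            continue
        key, sep, value = stripped.partition(" : ")
        if not sep:
            continue  # not a "Key : Value" line
        key, value = key.strip(), value.strip()
        if key == "Write-cache mode":
            controller_mode = value
        elif key == "Write Cache" and current_device is not None:
            drive_modes[current_device] = value
    return controller_mode, drive_modes


def drives_caching_behind_write_through(controller_mode, drive_modes):
    """Drives with their own write cache on while the array is write-through."""
    if controller_mode is None or not controller_mode.startswith("Disabled"):
        return []
    return [dev for dev, mode in drive_modes.items() if mode.startswith("Enabled")]


if __name__ == "__main__":
    mode, drives = write_cache_settings(SAMPLE)
    print(mode, drives_caching_behind_write_through(mode, drives))
```

With the values from the question, both Device #0 and Device #1 are reported, since the logical device is write-through while each drive has write-back caching on.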