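I recently installed three new disks in my QNAP TS-412 NAS.
These three new disks were supposed to be combined with the disk already present into a 4-disk RAID5 array, so I started the migration process.
After multiple tries (each taking about 24 hours), the migration appeared to work but left the NAS unresponsive.
At that point I reset the NAS. Everything went downhill from there:
I have managed to rebuild all the QNAP-internal RAID1 arrays with mdadm (these being /dev/md4, /dev/md13 and /dev/md9), leaving only the RAID5 array, /dev/md0:
I have tried this several times now, using these commands: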
mdadm -w /dev/md0
(Required because, after /dev/sda3 was removed from the array, the NAS mounted the array read-only, and an array cannot be modified in RO mode.)
mdadm /dev/md0 --re-add /dev/sda3
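After this the array starts rebuilding. The rebuild gets as far as 99.9%, but the system is extremely slow and/or unresponsive while it runs. (Logging in over SSH fails most of the time.)
Current state of things: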
[admin@nas01 ~]# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md4 : active raid1 sdd2[2](S) sdc2[1] sdb2[0]
      530048 blocks [2/2] [UU]

md0 : active raid5 sda3[4] sdd3[3] sdc3[2] sdb3[1]
      8786092608 blocks super 1.0 level 5, 64k chunk, algorithm 2 [4/3] [_UUU]
      [===================>.]  recovery = 99.9% (2928697160/2928697536) finish=0.0min speed=110K/sec

md13 : active raid1 sda4[0] sdb4[1] sdd4[3] sdc4[2]
      458880 blocks [4/4] [UUUU]
      bitmap: 0/57 pages [0KB], 4KB chunk

md9 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      530048 blocks [4/4] [UUUU]
      bitmap: 2/65 pages [8KB], 4KB chunk

unused devices: <none>
(It has now been stalled at 2928697160/2928697536 for hours.)
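To put the stall point in perspective, here is a small sketch that works out how much of the recovery is actually left. The two numbers are copied from the recovery line in /proc/mdstat, which counts in 1 KiB blocks; the function names are made up for illustration.

```shell
#!/bin/sh
# Numbers copied from the "recovery = ..." line of /proc/mdstat (1 KiB blocks).
# recovery_remaining and recovery_percent are hypothetical helper names.

recovery_remaining() {
    # $1 = blocks already rebuilt, $2 = total blocks per member
    echo $(( $2 - $1 ))
}

recovery_percent() {
    # awk is used because POSIX shell arithmetic is integer-only
    awk -v d="$1" -v t="$2" 'BEGIN { printf "%.5f\n", d * 100 / t }'
}

echo "KiB left to rebuild: $(recovery_remaining 2928697160 2928697536)"
echo "percent complete:    $(recovery_percent 2928697160 2928697536)"
```

By this arithmetic only 376 KiB of each member remain to be rebuilt, so the rebuild is stuck on a very small region rather than merely running slowly.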
[admin@nas01 ~]# mdadm -D /dev/md0
/dev/md0:
        Version : 01.00.03
  Creation Time : Thu Jan 10 23:35:00 2013
     Raid Level : raid5
     Array Size : 8786092608 (8379.07 GiB 8996.96 GB)
  Used Dev Size : 2928697536 (2793.02 GiB 2998.99 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Jan 14 09:54:51 2013
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 99% complete

           Name : 3
           UUID : 0c43bf7b:282339e8:6c730d6b:98bc3b95
         Events : 34111

    Number   Major   Minor   RaidDevice State
       4       8        3        0      spare rebuilding   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       8       51        3      active sync   /dev/sdd3
After inspecting /mnt/HDA_ROOT/.logs/kmsg, it turns out that the actual problem appears to be with /dev/sdb3 instead:
<6>[71052.730000] sd 3:0:0:0: [sdb] Unhandled sense code
<6>[71052.730000] sd 3:0:0:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
<6>[71052.730000] sd 3:0:0:0: [sdb] Sense Key : 0x3 [current] [descriptor]
<4>[71052.730000] Descriptor sense data with sense descriptors (in hex):
<6>[71052.730000]         72 03 00 00 00 00 00 0c 00 0a 80 00 00 00 00 01
<6>[71052.730000]         5d 3e d9 c8
<6>[71052.730000] sd 3:0:0:0: [sdb] ASC=0x0 ASCQ=0x0
<6>[71052.730000] sd 3:0:0:0: [sdb] CDB: cdb[0]=0x88: 88 00 00 00 00 01 5d 3e d9 c8 00 00 00 c0 00 00
<3>[71052.730000] end_request: I/O error, dev sdb, sector 5859367368
<4>[71052.730000] raid5_end_read_request: 27 callbacks suppressed
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246784 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246792 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246800 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246808 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246816 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246824 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246832 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246840 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246848 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5:md0: read error not correctable (sector 5857246856 on sdb3).
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
<4>[71052.730000] raid5: some error occurred in a active device:1 of md0.
The above sequence repeats at a steady rate for various (random) sectors in the 585724XXXX range.
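As a rough cross-check, the failing sectors can be placed relative to the size of the array member. The sketch below assumes 512-byte sectors and that the sector numbers in the raid5 messages are counted from the start of sdb3; the helper names are invented for illustration. The member size is the "Used Dev Size" from the mdadm -D output (in KiB).

```shell
#!/bin/sh
# Assumption: 512-byte logical sectors, sector numbers relative to sdb3.
SECTOR_BYTES=512

sector_to_kib() {
    # $1 = sector number as printed in the raid5 log lines
    echo $(( $1 * SECTOR_BYTES / 1024 ))
}

kib_from_end() {
    # $1 = sector number, $2 = member size in KiB ("Used Dev Size")
    echo $(( $2 - $(sector_to_kib "$1") ))
}

# First sector in the log excerpt, and Used Dev Size from mdadm -D.
echo "offset into sdb3:   $(sector_to_kib 5857246784) KiB"
echo "distance from end:  $(kib_from_end 5857246784 2928697536) KiB"
```

Under those assumptions the unreadable sectors sit about 72 MiB before the end of the member, which would be consistent with a rebuild that only gets into trouble at 99.9%.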
My questions are:
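[…] (the md0_raid5 and md0_resync processes are still running). […] the sdb3 errors. […] the troublesome sectors on sdb3, but keep the intact data?)

It probably stalls before completing because it needs the faulty disk to return some sort of status, and it is not getting one.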
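Regardless, all your data is (or should be) intact with only 3 of the 4 disks.
You say it ejects the faulty disk from the array, so it should still be running, albeit in degraded mode.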
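One way to confirm which arrays are running degraded is to look for an underscore in the member-status field of /proc/mdstat (for example [_UUU]). The following is only a sketch; degraded_arrays is a made-up helper name, and it relies on nothing beyond the mdstat layout shown in the question.

```shell
#!/bin/sh
# degraded_arrays (hypothetical helper): reads /proc/mdstat-style text on
# stdin and prints the name of every array whose status field, such as
# [_UUU], contains an underscore (a missing or rebuilding member).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}

# Only run against the live file where it exists.
if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
```

For the mdstat output posted in the question this would report md0 only, since md4, md13 and md9 all show every member as U.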
Can you mount it?
You can force the array to run by doing the following:
mdadm -D /dev/md0
mdadm --stop /dev/md0
The following steps are perfectly safe as long as:
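That last flag will prevent a rebuild and skip any integrity tests.
You should then be able to mount it and recover your data.
The obvious approach would be to replace the faulty disk, re-create the array, and replay the backup you made before the array expansion operation.
But since you do not seem to have that option, this would be the next best thing to do:
[…] does not matter to mdraid. If your sdb3 device is failing, you may need to use ddrescue rather than plain dd to copy the data. Also, take a look at this blog page for some hints on how to assess the situation when multiple devices in a RAID 5 array have failed.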
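For the ddrescue step, a common pattern is two passes: a fast one that skips over bad areas, then a slower one that retries them. The sketch below only prints the commands rather than running them. The image and map file paths are hypothetical and would have to point at a disk with enough free space, and the retry count of 3 is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical paths: adjust for the machine doing the rescue.
SRC=/dev/sdb3
IMG=/mnt/rescue/sdb3.img
MAP=/mnt/rescue/sdb3.map

# ddrescue_plan (hypothetical helper) prints the two passes, it does not run them.
ddrescue_plan() {
    # Pass 1: copy everything that reads cleanly, skipping the scraping phase (-n).
    echo "ddrescue -n $1 $2 $3"
    # Pass 2: revisit the bad areas with direct disc access (-d), 3 retry passes (-r3).
    echo "ddrescue -d -r3 $1 $2 $3"
}

ddrescue_plan "$SRC" "$IMG" "$MAP"
```

Because both passes share the same map file, the second pass picks up where the first left off instead of re-reading the whole partition.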