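A server was recently shipped from location A to location B, a long journey that took six months. Nothing was labeled before shipping, and that caused problems. Yes, I know: someone else did this, but I am the one paying the price.
I have to rescue the data and I need help!
The system used to boot fine, but now it will not boot at all (not even to grub rescue; the BIOS finds nothing, and I have tried selecting each individual member of the array).
So I booted a Debian 7 ISO from a USB stick and entered rescue mode. So far, with little success.
The first thing I noticed is that the array is degraded and rebuilding onto a spare. This appears to be because one of the original array members is missing.
First, the details of the array in its current state, after booting into rescue mode: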
    # mdadm --detail /dev/md127
    /dev/md127:
            Version : 1.2
      Creation Time : Wed Nov 7 16:06:02 2012
         Raid Level : raid6
         Array Size : 26370335232 (25148.71 GiB 27003.22 GB)
      Used Dev Size : 2930037248 (2794.30 GiB 3000.36 GB)
       Raid Devices : 11
      Total Devices : 11
        Persistence : Superblock is persistent
        Update Time : Sat Sep 13 01:55:51 2014
              State : clean, degraded, recovering
     Active Devices : 10
    Working Devices : 11
     Failed Devices : 0
      Spare Devices : 1
             Layout : left-symmetric
         Chunk Size : 512K
     Rebuild Status : 45% complete
               Name : media:0
               UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
             Events : 2216394
        Number   Major   Minor   RaidDevice   State
           0       8       82        0        active sync        /dev/sdf2
           1       8       97        1        active sync        /dev/sdg1
           2       8      129        2        active sync        /dev/sdi1
           3       8       33        3        active sync        /dev/sdc1
           4       8      161        4        active sync        /dev/sdk1
          12       8      192        5        spare rebuilding   /dev/sdm
           6       8      145        6        active sync        /dev/sdj1
           7       8       49        7        active sync        /dev/sdd1
           8       8       65        8        active sync        /dev/sde1
          10       8      224        9        active sync        /dev/sdo
          11       8      208       10        active sync        /dev/sdn
Now let's try installing grub to /dev/md127.
    # grub-install /dev/md127
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    Segmentation fault
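Oops, that's not good. What is the deal with these "found two disks with the index 9" and "superfluous RAID member" errors? As it turns out, the disks got mixed up, and some extra disks were installed in the system because it was unclear whether they were RAID members.
What happens if we try installing to a single disk? It looks like /dev/sdc is the first member (the lowest letter):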
    # grub-install /dev/sdc
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    /usr/sbin/grub-setup: warn: This GPT partition label has no BIOS Boot Partition; embedding won't be possible!.
    /usr/sbin/grub-setup: error: embedding is not possible, but this is required for cross-disk install.
OK, now I'm starting to get nervous. I also tried the other member disks, such as the last one, sdm:
    # grub-install /dev/sdm
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    error: found two disks with the index 9 for RAID md/0.
    error: found two disks with the index 9 for RAID md/0.
    error: superfluous RAID member (10 found).
    error: superfluous RAID member (10 found).
    /usr/sbin/grub-setup: error: unable to identify a filesystem in hd12; safety check can't be performed.
Now we get a different error: unable to identify a filesystem. By the way, the filesystem on this mdadm array is XFS, and it is working fine (thankfully).
    # mdadm --examine /dev/sd?
    /dev/sda:
       MBR Magic : aa55
    Partition[0] : 15633380 sectors at 13340 (type 0c)
    /dev/sdb:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
               Name : media:0
      Creation Time : Wed Nov 7 16:06:02 2012
         Raid Level : raid6
       Raid Devices : 10
     Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
         Array Size : 23440297984 (22354.41 GiB 24002.87 GB)
      Used Dev Size : 5860074496 (2794.30 GiB 3000.36 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 8042b1e3:d9e305aa:f53be8b4:b74cc247
        Update Time : Mon Nov 18 18:05:25 2013
           Checksum : 5762ae4a - correct
             Events : 2197822
             Layout : left-symmetric
         Chunk Size : 512K
        Device Role : Active device 9
        Array State : AAAAAAAAAA ('A' == active, '.' == missing)
    /dev/sdc:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdd:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sde:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdf:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdg:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdh:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdi:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdj:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    /dev/sdk:
       MBR Magic : aa55
    Partition[0] : 4294967295 sectors at 1 (type ee)
    mdadm: No md superblock detected on /dev/sdl.
    /dev/sdm:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x2
         Array UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
               Name : media:0
      Creation Time : Wed Nov 7 16:06:02 2012
         Raid Level : raid6
       Raid Devices : 11
     Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
         Array Size : 26370335232 (25148.71 GiB 27003.22 GB)
      Used Dev Size : 5860074496 (2794.30 GiB 3000.36 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
    Recovery Offset : 2563782648 sectors
              State : clean
        Device UUID : c436476d:6e6dbc43:de4e9c83:d697fbf7
        Update Time : Sat Sep 13 02:03:42 2014
           Checksum : db87180b - correct
             Events : 2216444
             Layout : left-symmetric
         Chunk Size : 512K
        Device Role : Active device 5
        Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
    /dev/sdn:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
               Name : media:0
      Creation Time : Wed Nov 7 16:06:02 2012
         Raid Level : raid6
       Raid Devices : 11
     Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
         Array Size : 26370335232 (25148.71 GiB 27003.22 GB)
      Used Dev Size : 5860074496 (2794.30 GiB 3000.36 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 235097a3:7c8a32b8:f1c73a25:9c149239
        Update Time : Sat Sep 13 02:03:42 2014
           Checksum : d0b20c55 - correct
             Events : 2216444
             Layout : left-symmetric
         Chunk Size : 512K
        Device Role : Active device 10
        Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
    /dev/sdo:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
               Name : media:0
      Creation Time : Wed Nov 7 16:06:02 2012
         Raid Level : raid6
       Raid Devices : 11
     Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
         Array Size : 26370335232 (25148.71 GiB 27003.22 GB)
      Used Dev Size : 5860074496 (2794.30 GiB 3000.36 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : f382773b:08814775:542a5a1e:d2515115
        Update Time : Sat Sep 13 02:03:42 2014
           Checksum : fa85d548 - correct
             Events : 2216444
             Layout : left-symmetric
         Chunk Size : 512K
        Device Role : Active device 9
        Array State : AAAAAAAAAAA ('A' == active, '.' == missing)
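Before creating this stackoverflow request, I ran the mdadm examine command above and found that both /dev/sdl and /dev/sdo reported "Active device 9". However, /dev/sdl was not actually in use, judging by its old update time. I saved the output: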
    /dev/sdl:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : b1c40379:914e5d18:dddb893b:4dc5a28f
               Name : media:0
      Creation Time : Wed Nov 7 16:06:02 2012
         Raid Level : raid6
       Raid Devices : 10
     Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
         Array Size : 23440297984 (22354.41 GiB 24002.87 GB)
      Used Dev Size : 5860074496 (2794.30 GiB 3000.36 GB)
        Data Offset : 262144 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 8e3499b6:b3baae34:af56fde9:f5d7bc87
        Update Time : Fri Nov 15 15:52:20 2013
           Checksum : 29fed1f5 - correct
             Events : 2183610
             Layout : left-symmetric
         Chunk Size : 512K
        Device Role : Active device 9
        Array State : AAAAAAAAAA ('A' == active, '.' == missing)
Before creating this request, I ran mdadm --zero-superblock /dev/sdl, which succeeded, and that disk no longer shows up as "Active device 9", so there is now only one "disk 9" in the mdadm --examine output.
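Spotting which superblocks are stale is easier with the role and event counter of every member side by side. The following is a minimal sketch, not part of the original troubleshooting: the function name `summarize_members` is made up for illustration, and it assumes the `mdadm --examine` text layout shown above (a `/dev/...:` header line followed by `Events :` and `Device Role :` fields). It only reads and reports; it never modifies a disk.

```shell
# Hypothetical helper: reads `mdadm --examine` output on stdin and prints
# one line per md superblock with its role and event counter. A superblock
# whose Events value is behind the highest one seen is marked STALE.
summarize_members() {
    awk '
        /^\/dev\/[^ ]*:$/  { dev = $1; sub(/:$/, "", dev) }   # device header
        /^ *Events :/      { events[dev] = $NF }
        /^ *Device Role :/ { role[dev] = $NF; order[++n] = dev }
        END {
            # Find the highest event counter among all superblocks.
            for (i = 1; i <= n; i++)
                if (events[order[i]] + 0 > max) max = events[order[i]] + 0
            for (i = 1; i <= n; i++) {
                d = order[i]
                status = (events[d] + 0 < max) ? "STALE" : "current"
                printf "%s role=%s events=%s %s\n", d, role[d], events[d], status
            }
        }
    '
}
```

It would be fed from the same command used above, for example `mdadm --examine /dev/sd? | summarize_members`. Two lines with the same role, or any line marked STALE, are candidates for a closer look before anything is zeroed.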
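However, grub-install still complains about "found two disks with the index 9".
I could really use some help here. I have spent 12 hours searching Google trying to solve this, to no avail. Of course there is no backup of this data, so rescuing the array is critical.
Edited to add
I noticed a third "Active device 9" and zeroed that superblock, which eliminated the two-index error. Then I checked very carefully, found that one extra disk was an old member, and zeroed it as well, which eliminated the superfluous-member error.
Now grub-install no longer reports those errors.
However, it now reports a segmentation fault.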
    # grub-install --recheck /dev/md0
    Segmentation fault
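So I installed an old 750GB hard drive that has never been part of any array, attaching it directly to the motherboard and bypassing the LSI 9201 controller. I used parted to delete everything on it and then created a bios_grub partition.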
    Model: ATA ST3750330AS (scsi)
    Disk /dev/sda: 750GB
    Sector size (logical/physical): 512B/512B
    Partition Table: gpt
    Number  Start   End     Size    File system  Name     Flags
     1      1049kB  2097kB  1049kB               primary  bios_grub
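Then I installed grub to that device (sda), rebooted, and selected that device in the BIOS. GRUB dropped into rescue mode, reporting "no such disk".
I don't know what to do next and could use some help! I should also mention that after the reboot, the Debian rescue environment shows /dev/md/0 instead of /dev/md127.
Edit 2
Still working on this. What I plan to do to fix the segmentation fault is make the partitioning identical on every physical disk.
So, it looks like this: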
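    mdadm --manage /dev/md0 --fail /dev/disk      # mark the member as failed
    mdadm --manage /dev/md0 --remove /dev/disk    # take it out of the array
    dd if=/dev/zero of=/dev/disk                  # wipe the whole disk
    sgdisk -R /dev/dest /dev/source               # copy the partition table from source to dest
    sgdisk -G /dev/dest                           # give the copy new random GUIDs
    mdadm --manage /dev/md0 --add /dev/disk       # add it back and let it resync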
I am using the following partition scheme:
    # parted /dev/sdc print
    Model: ATA WDC WD30EFRX-68A (scsi)
    Disk /dev/sdc: 3001GB
    Sector size (logical/physical): 512B/4096B
    Partition Table: gpt
    Number  Start   End     Size    File system  Name       Flags
     1      1049kB  2097kB  1049kB               bios_grub  bios_grub
     2      2097kB  3001GB  3001GB               raid       raid
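This means I remove one member at a time, run the procedure above, then re-add it and let it resync, one disk at a time. This is a large RAID 6 array and each resync takes nearly a day, so it will be a long process. But I want everything back in a healthy, clean state, and I think this is my best option for getting rid of the segmentation fault.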
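Since the next disk should only be pulled once the previous resync has finished, it helps to check the rebuild state before each round. Below is a small sketch, assuming the usual /proc/mdstat layout in which the progress line contains `recovery = NN.N%` or `resync = NN.N%`. The function name `rebuild_progress` is hypothetical.

```shell
# Hypothetical helper: reads /proc/mdstat-style text on stdin and prints the
# rebuild percentage for the named array, or "idle" if no recovery or resync
# line is found for it.
rebuild_progress() {
    # $1: array name as it appears in /proc/mdstat, e.g. md0
    awk -v name="$1" '
        $1 == name && $2 == ":" { in_array = 1; next }   # start of our array
        /^[^ ]/                 { in_array = 0 }         # any other top-level line
        in_array && /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { print $i; found = 1 }
        }
        END { if (!found) print "idle" }
    '
}
```

A possible use is `rebuild_progress md0 < /proc/mdstat` before failing the next member. `mdadm --wait /dev/md0` is another way to block until the resync is done.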
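If anyone has any suggestions, please let me know.
Edit 3
As I replace each disk, I am installing grub on it to make sure that works. Below are the two error messages I get on the current member disks (before they are replaced), which is why I believe I was getting the segmentation fault: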
    # grub-install /dev/sdl
    /usr/sbin/grub-setup: warn: This GPT partition label has no BIOS Boot Partition; embedding won't be possible!.
    /usr/sbin/grub-setup: error: embedding is not possible, but this is required for cross-disk install.
    # grub-install /dev/sdn
    /usr/sbin/grub-setup: error: unable to identify a filesystem in hd13; safety check can't be performed.
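Of course this was repeated many times across all the old member disks in the array, but it is always one of these two errors. However, after performing the steps listed above to remove them, set up the new partitions and re-add them to the array, grub installs correctly.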
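Both errors come down to the old members lacking a BIOS Boot Partition, so a quick pre-check per disk can show which ones still need repartitioning. This is only a sketch: it assumes parted's machine-readable output (`parted -m`), where each partition line is colon-separated and ends with a flags field followed by `;`. The function name `has_bios_grub` is made up for illustration.

```shell
# Hypothetical helper: reads `parted -m <disk> print` output on stdin and
# succeeds (exit 0) only if some partition carries the bios_grub flag.
has_bios_grub() {
    awk -F: '
        /^[0-9]+:/ {                        # partition lines start with a number
            flags = $NF
            sub(/;$/, "", flags)            # drop the trailing record terminator
            n = split(flags, f, /, */)      # flags are comma-separated
            for (i = 1; i <= n; i++)
                if (f[i] == "bios_grub") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `parted -m /dev/sdc print | has_bios_grub || echo "no BIOS Boot Partition on /dev/sdc"` could be run on each member before attempting grub-install.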
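Now it is just a matter of time. I am updating this post in the hope that it helps someone else, and in a few days, once all the disks have been replaced, I hope to report success!
Edit 4
These steps were suggested by a friend. They did not work, and I still need help!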

I could really use some help from anyone and everyone to get GRUB working on this box.
Does anyone have any other suggestions or fixes?
Edit 5
GRUB bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=764798
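You should answer your own question and mark it as such, since it looks like you are working toward a solution. A few suggestions:
A while ago I wrote up a method for setting up an mdadm raid10 that allows booting from every disk in the raid, which may provide some useful pointers:
How to create a boot-redundant Debian system with a 3 or 4 (or more) disk software raid10?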