Which disk has failed in a RAID 6 array?

Server: Ubuntu Lucid
RAID controller: Adaptec 3805
8 disks in RAID 6 on HP ProLiant DL180 G5 hardware

My kern.log tells me there is an error on sdb, as shown below:

    [2740390.344436] sd 4:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [2740390.344439] sd 4:0:1:0: [sdb] Sense Key : Hardware Error [current]
    [2740390.344442] sd 4:0:1:0: [sdb] Add. Sense: Internal target failure
    [2740390.344447] sd 4:0:1:0: [sdb] CDB: Read(10): 28 00 33 dd dc 00 00 00 08 00
    [2740390.344454] end_request: I/O error, dev sdb, sector 870177792
    [2774094.573841] sd 4:0:1:0: [sdb] Unhandled sense code
    [2774094.573847] sd 4:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [2774094.573851] sd 4:0:1:0: [sdb] Sense Key : Hardware Error [current]
    [2774094.573856] sd 4:0:1:0: [sdb] Add. Sense: Internal target failure
    [2774094.573862] sd 4:0:1:0: [sdb] CDB: Read(16): 88 00 00 00 00 01 33 dd ef e8 00 00 01 00 00 00
    [2774094.573873] end_request: I/O error, dev sdb, sector 5165150184
    [2774094.615437] sd 4:0:1:0: [sdb] Unhandled sense code

The arcconf command tells me that every disk is in the Online state, but it also reports "Failed stripes: Yes".

How can I determine which disk has failed in the 8-disk RAID 6 array?

Edit, May 2, 2012: added the following output.

/usr/local/sbin/arcconf getconfig 1 AL

    Controllers found: 1
    ----------------------------------------------------------------------
    Controller information
    ----------------------------------------------------------------------
       Controller Status : Optimal
       Channel description : SAS/SATA
       Controller Model : Adaptec 3805
       Controller Serial Number : 0C18115C3BB
       Temperature : 0 C/ 32 F (Normal)
       Installed memory : 128 MB
       Copyback : Disabled
       Background consistency check : Disabled
       Automatic Failover : Enabled
       Global task priority : High
       Stayawake period : Disabled
       Spinup limit internal drives : 0
       Spinup limit external drives : 0
       Defunct disk drive count : 0
       Logical devices/Failed/Degraded : 2/0/0
       NCQ status : Enabled
       --------------------------------------------------------
       Controller Version Information
       --------------------------------------------------------
       BIOS : 5.2-0 (17342)
       Firmware : 5.2-0 (17342)
       Driver : 1.1-5 (2461)
       Boot Flash : 5.2-0 (17342)
       --------------------------------------------------------
       Controller Battery Information
       --------------------------------------------------------
       Status : Optimal
       Over temperature : No
       Capacity remaining : 99 percent
       Time remaining (at current draw) : 3 days, 1 hours, 11 minutes

    ----------------------------------------------------------------------
    Logical device information
    ----------------------------------------------------------------------
    Logical device number 0
       Logical device name : boot
       RAID level : 1
       Status of logical device : Optimal
       Size : 476150 MB
       Read-cache mode : Enabled
       Write-cache mode : Enabled (write-back)
       Write-cache setting : Enabled (write-back)
       Partitioned : Yes
       Protected by Hot-Spare : No
       Bootable : Yes
       Failed stripes : No
       Power settings : Disabled
       --------------------------------------------------------
       Logical device segment information
       --------------------------------------------------------
       Segment 0 : Present (0,7) Z2AD1A3H
       Segment 1 : Present (0,3) Z2AD1834

    Logical device number 1
       Logical device name : data
       RAID level : 6 Reed-Solomon
       Status of logical device : Optimal
       Size : 2858990 MB
       Stripe-unit size : 128 KB
       Read-cache mode : Enabled
       Write-cache mode : Enabled (write-back)
       Write-cache setting : Enabled (write-back)
       Partitioned : Yes
       Protected by Hot-Spare : No
       Bootable : No
       Failed stripes : Yes
       Power settings : Disabled
       --------------------------------------------------------
       Logical device segment information
       --------------------------------------------------------
       Segment 0 : Present (0,0) 6VPEFSZ0
       Segment 1 : Present (0,1) 5VPA5934
       Segment 2 : Present (0,2) 5VPA7132
       Segment 3 : Present (0,4) 5VPAJ8EJ
       Segment 4 : Present (0,5) 5VPA6NAZ
       Segment 5 : Present (0,6) 5VPAJM8Q

    ----------------------------------------------------------------------
    Physical Device information
    ----------------------------------------------------------------------
       Device #0
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,0(0:0)
          Reported Location : Connector 0, Device 0
          Vendor : ST375052
          Model : 5AS
          Firmware : JC4B
          Serial number : 6VPEFSZ0
          Size : 715404 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #1
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,1(1:0)
          Reported Location : Connector 0, Device 1
          Vendor : ST375052
          Model : 5AS
          Firmware : JC4B
          Serial number : 5VPA5934
          Size : 715404 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #2
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,2(2:0)
          Reported Location : Connector 0, Device 2
          Vendor : ST375052
          Model : 5AS
          Firmware : JC4B
          Serial number : 5VPA7132
          Size : 715404 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #3
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,3(3:0)
          Reported Location : Connector 0, Device 3
          Vendor : ST500DM0
          Model : 02-1BD142
          Firmware : KC44
          Serial number : Z2AD1834
          Size : 476940 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #4
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,4(4:0)
          Reported Location : Connector 1, Device 0
          Vendor : ST375052
          Model : 5AS
          Firmware : JC4B
          Serial number : 5VPAJ8EJ
          Size : 715404 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #5
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,5(5:0)
          Reported Location : Connector 1, Device 1
          Vendor : ST375052
          Model : 5AS
          Firmware : JC4B
          Serial number : 5VPA6NAZ
          Size : 715404 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #6
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,6(6:0)
          Reported Location : Connector 1, Device 2
          Vendor : ST375052
          Model : 5AS
          Firmware : JC4B
          Serial number : 5VPAJM8Q
          Size : 715404 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled
       Device #7
          Device is a Hard drive
          State : Online
          Supported : Yes
          Transfer Speed : SATA 3.0 Gb/s
          Reported Channel,Device(T:L) : 0,7(7:0)
          Reported Location : Connector 1, Device 3
          Vendor : ST500DM0
          Model : 02-1BD142
          Firmware : KC44
          Serial number : Z2AD1A3H
          Size : 476940 MB
          Write Cache : Enabled (write-back)
          FRU : None
          SMART : No
          SMART warnings : 0
          NCQ status : Enabled

    Command completed successfully.

Update: partition information added below.

**fdisk -l**

    Disk /dev/sda: 499.3 GB, 499289948160 bytes
    255 heads, 63 sectors/track, 60701 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x0002ab26

       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1       59952   481562624   83  Linux
    /dev/sda2           59953       60702     6022145    5  Extended
    /dev/sda5           59953       60702     6022144   82  Linux swap / Solaris

    WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.

    Disk /dev/sdb: 2997.9 GB, 2997878784000 bytes
    255 heads, 63 sectors/track, 364471 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000

       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1               1      267350  2147483647+  ee  GPT

**df -h**

    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda1             453G  112G  319G  26% /
    none                 1000M  224K 1000M   1% /dev
    none                 1005M     0 1005M   0% /dev/shm
    none                 1005M  664K 1004M   1% /var/run
    none                 1005M  4.0K 1005M   1% /var/lock
    none                 1005M     0 1005M   0% /lib/init/rw
    /dev/sdb1             2.7T  1.5T  1.1T  58% /media/raid1
    /dev/sdb1             2.7T  1.5T  1.1T  58% /media/usbhd-sdb1
    /dev/sda1             453G  112G  319G  26% /media/usbhd-sda1

**fstab**

    # /etc/fstab: static file system information.
    #
    # Use 'blkid -o value -s UUID' to print the universally unique identifier
    # for a device; this may be used with UUID= as a more robust way to name
    # devices that works even if disks are added and removed. See fstab(5).
    #
    # <file system> <mount point>   <type>  <options>       <dump>  <pass>
    proc            /proc           proc    nodev,noexec,nosuid 0       0
    # / was on /dev/sda1 during installation
    UUID=12dd3c31-6dba-4c26-ba81-88a76510bffd /               ext4    errors=remount-ro 0       1
    # swap was on /dev/sda5 during installation
    UUID=81618042-ec4e-45e9-947f-9198d29651d3 none            swap    sw              0       0
    UUID=a7832728-5bf9-45c4-8a29-2824b4f2c250 /media/raid1    ext4    errors=remount-ro,noatime 0       1

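Unless I am mistaken, these messages tell you that the errors are not being corrected by the RAID controller. The RAID controller is supposed to hide errors like these from you. I don't think you have a simple disk failure; I think something more serious is going on.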

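Assuming the "boot" volume of your RAID setup is recognized as sda and "data" as sdb, your system is telling you the following:

    [2740390.344436] sd 4:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE

The SCSI subsystem issued a command to the low-level driver (the one for your Adaptec card) without error, and the card answered with an error (DRIVER_SENSE is set).

    [2740390.344439] sd 4:0:1:0: [sdb] Sense Key : Hardware Error [current]

This is the type of the error (see, e.g., the SCSI driver information).

    [2740390.344442] sd 4:0:1:0: [sdb] Add. Sense: Internal target failure

This is the additional information reported by the driver. As far as I know, this message means "no specific information" / "don't know what went wrong".

    [2740390.344454] end_request: I/O error, dev sdb, sector 870177792

The error has reached the block layer.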
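As a cross-check of the walk-through above, the CDB bytes in the log can be decoded by hand. The following is a minimal Python sketch, assuming the standard READ(10) and READ(16) command layouts; the function name `decode_read_cdb` is made up for illustration and is not part of any tool mentioned here.

```python
def decode_read_cdb(hex_string):
    """Decode a SCSI READ(10) or READ(16) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks requested.
    """
    cdb = bytes.fromhex(hex_string)
    if len(cdb) == 10 and cdb[0] == 0x28:
        # READ(10): bytes 2-5 hold the LBA, bytes 7-8 the transfer length.
        lba = int.from_bytes(cdb[2:6], "big")
        blocks = int.from_bytes(cdb[7:9], "big")
    elif len(cdb) == 16 and cdb[0] == 0x88:
        # READ(16): bytes 2-9 hold the LBA, bytes 10-13 the transfer length.
        lba = int.from_bytes(cdb[2:10], "big")
        blocks = int.from_bytes(cdb[10:14], "big")
    else:
        raise ValueError("not a READ(10) or READ(16) CDB: %r" % hex_string)
    return lba, blocks


if __name__ == "__main__":
    # The two CDBs quoted in the kern.log excerpt from the question.
    for cdb in ("28 00 33 dd dc 00 00 00 08 00",
                "88 00 00 00 00 01 33 dd ef e8 00 00 01 00 00 00"):
        print(cdb, "->", decode_read_cdb(cdb))
```

For the two CDBs in the question this gives LBA 870177792 (8 blocks) and LBA 5165150184 (256 blocks), the same sectors that the end_request lines report. These are addresses on sdb, which under the assumption above is the logical "data" volume exported by the controller, so on their own they do not point to one physical disk.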

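As another answer says: this is not a single disk failure, it is a failure of the RAID as a whole. You should check your data carefully and consider replacing the RAID subsystem, or at least the controller.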
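One rough way to act on the "check your data" advice is a full read pass over the mounted volume, recording every file the kernel refuses to read. This is only a sketch: `find_unreadable_files` is a hypothetical helper, it catches reads that fail with an I/O error but not silent corruption, and files already in the page cache may be served without touching the array.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every regular file under root and collect the ones that fail.

    Returns a list of (path, errno) tuples for files whose open or read
    raised OSError (for example EIO from a failing block device).
    """
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, FIFOs, sockets and device nodes.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, exc.errno))
    return failures
```

Called as `find_unreadable_files("/media/raid1")` (the mount point from the fstab in the question), an empty list means every file could be read end to end. Comparing against known-good checksums or a backup would be needed to detect data that reads fine but is wrong.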

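You should always (!) enable "background consistency check" / "passive scan" / "verify" on your RAID controller to find silent corruption, which could otherwise kill your RAID during a rebuild.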

Do you see any filesystem errors? Is /dev/sdb partitioned and mounted?

This may sound silly, but have you looked at the front of the server to see which drive has its fault LED lit? (Assuming the drives have LEDs.)

Also, you can install the Storage Manager software: http://www.adaptec.com/en-us/downloads/storage_manager/sm/productid=sas-3805&dn=adaptec+raid+3805.html

You can get the information via smartctl (CLI) or via Adaptec's CLI (as mentioned above).
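If you go the Adaptec CLI route, the per-drive fields can be pulled out of the `arcconf getconfig 1 AL` text with a few lines of Python. This is a sketch that assumes the output keeps the one "key : value" pair per line layout and the "Device #N" headers seen in the question; `parse_physical_devices` and `suspect_devices` are illustrative names, not part of arcconf.

```python
import re

DEVICE_HEADER = re.compile(r"^Device #(\d+)$")


def parse_physical_devices(text):
    """Parse the physical-device part of arcconf output into dicts.

    Every 'Device #N' header starts a new record; each following
    'key : value' line is stored in that record.
    """
    devices = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = DEVICE_HEADER.match(line)
        if header:
            current = {"Device": int(header.group(1))}
            devices.append(current)
        elif current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def suspect_devices(devices):
    """Return devices that are not Online or report SMART warnings."""
    return [
        dev for dev in devices
        if dev.get("State") != "Online"
        or dev.get("SMART warnings", "0") != "0"
    ]
```

On the output posted in the question this would return an empty list, since all eight drives report "State : Online" and "SMART warnings : 0". That matches the observation that the controller itself does not blame any single disk.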

If you can reboot the server, boot it from the SmartStart DVD. If I remember correctly, you can reach the ACU from there and get a graphical view of the RAID volumes.