I have a problem with an Adaptec 5805 RAID card
http://www.adaptec.com/en-us/support/raid/sas_raid/sas-5805/
(with two SAS disks in RAID) and a Gigabyte GA-H67A-D3H-B3 motherboard
http://www.gigabyte.com/products/product-page.aspx?pid=3866#sp
running CentOS 6 as a network server.
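Short story: when I boot the server, the RAID card runs at full speed, with transfer rates over 250 Mb/s. Within 60 minutes at most, I get an IRQ error, IRQ 16 is disabled, and from then on the card's transfer rate never exceeds 2.5 Mb/s (but it still works). I need to fix this so that the card runs at full speed all the time.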
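Long story:
1] The motherboard has no PCIe x8 slot for the RAID card. I tried the x16 slot, but in that slot the card was not detected at all; the system did not see anything in it. So I used the x4 slot, and there the card works well (which surprised me). Except for the IRQ …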
2] There are two SATA disks, each attached directly to the motherboard:
Samsung HD502HJ and Samsung HD103UJ
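Then there is an add-in network card in the first of the regular PCI slots (in the picture linked above, it is the rightmost white PCI slot, next to the "DUAL BOOT" label on the motherboard).
The RAID card is in the PCIe x4 slot (next to those three white PCI slots).
Nothing else is in use. I don't use any USB devices or anything else, only the two SATA disks, the two network ports (onboard and add-in card), and the RAID card with the two SAS disks attached.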
3] The system is, as I said, CentOS 6
uname -a
Linux 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux
The CPU is
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
lspci -v
00:00.0 Host bridge: Intel Corporation Sandy Bridge DRAM Controller (rev 09)
        Flags: bus master, fast devsel, latency 0
        Capabilities: [e0] Vendor Specific Information <?>

00:02.0 VGA compatible controller: Intel Corporation Sandy Bridge Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Giga-byte Technology Device d000
        Flags: bus master, fast devsel, latency 0, IRQ 10
        Memory at fb400000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at ff00 [size=64]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features

00:16.0 Communication controller: Intel Corporation Cougar Point HECI Controller #1 (rev 04)
        Subsystem: Giga-byte Technology Device 1c3a
        Flags: bus master, fast devsel, latency 0, IRQ 10
        Memory at fbfff000 (64-bit, non-prefetchable) [size=16]
        Capabilities: [50] Power Management version 3
        Capabilities: [8c] MSI: Enable- Count=1/1 Maskable- 64bit+

00:1a.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
        Subsystem: Giga-byte Technology Device 5006
        Flags: bus master, medium devsel, latency 0, IRQ 18
        Memory at fbffe000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port: BAR=1 offset=00a0
        Capabilities: [98] PCI Advanced Features
        Kernel driver in use: ehci_hcd

00:1c.0 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Memory behind bridge: fb800000-fbbfffff
        Prefetchable memory behind bridge: 00000000dc000000-00000000dc0fffff
        Capabilities: [40] Express Root Port (Slot+), MSI 00
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
        Capabilities: [a0] Power Management version 2
        Kernel driver in use: pcieport
        Kernel modules: shpchp

00:1c.5 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 6 (rev b5) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: 0000d000-0000dfff
        Prefetchable memory behind bridge: 00000000fbd00000-00000000fbdfffff
        Capabilities: [40] Express Root Port (Slot+), MSI 00
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
        Capabilities: [a0] Power Management version 2
        Kernel driver in use: pcieport
        Kernel modules: shpchp

00:1c.6 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5) (prog-if 01 [Subtractive decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=03, subordinate=04, sec-latency=0
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: fbc00000-fbcfffff
        Prefetchable memory behind bridge: 00000000dc100000-00000000dc1fffff
        Capabilities: [40] Express Root Port (Slot+), MSI 00
        Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
        Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
        Capabilities: [a0] Power Management version 2

00:1c.7 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 8 (rev b5) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
        Memory behind bridge: fbe00000-fbefffff
        Capabilities: [40] Express Root Port (Slot+), MSI 00
        Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
        Capabilities: [a0] Power Management version 2
        Kernel driver in use: pcieport
        Kernel modules: shpchp

00:1d.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
        Subsystem: Giga-byte Technology Device 5006
        Flags: bus master, medium devsel, latency 0, IRQ 23
        Memory at fbffd000 (32-bit, non-prefetchable) [size=1K]
        Capabilities: [50] Power Management version 2
        Capabilities: [58] Debug port: BAR=1 offset=00a0
        Capabilities: [98] PCI Advanced Features
        Kernel driver in use: ehci_hcd

00:1f.0 ISA bridge: Intel Corporation Cougar Point LPC Controller (rev 05)
        Subsystem: Giga-byte Technology Device 5001
        Flags: bus master, medium devsel, latency 0
        Capabilities: [e0] Vendor Specific Information <?>
        Kernel modules: iTCO_wdt

00:1f.2 IDE interface: Intel Corporation Cougar Point 4 port SATA IDE Controller (rev 05) (prog-if 8f [Master SecP SecO PriP PriO])
        Subsystem: Giga-byte Technology Device b002
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19
        I/O ports at fe00 [size=8]
        I/O ports at fd00 [size=4]
        I/O ports at fc00 [size=8]
        I/O ports at fb00 [size=4]
        I/O ports at fa00 [size=16]
        I/O ports at f900 [size=16]
        Capabilities: [70] Power Management version 3
        Capabilities: [b0] PCI Advanced Features
        Kernel driver in use: ata_piix
        Kernel modules: ata_generic, pata_acpi, ata_piix

00:1f.3 SMBus: Intel Corporation Cougar Point SMBus Controller (rev 05)
        Subsystem: Giga-byte Technology Device 5001
        Flags: medium devsel, IRQ 18
        Memory at fbffc000 (64-bit, non-prefetchable) [size=256]
        I/O ports at 0500 [size=32]
        Kernel driver in use: i801_smbus
        Kernel modules: i2c-i801

00:1f.5 IDE interface: Intel Corporation Cougar Point 2 port SATA IDE Controller (rev 05) (prog-if 85 [Master SecO PriO])
        Subsystem: Giga-byte Technology Device b002
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19
        I/O ports at f700 [size=8]
        I/O ports at f600 [size=4]
        I/O ports at f500 [size=8]
        I/O ports at f400 [size=4]
        I/O ports at f300 [size=16]
        I/O ports at f200 [size=16]
        Capabilities: [70] Power Management version 3
        Capabilities: [b0] PCI Advanced Features
        Kernel driver in use: ata_piix
        Kernel modules: ata_generic, pata_acpi, ata_piix

01:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
        Subsystem: Adaptec ASR5805
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at fb800000 (64-bit, non-prefetchable) [size=2M]
        [virtual] Expansion ROM at dc000000 [disabled] [size=512K]
        Capabilities: [98] Power Management version 2
        Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
        Capabilities: [d0] Express Endpoint, MSI 00
        Capabilities: [90] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: aacraid
        Kernel modules: aacraid

02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
        Subsystem: Giga-byte Technology GA-EP45-DS5 Motherboard
        Flags: bus master, fast devsel, latency 0, IRQ 32
        I/O ports at de00 [size=256]
        Memory at fbdff000 (64-bit, prefetchable) [size=4K]
        Memory at fbdf8000 (64-bit, prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 01
        Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
        Capabilities: [d0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Virtual Channel <?>
        Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
        Kernel driver in use: r8169
        Kernel modules: r8169

03:00.0 PCI bridge: Integrated Technology Express, Inc. Device 8892 (rev 30) (prog-if 01 [Subtractive decode])
        Flags: bus master, fast devsel, latency 0
        Bus: primary=03, secondary=04, subordinate=04, sec-latency=32
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: fbc00000-fbcfffff
        Prefetchable memory behind bridge: 00000000dc100000-00000000dc1fffff
        Capabilities: [90] Power Management version 2
        Capabilities: [a0] Subsystem: Giga-byte Technology Device 5000

04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
        Subsystem: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 18
        I/O ports at ee00 [size=256]
        Memory at fbcff000 (32-bit, non-prefetchable) [size=256]
        [virtual] Expansion ROM at dc100000 [disabled] [size=64K]
        Capabilities: [dc] Power Management version 2
        Kernel driver in use: r8169
        Kernel modules: r8169

05:00.0 USB Controller: Device 1b6f:7023 (rev 01) (prog-if 30)
        Subsystem: Device 1b6f:7023
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at fbef8000 (64-bit, non-prefetchable) [size=32K]
        Capabilities: [50] Power Management version 3
        Capabilities: [70] MSI: Enable- Count=1/4 Maskable+ 64bit+
        Capabilities: [a0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [190] Device Serial Number 01-01-01-01-01-01-01-01
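One thing that can be read off the listing above is which devices have MSI switched on. The RAID card reports `MSI: Enable-`, so it is using the legacy pin-based interrupt (IRQ 16), while the onboard NIC reports `MSI: Enable+`. A rough Python sketch for pulling that out of `lspci -v` text follows; the function name `msi_status` and the shortened `SAMPLE` are made up for illustration, and the parsing assumes one device header per line with indented property lines.

```python
import re

# Shortened excerpt of the lspci -v output above (illustration only).
SAMPLE = """\
01:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B
        Flags: bus master, fast devsel, latency 0, IRQ 32
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
"""


def msi_status(lspci_text):
    """Map each PCI address to True/False for its 'MSI: Enable+/-' flag."""
    status = {}
    current = None
    for line in lspci_text.splitlines():
        # A device header starts at column 0 with bus:device.function.
        head = re.match(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ", line)
        if head:
            current = head.group(1)
            continue
        # Matches "MSI: Enable+" but not "MSI-X: Enable-".
        flag = re.search(r"\bMSI: Enable([+-])", line)
        if flag and current is not None:
            status[current] = flag.group(1) == "+"
    return status


if __name__ == "__main__":
    for address, enabled in sorted(msi_status(SAMPLE).items()):
        print(address, "MSI enabled" if enabled else "MSI disabled (legacy INTx)")
```

Devices that expose no MSI capability at all simply do not appear in the result.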
lspci -vv
01:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
        Subsystem: Adaptec ASR5805
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 4 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at fb800000 (64-bit, non-prefetchable) [size=2M]
        [virtual] Expansion ROM at dc000000 [disabled] [size=512K]
        Capabilities: [98] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [d0] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 <1us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s, Latency L0 <128ns, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [90] Vital Product Data
                Unknown small resource type 00, will not decode more.
        Capabilities: [100] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Kernel driver in use: aacraid
        Kernel modules: aacraid
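The `LnkCap` and `LnkSta` lines above are the ones that show how the card ended up in the x4 slot: it is capable of x8 but negotiated x4, both at 2.5GT/s. A minimal Python sketch for extracting the two values from `lspci -vv` text is below; the function name `link_widths` and the two-line `SAMPLE` are hypothetical, and the regular expressions assume the field order shown in my output.

```python
import re

# The two relevant lines from the lspci -vv output above (illustration only).
SAMPLE = """\
LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s, Latency L0 <128ns, L1 unlimited
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""


def link_widths(lspci_vv_text):
    """Return capable vs. negotiated (speed, width), or None if not found."""
    cap = re.search(r"LnkCap:.*?Speed ([\d.]+GT/s), Width x(\d+)", lspci_vv_text)
    sta = re.search(r"LnkSta:.*?Speed ([\d.]+GT/s), Width x(\d+)", lspci_vv_text)
    if not cap or not sta:
        return None
    return {
        "capable": (cap.group(1), int(cap.group(2))),
        "negotiated": (sta.group(1), int(sta.group(2))),
    }


if __name__ == "__main__":
    info = link_widths(SAMPLE)
    print("capable:   ", info["capable"])
    print("negotiated:", info["negotiated"])
    if info["negotiated"][1] < info["capable"][1]:
        print("link trained at a narrower width than the card supports")
```

On the server itself the text would come from `lspci -vv -s 01:00.0` instead of the embedded sample.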
cat /proc/interrupts
          CPU0      CPU1      CPU2      CPU3      CPU4      CPU5      CPU6      CPU7
  0:       128         0         0         0         0         0         0         0   IO-APIC-edge     timer
  1:       105         0       606      4366         0         0         0         0   IO-APIC-edge     i8042
  8:         1         0         0         0         0         0         0         0   IO-APIC-edge     rtc0
  9:         0         0         0         0         0         0         0         0   IO-APIC-fasteoi  acpi
 16:      1381         0    197881       730         0         0         0         9   IO-APIC-fasteoi  aacraid
 18:      1695         0         0         0     13372  60347990         0         0   IO-APIC-fasteoi  ehci_hcd:usb1, eth1
 19:      4637         0     14949   6352494         0         0         0    106473   IO-APIC-fasteoi  ata_piix, ata_piix
 23:        33         0        27        12         0         0         0         0   IO-APIC-fasteoi  ehci_hcd:usb2
 24:       291         0         0         0         0         0         0         0   HPET_MSI-edge    hpet2
 25:         0         0         0         0         0         0         0         0   HPET_MSI-edge    hpet3
 26:         0         0         0         0         0         0         0         0   HPET_MSI-edge    hpet4
 27:         0         0         0         0         0         0         0         0   HPET_MSI-edge    hpet5
 28:         0         0         0         0         0         0         0         0   HPET_MSI-edge    hpet6
 32:      1275         0         0         0         0      1905  21317086         0   PCI-MSI-edge     eth0
NMI:      1873     10150      1974      1672       702      3046      1825       780   Non-maskable interrupts
LOC:  17501877  13611350  13868117   3612581   1520650   1850972   8633075   1486682   Local timer interrupts
SPU:         0         0         0         0         0         0         0         0   Spurious interrupts
PMI:         0         0         0         0         0         0         0         0   Performance monitoring interrupts
PND:         0         0         0         0         0         0         0         0   Performance pending work
RES:      5238     34250     12858      4299      1555      4833      5663      2485   Rescheduling interrupts
CAL:       334       302       429       414       421       464       465       468   Function call interrupts
TLB:      7863    154723     12147     11152     14099     33766     42580     11065   TLB shootdowns
TRM:         0         0         0         0         0         0         0         0   Thermal event interrupts
THR:         0         0         0         0         0         0         0         0   Threshold APIC interrupts
MCE:         0         0         0         0         0         0         0         0   Machine check exceptions
MCP:       293       293       293       293       293       293       293       293   Machine check polls
ERR:         7
MIS:         0
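To double-check which interrupt lines are shared, something like the following Python sketch could scan the table. It assumes the 2.6.32-style layout shown here (a CPU header row, then per-CPU counts, the interrupt type, and a comma-separated handler list); the name `shared_irqs` and the trimmed `SAMPLE` are made up for illustration.

```python
import re

# Trimmed excerpt of /proc/interrupts from this server (illustration only).
SAMPLE = """\
     CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
 16: 1381 0 197881 730 0 0 0 9 IO-APIC-fasteoi aacraid
 18: 1695 0 0 0 13372 60347990 0 0 IO-APIC-fasteoi ehci_hcd:usb1, eth1
 19: 4637 0 14949 6352494 0 0 0 106473 IO-APIC-fasteoi ata_piix, ata_piix
NMI: 1873 10150 1974 1672 702 3046 1825 780 Non-maskable interrupts
"""


def shared_irqs(text):
    """Return {irq: [handler names]} for numbered IRQs with 2+ handlers."""
    lines = text.strip().splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        m = re.match(r"\s*(\d+):\s*(.*)", line)
        if not m:
            continue  # skip NMI:, LOC:, ERR: and other non-numeric rows
        # ncpu counts, then the interrupt type, then the handler list.
        fields = m.group(2).split(None, ncpu + 1)
        if len(fields) < ncpu + 2:
            continue
        names = [name.strip() for name in fields[ncpu + 1].split(",")]
        if len(names) > 1:
            shared[int(m.group(1))] = names
    return shared


if __name__ == "__main__":
    for irq, names in sorted(shared_irqs(SAMPLE).items()):
        print(irq, names)
```

With the excerpt above, only IRQ 18 and IRQ 19 come back as shared; IRQ 16 has `aacraid` as its sole registered handler, which matches what I see in the table.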
The module in use is the kmod-aacraid kernel module from ELRepo for CentOS 6
Installed Packages
Name        : kmod-aacraid
Arch        : x86_64
Version     : 1.1.7
Release     : 1.el6.elrepo
Size        : 340 k
Repo        : installed
From repo   : elrepo
Summary     : aacraid kernel module(s)
URL         : http://www.adaptec.com/
License     : GPLv2
Description : This package provides the aacraid kernel module(s) built
            : for the Linux kernel using the x86_64 family of processors.
And the error from the log
Dec 15 14:02:33 kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 15 14:02:33 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-71.29.1.el6.x86_64 #1
Dec 15 14:02:33 kernel: Call Trace:
Dec 15 14:02:33 kernel: <IRQ> [<ffffffff810da96b>] __report_bad_irq+0x2b/0xa0
Dec 15 14:02:33 kernel: [<ffffffff810dab6c>] note_interrupt+0x18c/0x1d0
Dec 15 14:02:33 kernel: [<ffffffff810db255>] handle_fasteoi_irq+0xc5/0xf0
Dec 15 14:02:33 kernel: [<ffffffff81015fb9>] handle_irq+0x49/0xa0
Dec 15 14:02:33 kernel: [<ffffffff814d093c>] do_IRQ+0x6c/0xf0
Dec 15 14:02:33 kernel: [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
Dec 15 14:02:33 kernel: <EOI> [<ffffffff812da962>] ? acpi_idle_enter_c1+0xa3/0xc1
Dec 15 14:02:33 kernel: [<ffffffff812da941>] ? acpi_idle_enter_c1+0x82/0xc1
Dec 15 14:02:33 kernel: [<ffffffff813df687>] cpuidle_idle_call+0xa7/0x140
Dec 15 14:02:33 kernel: [<ffffffff81011e96>] cpu_idle+0xb6/0x110
Dec 15 14:02:33 kernel: [<ffffffff814c27d8>] start_secondary+0x1fc/0x23f
Dec 15 14:02:33 kernel: handlers:
Dec 15 14:02:33 kernel: [<ffffffffa002a590>] (aac_rx_intr_message+0x0/0xc0 [aacraid])
Dec 15 14:02:33 kernel: Disabling IRQ #16
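I don't see any conflict on IRQ 16, and the suggested irqpoll option doesn't change a thing. I don't need USB, so I could disable it, but the system is in production, so I'd like to know where the problem is before I start messing with the BIOS or anything else (I also need to keep downtime as short as possible).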
Can anyone help me diagnose the problem?