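What I'm about to describe happens on two different servers with the same operating system, the same hardware, and the same hardware upgrade. IMHO a driver bug may be involved, but I don't know how to confirm that.
This SuperMicro-motherboard-based server has some strange trouble.
The server runs Red Hat Linux.
When I run "ifconfig eth2 down", the server "hangs"; the same happens with eth3.
eth2 and eth3 belong to a new PCI card that was added last week.
eth0 and eth1 are integrated on the motherboard and use the igb driver.
eth2 and eth3 are the new ports on the PCI card and rely on the e1000e driver.
The eth0 configuration is as follows and works fine.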
    DEVICE=eth0
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=10.0.16.49
    NETMASK=255.255.255.0
    NETWORK=10.0.16.0
    HWADDR=00:xx:xx:xx:xx:5c
The eth1 configuration is as follows and works fine.
    DEVICE=eth1
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=192.168.16.46
    NETMASK=255.255.255.0
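eth2 and eth3 have been configured in many different ways. To rule the configuration out as the problem, I connected them to a network with DHCP and ran dhclient on eth2 or eth3; the machine still hangs on ifconfig down. So IMHO the configuration doesn't matter.
The modprobe.conf file looks like this: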
    alias eth0 igb
    alias eth1 igb
    alias scsi_hostadapter ahci
    install vtune_drv /opt/intel/vtune/mknod_vtune.sh
    remove vtune_drv /opt/intel/vtune/rmnod_vtune.sh
    alias char-major-10-111 mdm
The igb and e1000e modules are loaded; I can see them with lsmod.
lsmod -> http://pastebin.com/jJ7kk8mn
The Ethernet entries shown by lspci are as follows (the first two are eth0 and eth1):
    01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
    03:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
    03:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
lspci -> http://pastebin.com/j94fWUPw
lspci -v -> http://pastebin.com/HRdMttzm
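One way to double-check which driver each interface is actually bound to (modprobe.conf above only carries aliases for eth0 and eth1) is to resolve the driver link in sysfs. This is only a minimal sketch, assuming the usual /sys/class/net/<iface>/device/driver layout; the function name iface_driver and its optional second argument are made up for illustration.

```shell
# Print the kernel driver bound to a network interface by resolving the
# sysfs "driver" symlink of its underlying device.
# $1 = interface name, $2 = sysfs root (hypothetical convenience argument,
# defaults to /sys/class/net so the function can be pointed at a fake tree).
iface_driver() {
    local iface="$1"
    local root="${2:-/sys/class/net}"
    local link="$root/$iface/device/driver"
    if [ ! -e "$link" ]; then
        # No such interface, or no driver bound to it (e.g. lo)
        echo "unknown"
        return 1
    fi
    basename "$(readlink -f "$link")"
}

# Example (not executed here):
#   for i in eth0 eth1 eth2 eth3; do echo "$i: $(iface_driver "$i")"; done
```

If eth2 and eth3 report e1000e here while eth0 and eth1 report igb, the hang is at least confined to interfaces handled by a single driver.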
Just in case anyone cares, the BIOS information from dmidecode is:
    Handle 0x0000, DMI type 0, 24 bytes.
    BIOS Information
        Vendor: American Megatrends Inc.
        Version: R4222X52
        Release Date: 09/23/2009
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 4096 kB
        Characteristics:
            ISA is supported
            PCI is supported
            PNP is supported
            BIOS is upgradeable
            BIOS shadowing is allowed
            ESCD support is available
            Boot from CD is supported
            Selectable boot is supported
            BIOS ROM is socketed
            EDD is supported
            5.25"/1.2 MB floppy services are supported (int 13h)
            3.5"/720 KB floppy services are supported (int 13h)
            3.5"/2.88 MB floppy services are supported (int 13h)
            Print screen service is supported (int 5h)
            8042 keyboard services are supported (int 9h)
            Serial services are supported (int 14h)
            Printer services are supported (int 17h)
            CGA/mono video services are supported (int 10h)
            ACPI is supported
            USB legacy is supported
            LS-120 boot is supported
            ATAPI Zip drive boot is supported
            BIOS boot specification is supported
            Targeted content distribution is supported
        BIOS Revision: 8.15
boot.log doesn't show anything interesting from my POV, but here it is anyway:
    Aug 9 23:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 00:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 00:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 01:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 01:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 02:00:02 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 02:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 03:00:02 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 03:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 04:00:02 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 04:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 05:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 05:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 06:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 06:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 07:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 07:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 08:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 08:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 09:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 09:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 10:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 10:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 11:00:01 s_sys@myserver45 IOCMDSTAT: CHECK
    Aug 10 11:00:03 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
    Aug 10 11:08:37 s_sys@myserver45 NET[22300]: /sbin/dhclient-script : updated /etc/resolv.conf
    Aug 10 11:15:29 s_sys@myserver45 IOSIGNAL: BOOT nb_io_adapters=1|nb_local_disks=2
    Aug 10 11:15:29 s_sys@myserver45 IOSIGNAL: STATUS OK <A HREF=/storage/iostatus.php?node=myserver45>(I/O status details)</A><BR>All I/O resources are OK
/var/log/messages -> http://pastebin.com/wBQL1ESE
/var/log/kernel/info -> http://pastebin.com/3KzF9Hhu
I don't know what else might be useful; let me know.
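Are you currently running a 2.6.18 kernel?
Maybe it is suffering from the same problem:
MSI-X Issues with Kernels Between 2.6.19 - 2.6.21 (inclusive)
Kernel panics and instability may be observed on any MSI-X hardware if you use irqbalance with kernels between 2.6.19 and 2.6.21. If such problems are encountered, you may disable the irqbalance daemon or upgrade to a newer kernel.
This is from Intel's latest e1000e README. So please try disabling irqbalance.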
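As a rough sketch of that suggestion: the first function below checks whether a kernel release string falls in the 2.6.19 - 2.6.21 range named in the README, and the second stops irqbalance and keeps it from starting at boot. Both function names are made up for illustration, and the second assumes the Red Hat SysV tools service and chkconfig and must be run as root. Since Red Hat kernels carry backported patches, a 2.6.18 version string by itself may not rule the issue out, so disabling irqbalance is probably worth trying either way.

```shell
# Return 0 if a kernel release string (as printed by `uname -r`) falls in
# the 2.6.19 - 2.6.21 range (inclusive), 1 otherwise.
# $1 = release string; defaults to the running kernel.
kernel_in_msix_range() {
    local release="${1:-$(uname -r)}"
    local version="${release%%-*}"   # drop distro suffix such as "-194.el5"
    local major minor patch rest
    IFS=. read -r major minor patch rest <<EOF
$version
EOF
    [ "$major" = "2" ] && [ "$minor" = "6" ] || return 1
    case "$patch" in
        ''|*[!0-9]*) return 1 ;;     # missing or non-numeric patch level
    esac
    [ "$patch" -ge 19 ] && [ "$patch" -le 21 ]
}

# Stop irqbalance now and disable it for future boots.
# Assumes Red Hat's `service` and `chkconfig`; needs root.
# Defined only; nothing here calls it automatically.
disable_irqbalance() {
    service irqbalance stop
    chkconfig irqbalance off
}
```

After running disable_irqbalance by hand, repeating "ifconfig eth2 down" should show whether the hang still occurs.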