Packet loss between Broadcom NetXtreme NICs on different hosts

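I have an interesting problem: I am losing packets between multiple servers on the same network. It happens on about 15 hosts, but I have narrowed it down to the three below.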

First, some topology. The setup is the same on all machines.

    hosta - 10.20.30.1; Debian Lenny 5.0.5 2.6.26-2-686 #1 SMP, firmware-bnx2 0.14+lenny2
    hostb - 10.20.30.2; Debian Lenny 5.0.5 2.6.26-2-686 #1 SMP, firmware-bnx2 0.14+lenny2
    hostc - 10.20.30.3; Debian Lenny 5.0.5 2.6.26-2-686 #1 SMP, firmware-bnx2 0.14+lenny2

lspci gives me:

    Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)

All servers are plugged into a Cisco 2900XL. I have also swapped it for a TeloSystems switch that we use in the field, to make sure the Cisco is not the cause.

The servers are all IBM x3550 and x3560 models (pre-M1/M2).

Now for some testing. I will only paste one side of the tests to save space, but the results are 100% identical if I use the other hosts.

    root@hosta:~# ping -i 0.5 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 100 received, 0% packet loss, time 49542ms
    rtt min/avg/max/mdev = 0.097/0.157/5.533/0.540 ms

    root@hosta:~# ping -i 0.1 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 100 received, 0% packet loss, time 9941ms
    rtt min/avg/max/mdev = 0.089/0.105/0.170/0.017 ms

    root@hosta:~# ping -i 0.05 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 100 received, 0% packet loss, time 5167ms
    rtt min/avg/max/mdev = 0.088/0.096/0.170/0.016 ms

    root@hosta:~# ping -i 0.01 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 79 received, 21% packet loss, time 960ms
    rtt min/avg/max/mdev = 0.088/0.095/0.126/0.009 ms

    root@hosta:~# ping -i 0.025 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 100 received, 0% packet loss, time 2800ms
    rtt min/avg/max/mdev = 0.087/0.097/0.120/0.006 ms

    root@hosta:~# ping -i 0.02 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 100 received, 0% packet loss, time 2002ms
    rtt min/avg/max/mdev = 0.085/0.096/0.164/0.017 ms

    root@hosta:~# ping -i 0.019 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 99 received, 1% packet loss, time 1995ms
    rtt min/avg/max/mdev = 0.085/0.092/0.112/0.014 ms

    root@hosta:~# ping -i 0.015 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 92 received, 8% packet loss, time 1614ms
    rtt min/avg/max/mdev = 0.086/0.099/0.161/0.016 ms

    root@hosta:~# ping -i 0.0125 -c 100 10.20.30.2 -q
    PING 10.20.30.2 (10.20.30.2) 56(84) bytes of data.

    --- 10.20.30.2 ping statistics ---
    100 packets transmitted, 84 received, 16% packet loss, time 1198ms
    rtt min/avg/max/mdev = 0.083/0.093/0.136/0.012 ms
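To make the pattern easier to see, the ping summaries can be tabulated against the nominal send rate (1 / interval). This is only a minimal sketch: it assumes the iputils summary-line format shown in the runs, and `parse_ping_summary`, `loss_table` and `RUNS` are names made up for illustration. The summary lines in `RUNS` are copied from the runs in this post.

```python
import re

# Matches the summary line printed by iputils ping, e.g.
# "100 packets transmitted, 79 received, 21% packet loss, time 960ms"
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(text):
    """Return (transmitted, received, loss_percent) parsed from ping output."""
    match = SUMMARY_RE.search(text)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


# (interval in seconds, summary line) pairs taken from the runs in this post,
# sorted from the slowest to the fastest send rate.
RUNS = [
    (0.5, "100 packets transmitted, 100 received, 0% packet loss, time 49542ms"),
    (0.1, "100 packets transmitted, 100 received, 0% packet loss, time 9941ms"),
    (0.05, "100 packets transmitted, 100 received, 0% packet loss, time 5167ms"),
    (0.025, "100 packets transmitted, 100 received, 0% packet loss, time 2800ms"),
    (0.02, "100 packets transmitted, 100 received, 0% packet loss, time 2002ms"),
    (0.019, "100 packets transmitted, 99 received, 1% packet loss, time 1995ms"),
    (0.015, "100 packets transmitted, 92 received, 8% packet loss, time 1614ms"),
    (0.0125, "100 packets transmitted, 84 received, 16% packet loss, time 1198ms"),
    (0.01, "100 packets transmitted, 79 received, 21% packet loss, time 960ms"),
]


def loss_table(runs):
    """Return rows of (interval_s, nominal_pkts_per_s, packets_lost, loss_percent)."""
    rows = []
    for interval, summary in runs:
        sent, received, loss = parse_ping_summary(summary)
        rows.append((interval, round(1 / interval), sent - received, loss))
    return rows


if __name__ == "__main__":
    print("interval_s  pkts_per_s  lost  loss_%")
    for interval, rate, lost, loss in loss_table(RUNS):
        print(f"{interval:<10}  {rate:<10}  {lost:<4}  {loss:g}")
```

Sorted this way, every run at an interval of 0.02 s or longer (a nominal 50 packets per second or fewer) shows 0% loss, and the loss grows steadily as the interval shrinks below that.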

If I connect MBPs to the switch (I tried both switches), no packets are lost when running the tests above.

This only seems to have started after we upgraded from Etch to Lenny about 9 months ago.

My next step is to burn an Ubuntu Live CD and run some tests from a different, newer kernel.

Any help, ideas or pointers would be greatly appreciated.

Here is Server Fault's official answer on this issue: http://blog.serverfault.com/post/broadcom-die-mutha/