ESX 6 ARP fails on a back-to-back link

We have a strange problem with a back-to-back connection between an ESX 6 host and a CentOS 7 machine. Hoping to find a solution on Stack Overflow.

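The story is as follows:

- We use a CentOS 7 machine, directly connected to the ESX host, as an iSCSI NAS.
- From time to time ESX can no longer see the NAS, and the corresponding datastore becomes inaccessible.
- When this happens we check everything and find no physical error: the NIC lights are on, and both ethtool on Linux and ESX report the link as up.
- When we check ARP, Linux knows the ESX interface, but ESX does not know the Linux one; its ARP cache entry shows as incomplete.
- When we inspect ARP/RARP packets with tcpdump, something strange shows up: on Linux, the ARP requests from the ESX interface are received and tcpdump shows Linux replying to them, but tcpdump on ESX never shows any of the ARP replies sent by Linux.
- Somehow the link seems to have become a one-way street!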

The commands we ran while looking for a clue, with their output:

On CentOS 7:

    [root@nas ~]# arp -an
    ? (10.10.10.2) at 00:50:56:XX:0d:77 [ether] on enp3s6
    ? (192.168.70.254) at 00:50:56:XX:99:c7 [ether] on enp5s0
    [root@nas ~]# tcpdump -nnvli enp3s6 arp
    tcpdump: listening on enp3s6, link-type EN10MB (Ethernet), capture size 65535 bytes
    07:52:25.143360 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 46
    07:52:25.143367 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.10.10.1 is-at 00:07:e9:XX:07:93, length 28
    07:52:26.143452 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 46
    07:52:26.143454 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.10.10.1 is-at 00:07:e9:XX:07:93, length 28
    07:52:27.145667 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 46
    07:52:27.145673 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.10.10.1 is-at 00:07:e9:XX:07:93, length 28
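To compare the two sides of the link without reading the captures by eye, the request and reply lines can be counted. This is only a rough sketch: `arp_summary` is a hypothetical helper name, and it assumes the tcpdump text format shown in this post and that `awk` is available on the machine where it runs.

```shell
#!/bin/sh
# Hypothetical helper: reads tcpdump ARP output on stdin and prints how many
# ARP requests and ARP replies it contains. It matches on the literal
# "Request who-has" / "Reply" wording that tcpdump prints for ARP packets.
arp_summary() {
    awk '
        /ARP,/ && / Request who-has / { req++ }
        /ARP,/ && / Reply /           { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}

# Usage (assumption: the capture was saved to a text file first):
#   arp_summary < capture.txt
```

Fed the CentOS capture above, this counts three requests and three replies; a capture that contains requests but zero replies is the one-way symptom described in this question.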

On ESX 6:

    [root@gahar:~] tcpdump-uw -nnvli vmk2 arp
    tcpdump-uw: listening on vmk2, link-type EN10MB (Ethernet), capture size 96 bytes
    07:52:25.523005 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    07:52:26.523247 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    07:52:27.524461 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    07:52:31.079580 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    07:52:31.079634 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    07:52:32.080746 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    07:52:33.081656 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.10.10.1 tell 10.10.10.2, length 28
    [root@gahar:~] ping 10.10.10.1
    PING 10.10.10.1 (10.10.10.1): 56 data bytes
    sendto() failed (Host is down)
    [root@gahar:~] esxcli network ip neighbor list
    Neighbor        Mac Address        Vmknic    Expiry  State  Type
    --------------  -----------------  ------  --------  -----  -------
    192.168.33.10   00:0c:29:XX:ea:60  vmk0     965 sec         Unknown
    192.168.33.254  00:50:56:XX:99:c7  vmk0    1194 sec         Unknown
    10.10.10.1      (incomplete)       vmk2      -3 sec         Unknown
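The stuck state is visible in the neighbor table as an `(incomplete)` MAC entry, so it can be detected with a small filter. A minimal sketch, assuming the column layout shown above and that `awk` exists in the ESXi shell; `incomplete_neighbors` is a hypothetical name, not an esxcli feature.

```shell
#!/bin/sh
# Hypothetical helper: reads the text of `esxcli network ip neighbor list`
# on stdin and prints "<ip> <vmknic>" for every neighbor whose MAC address
# column is "(incomplete)", i.e. ARP resolution has not succeeded.
incomplete_neighbors() {
    awk '$2 == "(incomplete)" { print $1, $3 }'
}

# Usage on the host:
#   esxcli network ip neighbor list | incomplete_neighbors
```

With the table above as input, the only line printed would be the NAS address on vmk2.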

Temporary workaround:

    [root@gahar:~] esxcli network nic down -n vmnic2
    [root@gahar:~] esxcli network nic up -n vmnic2
    [root@gahar:~] ping 10.10.10.1
    PING 10.10.10.1 (10.10.10.1): 56 data bytes
    64 bytes from 10.10.10.1: icmp_seq=0 ttl=64 time=0.207 ms
    64 bytes from 10.10.10.1: icmp_seq=1 ttl=64 time=0.212 ms
    64 bytes from 10.10.10.1: icmp_seq=2 ttl=64 time=0.257 ms

    --- 10.10.10.1 ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0.207/0.225/0.257 ms
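Until the root cause is known, the manual steps above could be wrapped in a function and run periodically. This is only a stopgap sketch that repeats the same two `esxcli network nic` commands; `bounce_if_unreachable` and its arguments are hypothetical, and a single failed ping is assumed to mean the link is stuck, which may be too aggressive in practice.

```shell
#!/bin/sh
# Hypothetical helper: if the target does not answer one ping, take the
# uplink down and up again, exactly as done by hand above.
# Returns 0 if the target was reachable, 1 if the NIC had to be bounced.
bounce_if_unreachable() {
    target="$1"   # e.g. 10.10.10.1 (the NAS)
    nic="$2"      # e.g. vmnic2 (the uplink used in the workaround)

    # One probe only; any failure is treated as "link is stuck" (assumption).
    if ping -c 1 "$target" >/dev/null 2>&1; then
        return 0
    fi

    esxcli network nic down -n "$nic"
    esxcli network nic up -n "$nic"
    return 1
}

# Usage:
#   bounce_if_unreachable 10.10.10.1 vmnic2
```

This hides the symptom rather than fixing it, and the datastore may still drop briefly before the bounce takes effect.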

Given all of the above, I am looking for a solution. I have not been able to find the root cause.