Why is a TCP packet retransmitted 7 times when the sysctl tcp_retries1 is set to 3?

Ubuntu 12.04

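I am trying to better understand how TCP behaves when it retransmits a packet that the destination never acknowledges. After reading the tcp man page, it seemed clear that this is controlled by the sysctl tcp_retries1: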

tcp_retries1 (integer; default: 3) The number of times TCP will attempt to retransmit a packet on an established connection normally, without the extra effort of getting the network layers involved. Once we exceed this number of retransmits, we first have the network layer update the route if possible before each new retransmit. The default is the RFC specified minimum of 3.

My system is set to the default value of 3:

    # cat /proc/sys/net/ipv4/tcp_retries1
    3

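To test this, I connected via ssh from system A (172.16.249.138) to system B (172.16.249.137) and started a simple print loop on the console. Then, while communication was in progress, I abruptly disconnected B from the network.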

In another terminal, I ran "tcpdump host 172.16.249.137" on system A. Below are the relevant lines of the output (line numbers added for clarity).

    00: ...
    01: 13:29:46.994715 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 80, options [nop,nop,TS val 1957286 ecr 4294962520], length 0
    02: 13:29:46.995084 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 186, options [nop,nop,TS val 1957286 ecr 4294962520], length 0
    03: 13:29:47.040360 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 186, options [nop,nop,TS val 1957298 ecr 4294962520], length 48
    04: 13:29:47.086552 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [.], ack 5989441, win 376, options [nop,nop,TS val 1957309 ecr 4294962520], length 0
    05: 13:29:47.680608 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1957458 ecr 4294962520], length 48
    06: 13:29:48.963721 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1957779 ecr 4294962520], length 48
    07: 13:29:51.528564 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1958420 ecr 4294962520], length 48
    08: 13:29:56.664384 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1959704 ecr 4294962520], length 48
    09: 13:30:06.936480 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1962272 ecr 4294962520], length 48
    10: 13:30:27.480381 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1967408 ecr 4294962520], length 48
    11: 13:31:08.504033 IP 172.16.249.138.50489 > 172.16.249.137.ssh: Flags [P.], seq 29136:29184, ack 5989441, win 376, options [nop,nop,TS val 1977664 ecr 4294962520], length 48
    12: 13:31:13.512437 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28
    13: 13:31:14.512336 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28
    14: 13:31:15.512241 ARP, Request who-has 172.16.249.137 tell 172.16.249.138, length 28

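If I am interpreting this correctly (and I may not be), the packet on line 3 is never acknowledged by system B. It is then retransmitted 7 times (lines 5-11), with the retransmission timer increasing each time.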
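The growth of the retransmission timer can be checked directly from the capture. The following is a minimal Python sketch that computes the gaps between the original send (line 3) and each retransmission (lines 5-11); the timestamps are copied from the tcpdump output, and the variable names are only illustrative.

```python
from datetime import datetime

# Timestamps of the seq 29136:29184 segment, copied from the tcpdump
# output: line 03 (original send) followed by lines 05-11 (retransmissions).
timestamps = [
    "13:29:47.040360",
    "13:29:47.680608",
    "13:29:48.963721",
    "13:29:51.528564",
    "13:29:56.664384",
    "13:30:06.936480",
    "13:30:27.480381",
    "13:31:08.504033",
]

times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]

# Seconds between each transmission and the one before it.
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# Ratio of each gap to the previous gap.
ratios = [b / a for a, b in zip(gaps, gaps[1:])]

for i, gap in enumerate(gaps, start=1):
    print(f"retransmission {i}: {gap:.3f} s after the previous send")
print("gap ratios:", [round(r, 2) for r in ratios])
```

In this capture the first gap is about 0.64 s and each later gap is roughly twice the previous one, so the timer appears to be backing off exponentially.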

Why is the packet retransmitted 7 times instead of 3?

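Note: I have since found several pcap files in which 6-7 retransmissions occurred over HTTP connections, so the number of retransmissions does not appear to be specific to SSH.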

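I believe you created an orphaned socket by killing the connection on the .137 server. The kernel parameter in use is therefore tcp_orphan_retries, which has a generic Linux default of 7.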

You can find a description of the condition you created and its outcome here: http://www.linuxinsight.com/proc_sys_net_ipv4_tcp_orphan_retries.html
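To compare the two settings on a given machine, both values can be read from /proc. This is a minimal Python sketch, assuming the usual Linux layout under /proc/sys/net/ipv4; the function name and the `base` parameter are hypothetical and exist only for illustration. A value that cannot be read is reported as None.

```python
from pathlib import Path

# Assumed location of the IPv4 sysctls on Linux.
SYSCTL_DIR = Path("/proc/sys/net/ipv4")
NAMES = ("tcp_retries1", "tcp_orphan_retries")


def read_tcp_retry_sysctls(base=SYSCTL_DIR):
    """Return {sysctl name: integer value, or None if it cannot be read}."""
    values = {}
    for name in NAMES:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # File missing (e.g. not Linux), unreadable, or not an integer.
            values[name] = None
    return values


for name, value in read_tcp_retry_sysctls().items():
    print(f"{name} = {value}")
```

The values printed are whatever the running kernel reports, which may differ from the defaults quoted in documentation.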