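I have a syslog-ng instance on 2.0.9 (yes, that is old, but... this is enterprise IT and upgrading versions is... interesting...) running on Solaris 10. I have a strange problem where some clients do not stay connected to the server over TCP.
When a client is working, I can start syslog-ng on the client; it connects, sends data, and stays connected...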
12:20:13.200547 IP (tos 0x0, ttl 64, id 13064, offset 0, flags [DF], proto: TCP (6), length: 60) 10.37.128.185.35765 > 10.37.141.31.shell: S, cksum 0xade4 (correct), 1572869826:1572869826(0) win 5840 <mss 1460,sackOK,timestamp 958735818 0,nop,wscale 7>
12:20:13.202279 IP (tos 0x0, ttl 63, id 27707, offset 0, flags [DF], proto: TCP (6), length: 64) 10.37.141.31.shell > 10.37.128.185.35765: S, cksum 0x434d (correct), 3180100791:3180100791(0) ack 1572869827 win 32942 <nop,nop,timestamp 2210148518 958735818,mss 1460,nop,wscale 2,nop,nop,sackOK>
12:20:13.202327 IP (tos 0x0, ttl 64, id 13065, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.128.185.35765 > 10.37.141.31.shell: ., cksum 0x0499 (correct), ack 1 win 46 <nop,nop,timestamp 958735820 2210148518>
12:20:13.202823 IP (tos 0x0, ttl 64, id 13066, offset 0, flags [DF], proto: TCP (6), length: 140) 10.37.128.185.35765 > 10.37.141.31.shell: P, cksum 0x179d (correct), 1:89(88) ack 1 win 46 <nop,nop,timestamp 958735820 2210148518>
12:20:13.204061 IP (tos 0x0, ttl 63, id 27708, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.141.31.shell > 10.37.128.185.35765: ., cksum 0x83d6 (correct), ack 89 win 32920 <nop,nop,timestamp 2210148518 958735820>
12:20:13.205558 IP (tos 0x0, ttl 64, id 13067, offset 0, flags [DF], proto: TCP (6), length: 124) 10.37.128.185.35765 > 10.37.141.31.shell: P, cksum 0xc071 (correct), 89:161(72) ack 1 win 46 <nop,nop,timestamp 958735823 2210148518>
12:20:13.206247 IP (tos 0x0, ttl 63, id 27709, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.141.31.shell > 10.37.128.185.35765: ., cksum 0x839d (correct), ack 161 win 32902 <nop,nop,timestamp 2210148518 958735823>
When a client cannot stay connected, I see the server disconnect immediately with a FIN...
12:20:02.441949 IP (tos 0x10, ttl 64, id 8231, offset 0, flags [DF], proto: TCP (6), length: 60) 10.37.128.185.46121 > 10.37.141.31.shell: S, cksum 0xeb7e (correct), 1553390564:1553390564(0) win 5840 <mss 1460,sackOK,timestamp 958725059 0,nop,wscale 7>
12:20:02.443817 IP (tos 0x0, ttl 63, id 27678, offset 0, flags [DF], proto: TCP (6), length: 64) 10.37.141.31.shell > 10.37.128.185.46121: S, cksum 0xe379 (correct), 3007391908:3007391908(0) ack 1553390565 win 32942 <nop,nop,timestamp 2210147442 958725059,mss 1460,nop,wscale 2,nop,nop,sackOK>
12:20:02.443840 IP (tos 0x10, ttl 64, id 8232, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.128.185.46121 > 10.37.141.31.shell: ., cksum 0xa4c5 (correct), ack 1 win 46 <nop,nop,timestamp 958725061 2210147442>
12:20:02.445689 IP (tos 0x0, ttl 63, id 27679, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.141.31.shell > 10.37.128.185.46121: F, cksum 0x2444 (correct), 1:1(0) ack 1 win 32942 <nop,nop,timestamp 2210147442 958725061>
12:20:02.445737 IP (tos 0x10, ttl 64, id 8233, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.128.185.46121 > 10.37.141.31.shell: F, cksum 0xa4c1 (correct), 1:1(0) ack 2 win 46 <nop,nop,timestamp 958725063 2210147442>
12:20:02.447244 IP (tos 0x0, ttl 63, id 27680, offset 0, flags [DF], proto: TCP (6), length: 52) 10.37.141.31.shell > 10.37.128.185.46121: ., cksum 0x2441 (correct), ack 2 win 32942 <nop,nop,timestamp 2210147442 958725063>
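For anyone comparing captures like these two, here is a rough Python sketch that pulls the TCP flags out of this old-style tcpdump text and reports whether the server sent a FIN before any data (PSH) packet appeared. The function names `tcp_flags` and `closed_before_data` are made up for illustration, and the regex assumes the "src > dst: FLAGS, cksum ..." layout shown above; other tcpdump versions print flags differently.

```python
import re

# Assumes the old-style tcpdump layout seen in the captures above:
#   "<src> > <dst>: <flags>, cksum ..."
PACKET_RE = re.compile(r"(\S+) > (\S+): ([SFPRWE.]+),")


def tcp_flags(capture_text):
    """Return a list of (src, dst, flags) tuples, one per packet found."""
    return PACKET_RE.findall(capture_text)


def closed_before_data(capture_text, server):
    """True if `server` sent a FIN before any PSH packet was seen."""
    for src, _dst, flags in tcp_flags(capture_text):
        if "P" in flags:
            return False  # data flowed first, so this is not the failure case
        if "F" in flags and src == server:
            return True
    return False
```

Fed the second capture with `server="10.37.141.31.shell"`, `closed_before_data` should return True; for the first capture it should return False, since a PSH packet appears before any FIN.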
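Now, this problem was originally thought to affect only certain clients, but in this case both captures come from the same box. I generated the successful capture by restarting the client's syslog-ng service, and the unsuccessful one with a telnet to the server port.
I also started a new instance of the syslog-ng server on another port. On localhost, a telnet to 514 connects and is immediately disconnected...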
$ telnet localhost 514
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection to localhost closed by foreign host
But on the other port, in the new process, we get a good connection.
$ telnet localhost 1140
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
^]
telnet> quit
Connection to localhost closed.
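The telnet check can also be scripted, which helps when testing several hosts or ports. This is only a sketch: `probe` and its return strings are hypothetical names, and it assumes the listener sends nothing on connect (true for a syslog receiver), so a short wait with no data and no FIN is read as "the connection was kept".

```python
import socket


def probe(host, port, wait=1.0):
    """Connect and report 'refused', 'closed_by_peer', or 'open'."""
    try:
        sock = socket.create_connection((host, port), timeout=wait)
    except OSError:
        return "refused"  # refused, unreachable, or connect timed out
    with sock:
        sock.settimeout(wait)
        try:
            data = sock.recv(1)
        except socket.timeout:
            return "open"  # no data and no FIN within `wait` seconds
        except OSError:
            return "closed_by_peer"  # connection reset
        # An empty read means the peer sent FIN right after the handshake.
        return "closed_by_peer" if data == b"" else "open"
```

With the behavior described above, `probe("localhost", 514)` would be expected to report `closed_by_peer` and `probe("localhost", 1140)` to report `open`.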
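So something in syslog-ng or Solaris 10 seems to take a dislike to some of these connections after the process has been running for an undetermined amount of time. It is linked with tcpwrappers, and "syslog-ng: ALL" is defined in hosts.allow. The behavior I see looks similar to what happens when tcpwrappers blocks a connection, but I think this is part of the failure on the system, since it seems to be universal.
The behavior for "localhost to the new process" looks the same as for the remote connection, so it does not look like a firewall getting in the way and doing something odd. I'm lost.
Any guesses or pointers appreciated!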
Check the max-connections setting in syslog-ng.conf; it defaults to 10, which may be too low for you.
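One way to check whether a connection limit is what you are hitting is to open connections one at a time, hold them, and see how many the server keeps before it starts closing new ones straight away. A rough Python sketch follows; `count_accepted_connections` is a hypothetical helper, and it assumes the server sends nothing on connect. Run it against a test instance rather than the production port, since while it runs it takes up connection slots that real clients would otherwise use.

```python
import socket


def count_accepted_connections(host, port, max_tries=50, wait=0.5):
    """Open connections one by one and hold them; return how many stayed
    open before the server refused or closed one (max_tries if none did)."""
    held = []
    try:
        for _ in range(max_tries):
            try:
                sock = socket.create_connection((host, port), timeout=wait)
            except OSError:
                break  # refused or unreachable
            sock.settimeout(wait)
            try:
                if sock.recv(1) == b"":
                    sock.close()
                    break  # peer sent FIN right after the handshake
            except socket.timeout:
                pass  # no data and no FIN: the server is keeping it
            except OSError:
                sock.close()
                break  # reset by peer
            held.append(sock)
        return len(held)
    finally:
        for s in held:
            s.close()
```

If a fresh instance reports a number close to 10, that is at least consistent with the default limit. If the long-running instance on 514 reports 0, its slots may already be taken by existing clients, which would match the immediate FIN seen in the capture.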