在activemq.log ,我每2秒就会看到这个exception:
2013-11-05 13:00:52,374 | DEBUG | Transport Connection to: tcp://127.0.0.1:37501 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501 at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282) at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271) at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85) at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104) at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312) at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838) at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662)
由于这个关键字InactivityIOException ,我想到的第一件事是InactivityMonitor ,但奇怪的是MaxInactivityDuration=30000 :
2013-11-05 13:11:02,672 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-2
而且,我也没有看到这样的事情:
自上次阅读以来没有收到任何消息…
要么:
频道太长(> 30000)无效
做一个netstat ,我看到这些连接在TIME_WAIT状态:
tcp 0 0 127.0.0.1:38545 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:38544 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:38522 127.0.0.1:61616 TIME_WAIT -
请注意,“PID /程序名称”列仅包含破折号字符。
这里是运行tcpdump时的输出:
Internet Protocol Version 4, Src: 127.0.0.1 (127.0.0.1), Dst: 127.0.0.1 (127.0.0.1) Version: 4 Header length: 20 bytes Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport)) 0000 00.. = Differentiated Services Codepoint: Default (0x00) .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00) Total Length: 296 Identification: 0x7b6a (31594) Flags: 0x02 (Don't Fragment) 0... .... = Reserved bit: Not set .1.. .... = Don't fragment: Set ..0. .... = More fragments: Not set Fragment offset: 0 Time to live: 64 Protocol: TCP (6) Header checksum: 0xc063 [correct] [Good: True] [Bad: False] Source: 127.0.0.1 (127.0.0.1) Destination: 127.0.0.1 (127.0.0.1) Transmission Control Protocol, Src Port: 61616 (61616), Dst Port: 54669 (54669), Seq: 1, Ack: 2, Len: 244 Source port: 61616 (61616) Destination port: 54669 (54669) [Stream index: 11] Sequence number: 1 (relative sequence number) [Next sequence number: 245 (relative sequence number)] Acknowledgement number: 2 (relative ack number) Header length: 32 bytes Flags: 0x018 (PSH, ACK) 000. .... .... = Reserved: Not set ...0 .... .... = Nonce: Not set .... 0... .... = Congestion Window Reduced (CWR): Not set .... .0.. .... = ECN-Echo: Not set .... ..0. .... = Urgent: Not set .... ...1 .... = Acknowledgement: Set .... .... 1... = Push: Set .... .... .0.. = Reset: Not set .... .... ..0. = Syn: Not set .... .... ...0 = Fin: Not set Window size value: 256 [Calculated window size: 32768] [Window size scaling factor: 128] Checksum: 0xff1c [validation disabled] [Good Checksum: False] [Bad Checksum: False] Options: (12 bytes) No-Operation (NOP) No-Operation (NOP) Timestamps: TSval 2304161892, TSecr 2304161891 Kind: Timestamp (8) Length: 10 Timestamp value: 2304161892 Timestamp echo reply: 2304161891 [SEQ/ACK analysis] [Bytes in flight: 244] Constrained Application Protocol, TID: 240, Length: 244 00.. .... = Version: 0 ..00 .... = Type: Confirmable (0) .... 0000 = Option Count: 0 Code: Unknown (0) Transaction ID: 240 Payload Content-Type: text/plain (default), Length: 240, offset: 4 Line-based text data: text/plain [truncated] \001ActiveMQ\000\000\000\t\001\000\000\000<DE>\000\000\000\t\000\fMaxFrameSize\006\177<FF><FF><FF><FF> <FF><FF><FF>\000\tCacheSize\005\000\000\004\000\000\fCacheEnabled\001\001\000\022SizePrefixDisabled\001\000\000 MaxInactivityDurationInitalDelay\006\
这很可能是一个TCP端口检查 。 这是我从其他主机尝试telnet时看到的内容:
2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///10.8.20.9:46775@61616 java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:275) at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:229) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:221) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:204) at java.lang.Thread.run(Thread.java:662) 2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282) at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271) at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85) at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104) at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312) at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838) at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2013-11-05 16:12:41,071 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=localhost,Type=Connection,ConnectorName=ope nwire,ViewType=address,Name=tcp_//10.8.20.9_46775 | org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:/ //10.8.20.9:46775@61616 2013-11-05 16:12:41,073 | DEBUG | Stopping connection: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,073 | DEBUG | Stopping transport tcp:///10.8.20.9:46775@61616 | org.apache.activemq.transport.tcp.TcpTranspo rt | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,073 | DEBUG | Initialized TaskRunnerFactory[ActiveMQ Task] using ExecutorService: java.util.concurrent.Threa dPoolExecutor@23cc2a28 | org.apache.activemq.thread.TaskRunnerFactory | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Closed socket Socket[addr=/10.8.20.9,port=46775,localport=61616] | org.apache.activemq.transpo rt.tcp.TcpTransport | ActiveMQ Task-1 2013-11-05 16:12:41,074 | DEBUG | Forcing shutdown of ExecutorService: java.util.concurrent.ThreadPoolExecutor@23cc2a28 | org.apache.activemq.util.ThreadPoolUtils | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Stopped transport: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Connection Stopped: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,902 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-5
所以问题是:我怎样才能find正在尝试连接到我的ActiveMQ(从本地主机)每2秒?
更新Thu Nov 7 10:44:09 ICT 2013
我的同事曾经跟我说过这个罪魁祸首,这是…… keepalived :
vrrp_script chk_SC_ACTIVEMQ { script "</dev/tcp/127.0.0.1/61616" interval 2 # check every 2 seconds weight 2 # add 2 points of prio if OK }
但是我的问题仍然存在:假设我知道有人(在本地主机上)每隔2秒打开一次到我的ActiveMQ的TCP连接:
tcp 0 0 127.0.0.1:59408 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:59398 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:59399 127.0.0.1:61616 TIME_WAIT -
如何找出犯罪者? 我只是想了一个办法,就是扫描/etc (甚至整个系统)找出所有包含“61616”的文件:
# grep -lr 61616 /etc
有没有更好的办法?