HBase problem connecting with *some* regionservers

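I have a working HBase cluster and I am trying to add some new servers to it, but "SocketException: Invalid argument" and "FailedServerException: This server is in the failed servers list" errors keep showing up in the logs of the new regionservers.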

2014-07-02 22:28:01,140 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
java.net.SocketException: Invalid argument
    at sun.nio.ch.Net.connect(Native Method)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:534)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:193)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:528)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:492)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:392)
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:438)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1141)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:988)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
    at com.sun.proxy.$Proxy10.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:141)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:2040)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2086)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:748)
    at java.lang.Thread.run(Thread.java:701)
2014-07-02 22:28:31,764 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to master. Retrying. Error was:
org.apache.hadoop.hbase.ipc.HBaseClient$FailedServerException: This server is in the failed servers list: <MY_MASTER_SERVER>/<MY_MASTER_NAME>:<MY_MASTER_PORT>
    at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:427)
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1141)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:988)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:87)
    at com.sun.proxy.$Proxy10.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:141)
    at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:2040)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.reportForDuty(HRegionServer.java:2086)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:748)
    at java.lang.Thread.run(Thread.java:701)

So far I have not been able to find any difference between the old servers and the new ones:

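  • Both run Ubuntu 12.04 with all the latest updates, and HBase from Cloudera's CDH4
  • Neither has an entry for the HBase master in /etc/hosts (I tried adding one on the new servers, but the problem stayed the same)
  • (Note: from the new servers I can telnet to port 60000, my HBase Master's port, without any error)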

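While debugging, I came across a mention online that the IPv6 configuration can cause this kind of problem, but as far as I can tell both the old and the new servers use Ubuntu's default configuration.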
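To compare the old and new servers at the socket level, independent of HBase, something like the following could be run on both. It is only a minimal sketch in Python, not part of HBase or CDH4, and the names `probe` and `FAMILY_NAMES` are made up for illustration. It asks the OS resolver for every address of the master host and attempts a TCP connect to each one, so an address family (IPv4 or IPv6) that fails on only some machines would stand out. It does not go through the JVM's own network stack, so a clean result here would not rule out a Java-side setting.

```python
import socket

# Human-readable labels for the address families we expect to see.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def probe(host, port, timeout=5.0):
    """Try a TCP connect to every address the resolver returns for host.

    Returns a list of (family, address, outcome) tuples, where outcome is
    "ok" or "failed: <error text>".
    """
    results = []
    # family=0 means "any family", so both IPv4 and IPv6 entries are returned.
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, 0, socket.SOCK_STREAM):
        name = FAMILY_NAMES.get(family, str(family))
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            outcome = "ok"
        except socket.error as exc:
            # Keep the error text: "Invalid argument" here would match the
            # SocketException seen in the regionserver log.
            outcome = "failed: %s" % exc
        finally:
            if sock is not None:
                sock.close()
        results.append((name, sockaddr[0], outcome))
    return results


# Example call (replace the placeholder with the real master host name):
# for row in probe("<MY_MASTER_NAME>", 60000):
#     print(row)
```

If the output differs between an old and a new server, for example an extra IPv6 entry or a different IPv4 address for the master, that difference would be the next thing to look into.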

Any ideas on how to debug this further, and/or what the problem might be?