IP failover with two nodes on different subnets: cannot ping the virtual IP from the second node?

I am setting up a redundant failover Redmine:

A second instance was installed on the second server without problems.
MySQL (running on the same machine as Redmine) is configured for master-master replication.

Because the nodes are on different subnets (192.168.3.x and 192.168.6.x), it seems that VIPArip is the only choice.

/etc/ha.d/ha.cf on node1:

    logfacility none
    debug 1
    debugfile /var/log/ha-debug
    logfile /var/log/ha-log
    autojoin none
    warntime 3
    deadtime 6
    initdead 60
    udpport 694
    ucast eth1 node2.ip
    keepalive 1
    node node1
    node node2
    crm respawn

/etc/ha.d/ha.cf on node2:

    logfacility none
    debug 1
    debugfile /var/log/ha-debug
    logfile /var/log/ha-log
    autojoin none
    warntime 3
    deadtime 6
    initdead 60
    udpport 694
    ucast eth0 node1.ip
    keepalive 1
    node node1
    node node2
    crm respawn

crm configure show:

    node $id="6c27077e-d718-4c82-b307-7dccaa027a72" node1
    node $id="740d0726-e91d-40ed-9dc0-2368214a1f56" node2
    primitive VIPArip ocf:heartbeat:VIPArip \
        params ip="192.168.6.8" nic="lo:0" \
        op start interval="0" timeout="20s" \
        op monitor interval="5s" timeout="20s" depth="0" \
        op stop interval="0" timeout="20s" \
        meta is-managed="true"
    property $id="cib-bootstrap-options" \
        stonith-enabled="false" \
        dc-version="1.0.12-unknown" \
        cluster-infrastructure="Heartbeat" \
        last-lrm-refresh="1338870303"

crm_mon -1:

    ============
    Last updated: Tue Jun 5 18:36:42 2012
    Stack: Heartbeat
    Current DC: node2 (740d0726-e91d-40ed-9dc0-2368214a1f56) - partition with quorum
    Version: 1.0.12-unknown
    2 Nodes configured, unknown expected votes
    1 Resources configured.
    ============

    Online: [ node1 node2 ]

    VIPArip (ocf::heartbeat:VIPArip): Started node1

ip addr show lo:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet 192.168.6.8/32 scope global lo
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever

I can ping 192.168.6.8 from node1 (192.168.3.x):

    # ping -c 4 192.168.6.8
    PING 192.168.6.8 (192.168.6.8) 56(84) bytes of data.
    64 bytes from 192.168.6.8: icmp_seq=1 ttl=64 time=0.062 ms
    64 bytes from 192.168.6.8: icmp_seq=2 ttl=64 time=0.046 ms
    64 bytes from 192.168.6.8: icmp_seq=3 ttl=64 time=0.059 ms
    64 bytes from 192.168.6.8: icmp_seq=4 ttl=64 time=0.071 ms

    --- 192.168.6.8 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3000ms
    rtt min/avg/max/mdev = 0.046/0.059/0.071/0.011 ms

But I cannot ping the virtual IP from node2 (192.168.6.x) or from outside. What am I missing?

PS: you may want to set IP2UTIL=/sbin/ip in the /usr/lib/ocf/resource.d/heartbeat/VIPArip resource agent script if you get something like this:

    Jun 5 11:08:10 node1 lrmd: [19832]: info: RA output: (VIPArip:stop:stderr) 2012/06/05_11:08:10 ERROR: Invalid OCF_RESKEY_ip [192.168.6.8]

http://www.clusterlabs.org/wiki/Debugging_Resource_Failures

Reply to @DukeLion:

Which router receives the RIP updates?

When I start the VIPArip resource, ripd runs with the configuration file below (on node1):

/var/run/resource-agents/VIPArip-ripd.conf:

    hostname ripd
    password zebra
    debug rip events
    debug rip packet
    debug rip zebra
    log file /var/log/quagga/quagga.log
    router rip
    !nic_tag
     no passive-interface lo:0
     network lo:0
     distribute-list private out lo:0
     distribute-list private in lo:0
    !metric_tag
     redistribute connected metric 3
    !ip_tag
    access-list private permit 192.168.6.8/32
    access-list private deny any

show ip route:

    Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
           I - IS-IS, B - BGP, A - Babel,
           > - selected route, * - FIB route

    K>* 0.0.0.0/0 via 192.168.3.1, eth1
    C>* 127.0.0.0/8 is directly connected, lo
    K>* 169.254.0.0/16 is directly connected, eth1
    C>* 192.168.3.0/24 is directly connected, eth1
    C>* 192.168.6.8/32 is directly connected, lo

sh ip rip status:

    Routing Protocol is "rip"
      Sending updates every 30 seconds with +/-50%, next due in 7 seconds
      Timeout after 180 seconds, garbage collect after 120 seconds
      Outgoing update filter list for all interface is not set
        lo:0 filtered by private
      Incoming update filter list for all interface is not set
        lo:0 filtered by private
      Default redistribution metric is 1
      Redistributing: connected
      Default version control: send version 2, receive any version
        Interface        Send  Recv   Key-chain
      Routing for Networks:
        lo:0
      Routing Information Sources:
        Gateway          BadPackets BadRoutes  Distance Last Update
      Distance: (default is 120)

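I think the problem is not in the cluster configuration but in your routing architecture.

The VIPArip resource agent manages a local quagga instance that sends routing updates. But you also need a router that uses these updates to change its routes so that they point to the active server. I will try to explain how it works.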

RIP HA

See the picture. HA1 and HA2 are linux-ha cluster members running quagga. The blue router listens for RIP on both network links.

When the VIP comes up on HA1, quagga sends RIP updates to the blue router, which adds the VIP prefix to its routing table with next hop 192.168.1.2.
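
As a rough illustration of what adding the VIP prefix means for forwarding, here is a minimal Python sketch of longest-prefix matching. The table contents are assumptions: the VIP 192.168.6.8/32 is taken from the question, the next hop 192.168.1.2 from the description above, and the default gateway 192.168.1.254 is purely hypothetical. It does not reflect how any particular router implements its lookup.

```python
import ipaddress

def lookup(table, destination):
    """Return the next hop of the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # Longest prefix wins: a /32 host route beats a default route.
    return max(matches, key=lambda item: item[0].prefixlen)[1]

# Hypothetical table of the blue router before any RIP update arrives.
table = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.254"),  # assumed default gateway
]
before = lookup(table, "192.168.6.8")

# A RIP update from HA1 installs a host route for the VIP.
table.append((ipaddress.ip_network("192.168.6.8/32"), "192.168.1.2"))
after = lookup(table, "192.168.6.8")

print(before, after)
```

Until the router learns the /32 route, traffic for the VIP follows whatever less specific route happens to match, which is why a VIP that is only configured on a loopback interface is not reachable from elsewhere.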

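When a failover happens, the VIP is stopped on HA1 and quagga is stopped completely, so no more updates are sent. The blue router removes the entry from its routing table after a timeout, even if the VIP does not come up on HA2. When the VIP starts on HA2, quagga is started there and sends RIP updates. The blue router then adds an entry to its routing table with next hop 192.168.2.2.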
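
The failover sequence can be sketched as a toy model in Python. This is a simplification under stated assumptions, not real RIP: it only tracks a per-route timer, uses the 180-second timeout shown in the `sh ip rip status` output, ignores metrics and the garbage-collection phase, and the class and variable names are made up for illustration.

```python
RIP_TIMEOUT = 180  # seconds, as in "Timeout after 180 seconds"

class RouterTable:
    """Toy model of a router that learns routes from RIP updates."""

    def __init__(self):
        self.routes = {}  # prefix -> (next_hop, time of last update)

    def receive_update(self, prefix, next_hop, now):
        # Each update installs the route or refreshes its timer.
        self.routes[prefix] = (next_hop, now)

    def expire(self, now):
        # Drop routes that were not refreshed within the timeout.
        self.routes = {
            prefix: (hop, seen)
            for prefix, (hop, seen) in self.routes.items()
            if now - seen < RIP_TIMEOUT
        }

    def next_hop(self, prefix, now):
        self.expire(now)
        entry = self.routes.get(prefix)
        return entry[0] if entry else None


VIP = "192.168.6.8/32"  # VIP from the question, used as an example prefix
router = RouterTable()

# HA1 is active and advertises the VIP every 30 seconds until t=60.
for t in (0, 30, 60):
    router.receive_update(VIP, "192.168.1.2", now=t)
while_ha1_active = router.next_hop(VIP, now=90)

# HA1 fails and quagga stops; with no refresh the route times out.
after_timeout = router.next_hop(VIP, now=60 + RIP_TIMEOUT)

# HA2 starts the VIP and quagga, which advertises a new next hop.
router.receive_update(VIP, "192.168.2.2", now=300)
after_failover = router.next_hop(VIP, now=310)

print(while_ha1_active, after_timeout, after_failover)
```

The point of the model is that the router, not the cluster, decides where traffic for the VIP goes; if no router is listening for the updates, starting the resource changes nothing outside the active node.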

You can use VIPArip in more complex network topologies; just make sure that the border routers receive the routing updates across the whole network.