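I have a Keepalived master/backup setup on AWS in front of HAProxy, with an EIP used as the VIP. Occasionally the backup server triggers a failover even though the MASTER node is healthy. The relevant logs are below.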
Backup server
Oct 10 04:14:32 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Transition to MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering MASTER STATE
Oct 10 04:14:33 Prod-WebAccessLb2 Keepalived_vrrp[2271]: Opening script file /etc/keepalived/master.sh
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Received advert with higher priority 200, ours 100
Oct 10 04:14:34 Prod-WebAccessLb2 Keepalived_vrrp[2271]: VRRP_Instance(ProdWebCluster) Entering BACKUP STATE
Master server
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
Oct 10 04:14:35 Prod-WebAccessLb1 Keepalived_vrrp[1311]: VRRP_Instance(ProdWebCluster) Received advert with lower priority 100, ours 200, forcing new election
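So, looking at the logs, we can say that right after the failover the MASTER node triggers a failback, but it does not run master.sh, and the VIP is left hanging, attached to neither node's active role.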
The MASTER configuration is below
vrrp_script chk_haproxy {
    script "/bin/pidof haproxy"
    interval 1
}

vrrp_instance ProdWebCluster {
    debug 2
    interface eth0
    state MASTER
    virtual_router_id 33
    priority 151
    unicast_src_ip 10.186.2.10
    unicast_peer {
        10.186.6.10
    }
    authentication {
        auth_type PASS
        auth_pass xxx
    }
    track_script {
        chk_haproxy
    }
Backup server configuration
vrrp_instance ProdWebCluster {
    debug 2
    interface eth0
    state BACKUP
    virtual_router_id 33
    priority 100
    unicast_src_ip 10.186.6.10
    unicast_peer {
        10.186.2.10
    }
    authentication {
        auth_type PASS
        auth_pass xxxx
    }
    track_script {
        chk_haproxy
    }
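The excerpts above do not show how master.sh is hooked in. The log line "Opening script file /etc/keepalived/master.sh" suggests a notify hook exists, so the following is only a minimal sketch of how such a hook is typically declared, assuming the standard `notify_master` keyword. The `notify` line and the script `/etc/keepalived/notify.sh` are hypothetical additions for illustration, not part of my actual configuration.

```
vrrp_instance ProdWebCluster {
    # ... remaining settings as in the excerpts above ...

    # Assumption: this hook is not shown in the posted configuration.
    # Runs only when this instance transitions to the MASTER state.
    notify_master "/etc/keepalived/master.sh"

    # Hypothetical helper script, called on every state change.
    # Keepalived passes the type, the instance name, and the new state
    # as arguments, which makes it easy to log each transition.
    notify "/etc/keepalived/notify.sh"
}
```

A generic `notify` script of this kind could help confirm whether the master node ever records a state transition of its own during the failback, since `notify_master` only fires on an actual change into MASTER.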
Can anyone tell me why it triggers a failover in the first place? And why does it not run master.sh during the failback?
Note: when I run the master.sh script manually from the Backup / Master, it works as expected.