We have set up three servers running keepalived. We have started to notice random re-elections that we cannot explain, so I am here asking for advice.
Here is our configuration:
MASTER:
    global_defs {
        notification_email {
            [email protected]
        }
        notification_email_from keepalived@hostname
        smtp_server example.com:587
        smtp_connect_timeout 30
        router_id some_rate
    }

    vrrp_script chk_nginx {
        script "killall -0 nginx"
        interval 2
        weight 2
    }

    vrrp_instance VIP_61 {
        interface bond0
        virtual_router_id 61
        state MASTER
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass PASSWORD
        }
        virtual_ipaddress {
            XXXX
            XXXX
            XXXX
        }
        track_script {
            chk_nginx
        }
    }
BACKUP1:
    global_defs {
        notification_email {
            [email protected]
        }
        notification_email_from keepalived@hostname
        smtp_server example.com:587
        smtp_connect_timeout 30
        router_id some_rate
    }

    vrrp_script chk_nginx {
        script "killall -0 nginx"
        interval 2
        weight 2
    }

    vrrp_instance VIP_61 {
        interface bond0
        virtual_router_id 61
        state MASTER
        priority 99
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass PASSWORD
        }
        virtual_ipaddress {
            XXXX
            XXXX
            XXXX
        }
        track_script {
            chk_nginx
        }
    }
BACKUP2:
    global_defs {
        notification_email {
            [email protected]
        }
        notification_email_from keepalived@hostname
        smtp_server example.com:587
        smtp_connect_timeout 30
        router_id some_rate
    }

    vrrp_script chk_nginx {
        script "killall -0 nginx"
        interval 2
        weight 2
    }

    vrrp_instance VIP_61 {
        interface bond0
        virtual_router_id 61
        state MASTER
        priority 98
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass PASSWORD
        }
        virtual_ipaddress {
            XXXX
            XXXX
            XXXX
        }
        track_script {
            chk_nginx
        }
    }
Every now and then I can see the following happening (taken from the logs):
MASTER:
    Jan 6 18:30:15 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
    Jan 6 18:30:16 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
    Jan 6 18:32:37 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
BACKUP1:
    Jan 6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
    Jan 6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
    Jan 6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
    Jan 6 18:32:37 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) forcing a new MASTER election
    Jan 6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
    Jan 6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
    Jan 6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
BACKUP2:
    Jan 6 18:32:36 lb-public03 Keepalived_vrrp[14255]: VRRP_Script(chk_nginx) succeeded
    Jan 6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Transition to MASTER STATE
    Jan 6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Received higher prio advert
    Jan 6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Entering BACKUP STATE
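So the MASTER receives a lower-priority advert and a new election starts. Why? Based on the logs, it looks like a BACKUP transitions to MASTER for a short moment and then falls back to the BACKUP state. I have no idea why this is happening, so any hints would be more than welcome.
Also, I found that there is a unicast patch for keepalived, but it is not clear to me whether it supports multiple unicast peers. In our case we have a cluster of three machines, so we need more than one unicast peer.
Any hints on these issues would be highly appreciated!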
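The problem is that you are using the initial state MASTER on the backup nodes as well. The backup nodes should be configured with state BACKUP instead.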
    vrrp_instance VIP_61 {
        interface bond0
        virtual_router_id 61
        state BACKUP
        priority 98
        ...
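For completeness, here is a sketch of how the whole instance block on the lowest-priority node might look with this change. It assumes every setting other than `state` is carried over unchanged from the configuration in the question; `PASSWORD` and `XXXX` are the question's own placeholders, not real values.

```
# Sketch only: every value except `state` is copied from the question's config.
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    # `state` only sets the role keepalived assumes at startup;
    # the election itself is still decided by priority.
    state BACKUP
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        XXXX
        XXXX
        XXXX
    }
    track_script {
        chk_nginx
    }
}
```

The node with priority 99 would get the same block with only the `priority` line differing, and the node with priority 100 keeps `state MASTER`.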
Hope this solves your mystery.