Piranha/pulse, lvs.cf with persistence and server failure

We have the following setup:

  • RedHat 6
  • LVS set up to fail over between two web servers
  • Connections are persistent for 900 seconds

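It's a pretty simple setup, but when a server is marked as failed, the piranha/pulse/nanny process sets that server's weight in the table to 0 and does not remove the failed server. This means that any persistent connections stay pinned to the failed server, and the load balancing is defeated.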
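To confirm this behaviour on the director, the IPVS table can be checked for real servers whose weight is 0. A minimal sketch, assuming the usual column layout of `ipvsadm -L -n` (`-> RemoteAddress:Port Forward Weight ActiveConn InActConn`); the function name `list_quiesced` is made up for illustration:

```shell
#!/bin/sh
# list_quiesced: read `ipvsadm -L -n` output on stdin and print the
# address:port of every real server whose weight is 0.
# Assumes real-server rows look like:
#   -> 10.1.1.51:80   Masq   0   0   0
list_quiesced() {
    awk '$1 == "->" && $4 ~ /^[0-9]+$/ && $4 == 0 { print $2 }'
}
```

Run as root on the director: `ipvsadm -L -n | list_quiesced`.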

How can we tell nanny to force out the failed node, so that persistent connections fail over to a working node?

Thanks


We have the following lvs.cf:

serial_no = 201305302344
primary = 10.1.1.45
service = lvs
backup = 0.0.0.0
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 18
network = nat
nat_router = 10.1.1.70 eth0:1
nat_nmask = 255.255.255.0
debug_level = NONE
virtual http {
     active = 1
     address = 10.1.1.70 eth0:1
     vip_nmask = 255.255.255.0
     persistent = 900
     pmask = 255.255.255.0
     port = 80
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP/1.1 200 OK"
     use_regex = 0
     load_monitor = none
     scheduler = wlc
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 1
     server web1 {
         address = 10.1.1.51
         active = 1
         weight = 1
     }
     server web2 {
         address = 10.1.1.52
         active = 1
         weight = 1
     }
}
virtual https {
     active = 1
     address = 10.1.1.70 eth0:1
     vip_nmask = 255.255.255.0
     port = 443
     persistent = 900
     pmask = 255.255.255.0
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "up"
     use_regex = 0
     load_monitor = none
     scheduler = wlc
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 1
     server web1 {
         address = 10.1.1.51
         active = 1
         weight = 1
     }
     server web2 {
         address = 10.1.1.52
         active = 1
         weight = 1
     }
}

Try echo 1 > /proc/sys/net/ipv4/vs/expire_quiescent_template

More details here:

http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.persistent_connection.html
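With this flag set to a non-zero value, IPVS expires persistence templates whose destination server is quiescent (weight 0), so returning clients get scheduled to another server. The file only exists once the `ip_vs` module is loaded, and a plain `echo` does not survive a reboot. A minimal sketch of a guarded version, assuming the standard sysctl directory; `VS_DIR` and `enable_expire_quiescent` are illustrative names, not part of any tool:

```shell
#!/bin/sh
# Directory holding the IPVS sysctls; overridable so the function can be
# tried against a scratch directory (VS_DIR is an illustrative name).
VS_DIR="${VS_DIR:-/proc/sys/net/ipv4/vs}"

# enable_expire_quiescent: write 1 to expire_quiescent_template.
# Returns non-zero if the file is missing (e.g. ip_vs not loaded).
enable_expire_quiescent() {
    f="$VS_DIR/expire_quiescent_template"
    if [ ! -e "$f" ]; then
        echo "missing $f (is the ip_vs module loaded?)" >&2
        return 1
    fi
    echo 1 > "$f"
}
```

Call `enable_expire_quiescent` as root on the director. To keep the setting across reboots, the equivalent line `net.ipv4.vs.expire_quiescent_template = 1` can go in /etc/sysctl.conf, provided `ip_vs` is already loaded when the sysctl settings are applied.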

You have to trigger a failure/recovery script on the director that removes or re-adds the failed server.

I used lvs-kiss for this; it has a syntax for including scripts for these cases.
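For illustration, here is roughly what such a failure/recovery pair could look like with plain `ipvsadm` (this is not lvs-kiss syntax). It is only a sketch, assuming the NAT setup from the question (hence `-m`) and a default weight of 1; the function names and the `IPVSADM` variable are made up:

```shell
#!/bin/sh
# Command used to edit the IPVS table; overridable (e.g. IPVSADM=echo)
# for a dry run. The variable name is illustrative.
IPVSADM="${IPVSADM:-ipvsadm}"

# remove_real VIP:PORT RIP:PORT
# Delete a failed real server from the virtual service.
remove_real() {
    "$IPVSADM" -d -t "$1" -r "$2"
}

# add_real VIP:PORT RIP:PORT [WEIGHT]
# Re-add a recovered real server. -m selects masquerading (NAT)
# forwarding, matching network = nat in the question's lvs.cf.
add_real() {
    "$IPVSADM" -a -t "$1" -r "$2" -m -w "${3:-1}"
}
```

For example, `remove_real 10.1.1.70:80 10.1.1.51:80` on failure and `add_real 10.1.1.70:80 10.1.1.51:80` on recovery. Setting `IPVSADM=echo` first prints the commands instead of running them.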