Why does my instance fail to come into service when I add it to a load balancer via Ansible?

I am trying to add an EC2 instance to an Elastic Load Balancer from an Ansible playbook, using the ec2_elb module. This is the task that should do it:

 - name: "Add host to load balancer {{ load_balancer_name }}"
   sudo: false
   local_action:
     module: ec2_elb
     state: present
     wait: true
     region: "{{ region }}"
     ec2_elbs: ['{{ load_balancer_name }}']
     instance_id: "{{ ec2_id }}"

However, it usually fails, with this (verbose) output:

 TASK: [Add host to load balancer ApiELB-staging] ******************************
 <127.0.0.1> REMOTE_MODULE ec2_elb region=us-east-1 state=present instance_id=i-eb7e0cc7
 <127.0.0.1> EXEC ['/bin/sh', '-c', 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1409156786.81-113716163813868 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1409156786.81-113716163813868 && echo $HOME/.ansible/tmp/ansible-tmp-1409156786.81-113716163813868']
 <127.0.0.1> PUT /var/folders/d4/17fw96k107d5kbck6fb2__vc0000gn/T/tmpki4HPF TO /Users/pkaeding/.ansible/tmp/ansible-tmp-1409156786.81-113716163813868/ec2_elb
 <127.0.0.1> EXEC ['/bin/sh', '-c', u'LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /Users/pkaeding/.ansible/tmp/ansible-tmp-1409156786.81-113716163813868/ec2_elb; rm -rf /Users/pkaeding/.ansible/tmp/ansible-tmp-1409156786.81-113716163813868/ >/dev/null 2>&1']
 failed: [10.0.115.149 -> 127.0.0.1] => {"failed": true}
 msg: The instance i-eb7e0cc7 could not be put in service on LoadBalancer:ApiELB-staging. Reason: Instance has not passed the configured HealthyThreshold number of health checks consecutively.

 FATAL: all hosts have already failed -- aborting

My ELB configuration is defined like this (also via Ansible):

 - name: "Ensure load balancer exists: {{ load_balancer_name }}"
   sudo: false
   local_action:
     module: ec2_elb_lb
     name: "{{ load_balancer_name }}"
     state: present
     region: "{{ region }}"
     subnets: "{{ vpc_public_subnet_ids }}"
     listeners:
       - protocol: https
         load_balancer_port: 443
         instance_protocol: http
         instance_port: 8888
         ssl_certificate_id: "{{ ssl_cert }}"
     health_check:
       ping_protocol: http  # options are http, https, ssl, tcp
       ping_port: 8888
       ping_path: "/internal/v1/status"
       response_timeout: 5  # seconds
       interval: 30  # seconds
       unhealthy_threshold: 10
       healthy_threshold: 10
   register: apilb

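When I access the status resource from my laptop, or from the server itself (as localhost), I get the expected 200 response. I also added a command task to the Ansible playbook, before the task that adds the instance to the ELB, to confirm that the application is up and serving requests correctly: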

 - command: /usr/bin/curl -v --fail http://localhost:8888/internal/v1/status

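In my application's logs, I don't see any non-200 responses for the status check resource (but of course, if the requests never reached my application, they would not be logged).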

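The other strange thing is that the instance does get added to the ELB, and it seems to work correctly. So I know that, at least to some extent, the load balancer can reach the application properly (for the status check resource and for other resources). The AWS console shows the instance as healthy, and the CloudWatch graphs don't show any failed health checks.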

Any ideas?

Adapted from my earlier comment:

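Going by the Ansible documentation, there is a wait_timeout parameter, and you have to set it to a value higher than 300 for this to work (330 would be safe). With interval: 30 and healthy_threshold: 10, the instance needs 10 consecutive passing checks 30 seconds apart, so it cannot be reported as in service in less than about 300 seconds.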
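As a rough sketch, this is the task from the question with wait_timeout added. The value 330 is the one suggested above; everything else is copied unchanged from the question, so treat it as a starting point rather than a tested fix.

 - name: "Add host to load balancer {{ load_balancer_name }}"
   sudo: false
   local_action:
     module: ec2_elb
     state: present
     wait: true
     wait_timeout: 330  # seconds; must exceed interval * healthy_threshold (30 * 10 = 300)
     region: "{{ region }}"
     ec2_elbs: ['{{ load_balancer_name }}']
     instance_id: "{{ ec2_id }}"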

Alternatively, you could lower your interval or your healthy_threshold, or both, so that you have to wait less than 300 seconds.

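Your unhealthy_threshold is the same as your healthy_threshold, so once a web server starts throwing HTTP 500 responses, it will stay in the pool for 5 minutes before the ELB gives up on it.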
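To illustrate lowering the thresholds, here is a sketch of the health_check block from the ec2_elb_lb task in the question with smaller values. The specific numbers (10 seconds, 3 checks) are assumptions chosen for illustration, not values taken from the question or recommended by the documentation; pick ones that suit how quickly your application starts and how tolerant you are of flapping.

     health_check:
       ping_protocol: http
       ping_port: 8888
       ping_path: "/internal/v1/status"
       response_timeout: 5      # seconds; kept below the interval
       interval: 10             # seconds (assumed value, was 30)
       healthy_threshold: 3     # assumed value, was 10: in service after roughly 3 * 10 = 30 seconds
       unhealthy_threshold: 3   # assumed value, was 10: removed after roughly 30 seconds of failures

With these values a new instance would need about 30 seconds of passing checks instead of 300, and a failing instance would leave the pool after about 30 seconds instead of 5 minutes.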

You can use the ec2_elb option wait: no.
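A sketch of what that would look like, using the task from the question with only the wait value changed. Note the trade-off: with wait: no the task registers the instance and returns without waiting for it to pass its health checks, so a successful task no longer tells you the instance is actually in service.

 - name: "Add host to load balancer {{ load_balancer_name }}"
   sudo: false
   local_action:
     module: ec2_elb
     state: present
     wait: no  # return right after registration instead of waiting for the instance to be in service
     region: "{{ region }}"
     ec2_elbs: ['{{ load_balancer_name }}']
     instance_id: "{{ ec2_id }}"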