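I've been trying to get familiar with CoreOS (633.1.0) and have been playing with fleet (I'm using the recommended Vagrant 3-node cluster setup on my local machine). I created the following very basic service ( [email protected] ):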
    [Unit]
    Description="Dummy Apache service"
    After="docker.service"
    Requires="docker.service"

    [Service]
    TimeoutStartSec=0
    TimeoutStopSec=30
    ExecStartPre=-/usr/bin/docker kill apache1
    ExecStartPre=-/usr/bin/docker rm apache1
    ExecStartPre=/usr/bin/docker pull coreos/apache
    ExecStart=/usr/bin/docker run --rm --name apache1 -p 80:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND
    ExecStop=/usr/bin/docker stop apache1

    [X-Fleet]
    Conflicts=apache@*.service
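When I start it, it works perfectly. However, whenever I try to stop it, it gets marked as failed. For example, this is the output of fleetctl status apache@2 just as the stop command begins executing: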
    core@core-01 ~ $ fleetctl stop apache@2
    Unit [email protected] loaded on a91e28b0.../172.17.8.101
    core@core-01 ~ $ fleetctl status apache@2
    ● [email protected] - "Dummy Apache container"
       Loaded: loaded (/run/fleet/units/[email protected]; linked-runtime; vendor preset: disabled)
       Active: deactivating (stop) since Wed 2015-04-15 18:45:46 UTC; 2s ago
      Process: 1038 ExecStartPre=/usr/bin/docker pull coreos/apache (code=exited, status=0/SUCCESS)
      Process: 1030 ExecStartPre=/usr/bin/docker rm apache1 (code=exited, status=1/FAILURE)
      Process: 1024 ExecStartPre=/usr/bin/docker kill apache1 (code=exited, status=1/FAILURE)
     Main PID: 1375 (docker); : 1522 (docker)
       CGroup: /system.slice/system-apache.slice/[email protected]
               ├─1375 /usr/bin/docker run --rm --name apache1 -p 80:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND
               └─control
                 └─1522 /usr/bin/docker stop apache1

    Apr 15 18:43:18 core-01 docker[1038]: 9cd978db300e: Pulling fs layer
    Apr 15 18:44:26 core-01 docker[1038]: 9cd978db300e: Download complete
    Apr 15 18:44:26 core-01 docker[1038]: 87026dcb0044: Pulling metadata
    Apr 15 18:44:27 core-01 docker[1038]: 87026dcb0044: Pulling fs layer
    Apr 15 18:44:53 core-01 docker[1038]: 87026dcb0044: Download complete
    Apr 15 18:44:53 core-01 docker[1038]: 87026dcb0044: Download complete
    Apr 15 18:44:53 core-01 docker[1038]: Status: Downloaded newer image for coreos/apache:latest
    Apr 15 18:44:53 core-01 systemd[1]: Started "Dummy Apache container".
    Apr 15 18:44:53 core-01 docker[1375]: apache2: Could not reliably determine the server's fully qualified domain name, using 10.1.0.2 for ServerName
    Apr 15 18:45:46 core-01 systemd[1]: Stopping "Dummy Apache container"...
However, a few seconds later:
    core@core-01 ~ $ fleetctl status apache@2
    ● [email protected] - "Dummy Apache container"
       Loaded: loaded (/run/fleet/units/[email protected]; linked-runtime; vendor preset: disabled)
       Active: failed (Result: exit-code) since Wed 2015-04-15 18:45:56 UTC; 1s ago
      Process: 1522 ExecStop=/usr/bin/docker stop apache1 (code=exited, status=0/SUCCESS)
      Process: 1375 ExecStart=/usr/bin/docker run --rm --name apache1 -p 80:80 coreos/apache /usr/sbin/apache2ctl -D FOREGROUND (code=exited, status=137)
      Process: 1038 ExecStartPre=/usr/bin/docker pull coreos/apache (code=exited, status=0/SUCCESS)
      Process: 1030 ExecStartPre=/usr/bin/docker rm apache1 (code=exited, status=1/FAILURE)
      Process: 1024 ExecStartPre=/usr/bin/docker kill apache1 (code=exited, status=1/FAILURE)
     Main PID: 1375 (code=exited, status=137)

    Apr 15 18:44:53 core-01 docker[1038]: 87026dcb0044: Download complete
    Apr 15 18:44:53 core-01 docker[1038]: Status: Downloaded newer image for coreos/apache:latest
    Apr 15 18:44:53 core-01 systemd[1]: Started "Dummy Apache container".
    Apr 15 18:44:53 core-01 docker[1375]: apache2: Could not reliably determine the server's fully qualified domain name, using 10.1.0.2 for ServerName
    Apr 15 18:45:46 core-01 systemd[1]: Stopping "Dummy Apache container"...
    Apr 15 18:45:56 core-01 docker[1522]: apache1
    Apr 15 18:45:56 core-01 systemd[1]: [email protected]: main process exited, code=exited, status=137/n/a
    Apr 15 18:45:56 core-01 systemd[1]: Stopped "Dummy Apache container".
    Apr 15 18:45:56 core-01 systemd[1]: Unit [email protected] entered failed state.
    Apr 15 18:45:56 core-01 systemd[1]: [email protected] failed.
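From what I can see, it seems that ExecStartPre is killing the container (which is what the ExecStartPre commands are supposed to do when the service starts). I also looked into this with journalctl, and the logs seem to confirm it.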
Correction: after further inspection (and with my eyes closer to the screen), I noticed this very important line in the logs:
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="Container afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1 failed to exit within 10 seconds of SIGTERM -
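So I can see that docker fails to stop the service gracefully (for reasons I don't understand), but I still don't understand why it then goes ahead and tries to destroy the container. Here are the other relevant logs: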
    Apr 15 18:45:46 core-01 systemd[1]: Stopping "Dummy Apache container"...
    Apr 15 18:45:46 core-01 fleetd[880]: INFO manager.go:145: Triggered systemd unit [email protected] stop: job=2115
    Apr 15 18:45:46 core-01 fleetd[880]: INFO reconcile.go:311: AgentReconciler completed task: type=StopUnit [email protected] reason="unit currently launched but desired state is loaded"
    Apr 15 18:45:46 core-01 dockerd[881]: time="2015-04-15T18:45:46Z" level="info" msg="POST /v1.17/containers/apache1/stop?t=10"
    Apr 15 18:45:46 core-01 dockerd[881]: time="2015-04-15T18:45:46Z" level="info" msg="+job stop(apache1)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="Container afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1 failed to exit within 10 seconds of SIGTERM -
    Apr 15 18:45:56 core-01 kernel: docker0: port 1(veth87a841f) entered disabled state
    Apr 15 18:45:56 core-01 kernel: device veth87a841f left promiscuous mode
    Apr 15 18:45:56 core-01 kernel: docker0: port 1(veth87a841f) entered disabled state
    Apr 15 18:45:56 core-01 systemd-networkd[830]: veth87a841f : lost carrier
    Apr 15 18:45:56 core-01 systemd-networkd[830]: veth87a841f : could not find udev device: No such device
    Apr 15 18:45:56 core-01 systemd-networkd[830]: docker0 : lost carrier
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job log(die, afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, coreos/apache:latest)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job log(die, afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, coreos/apache:latest) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job release_interface(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job attach(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="POST /v1.17/containers/afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1/wait"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job wait(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job release_interface(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job log(stop, afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, coreos/apache:latest)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job log(stop, afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, coreos/apache:latest) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job stop(apache1) = OK (0)"
    Apr 15 18:45:56 core-01 docker[1522]: apache1
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job wait(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="GET /v1.17/containers/afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1/json"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job container_inspect(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job container_inspect(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="DELETE /v1.17/containers/afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1?v=1"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job rm(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="POST /v1.17/containers/afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1/kill?signal=TERM"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job kill(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, TERM)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="+job log(destroy, afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, coreos/apache:latest)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job log(destroy, afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, coreos/apache:latest) = OK (0)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job rm(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1) = OK (0)"
    Apr 15 18:45:56 core-01 systemd[1]: [email protected]: main process exited, code=exited, status=137/n/a
    Apr 15 18:45:56 core-01 systemd[1]: Stopped "Dummy Apache container".
    Apr 15 18:45:56 core-01 systemd[1]: Unit [email protected] entered failed state.
    Apr 15 18:45:56 core-01 systemd[1]: [email protected] failed.
    Apr 15 18:45:56 core-01 dockerd[881]: No such container: afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="info" msg="-job kill(afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1, TERM) = ERR (1)"
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="error" msg="Handler for POST /containers/{name:.*}/kill returned error: No such container: afa4e52d8ff2edaa53d2bae1177535d669a4758f
    Apr 15 18:45:56 core-01 dockerd[881]: time="2015-04-15T18:45:56Z" level="error" msg="HTTP Error: statusCode=404 No such container: afa4e52d8ff2edaa53d2bae1177535d669a4758f1cc524056ba55a9da383ffd1"
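I'm not quite sure what is happening here, so any help is much appreciated.
Update: For the sake of testing, I increased the timeouts in the service file (docker stop -t 300 apache1 and TimeoutStopSec=300), but even after waiting 5 minutes the container still fails to stop. To be sure, I tried stopping the container directly from the command line (docker stop apache1), and of course it worked just fine. So it seems that something prevents the container from being stopped gracefully through fleetctl.
Thanks for your time!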
The reason docker is stopping the container is that apache inside the container is failing.
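For reference, the status=137 that systemd reports for the main process in the question's log follows the usual shell convention that a status above 128 means 128 plus a signal number. That is consistent with the "failed to exit within 10 seconds of SIGTERM" line, since docker stop falls back to SIGKILL once its grace period runs out. A minimal sketch of the arithmetic (the function name signal_from_status is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: map an exit status to the signal name it encodes.
# By convention, a status above 128 means "terminated by signal (status - 128)".
signal_from_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    # kill -l <number> prints the name of that signal without the SIG prefix
    kill -l "$((status - 128))"
  else
    echo "none"
  fi
}

signal_from_status 137   # prints KILL (137 - 128 = 9)
```

So the unit ends up in the failed state because the docker run process exited non-zero after the container was killed, not because docker stop itself failed (it exited with status 0 in the status output).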
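Are you sure it is working perfectly?
I would say the most likely problem is that it is not staying up: it runs, then stops and restarts. That is most likely because port 80 is already in use, since you bind it to host port 80 in every case. At least two of the instances will conflict with the others, because you are running Vagrant on a single host. If the unit files were on separate machines and port 80 was not in use, this would not be a problem.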
To fix this, use -P instead. Then, on each machine, you will have to run docker ps to see which port is actually being used externally.
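As a rough sketch of that suggestion: the ExecStart command from the question would swap -p 80:80 for -P, and the published port can then be looked up with docker ps or docker port. This assumes the coreos/apache image declares port 80 with EXPOSE (not verified here), because -P only publishes exposed ports. The function names start_apache and host_port are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: the coreos/apache image EXPOSEs port 80;
# if it does not, -P has nothing to publish.

# Same command as the unit's ExecStart, but with -P so that Docker
# picks a free host port instead of always binding host port 80.
start_apache() {
  docker run --rm --name apache1 -P coreos/apache /usr/sbin/apache2ctl -D FOREGROUND
}

# Hypothetical helper: given an address of the form host:port
# (what `docker port apache1 80` reports), print only the port part.
host_port() {
  printf '%s\n' "${1##*:}"
}

# Usage on the machine that runs the unit:
#   docker ps                               # PORTS column shows the mapping
#   host_port "$(docker port apache1 80)"   # just the host port number
```

With this change each instance gets a different host port, so whatever routes traffic to the containers has to look the port up rather than assume 80.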