etcd2 won't start via systemctl on CoreOS

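I'm really hoping someone can help me. The disk filled up on one of my CoreOS etcd2 members, and after a reboot etcd2 was in a bad way. In the end I deleted the /var/lib/etcd2/member data directory and followed the instructions for removing the machine from my cluster and re-adding it: https://coreos.com/etcd/docs/latest/runtime-configuration.html#remove-a-member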

However, although I can run etcd2 by hand, starting it with systemctl does not work. Here is etcd2.service:

    [Unit]
    Description=etcd2
    Conflicts=etcd.service

    [Service]
    User=etcd
    Type=notify
    Environment=ETCD_DATA_DIR=/var/lib/etcd2
    Environment=ETCD_NAME=%m
    ExecStart=/usr/bin/etcd2
    Restart=always
    RestartSec=10s
    LimitNOFILE=40000
    TimeoutStartSec=0

    [Install]
    WantedBy=multi-user.target

    # /run/systemd/system/etcd2.service.d/10-oem.conf
    [Service]
    Environment=ETCD_ELECTION_TIMEOUT=1200

    # /run/systemd/system/etcd2.service.d/20-cloudinit.conf
    [Service]
    Environment="ETCD_ADVERTISE_CLIENT_URLS=http://172.31.9.22:2379,http://172.31.9.22:4001"
    Environment="ETCD_DISCOVERY=https://discovery.etcd.io/567d080563e28e62cf886e48425f632b"
    Environment="ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.31.9.22:2380"
    Environment="ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379"
    Environment="ETCD_LISTEN_PEER_URLS=http://172.31.9.22:2380"
    Environment="ETCD_DEBUG=true"

I added ETCD_DEBUG=true to try to improve the log output. Speaking of which, here it is:

    Feb 26 07:23:36 geo-coreos-database-02 systemd[1]: Starting etcd2...
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://172.31.9.22:2379,http://172.31.9.22:4001
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_DEBUG=true
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_DISCOVERY=https://discovery.etcd.io/567d080563e28e62cf886e48425f632b
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_ELECTION_TIMEOUT=1200
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.31.9.22:2380
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
    Feb 26 07:23:36 geo-coreos-database-02 etcd2[2939]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://172.31.9.22:2380
    Feb 26 07:23:37 geo-coreos-database-02 systemd[1]: etcd2.service: Main process exited, code=exited, status=1/FAILURE
    Feb 26 07:23:37 geo-coreos-database-02 systemd[1]: Failed to start etcd2.
    Feb 26 07:23:37 geo-coreos-database-02 systemd[1]: etcd2.service: Unit entered failed state.
    Feb 26 07:23:37 geo-coreos-database-02 systemd[1]: etcd2.service: Failed with result 'exit-code'.

Not very helpful. But when I run it by hand with the same configuration as in etcd2.service, the server starts and runs in the foreground without any problem:

    export ETCD_NAME="e0a8edc41f634fcf9451b5c68e3442bd"
    export ETCD_DATA_DIR=/var/lib/etcd2
    export ETCD_ADVERTISE_CLIENT_URLS=http://172.31.9.22:2379,http://172.31.9.22:4001
    export ETCD_DISCOVERY=https://discovery.etcd.io/567d080563e28e62cf886e48425f632b
    export ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.31.9.22:2380
    export ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
    export ETCD_LISTEN_PEER_URLS=http://172.31.9.22:2380
    export ETCD_DEBUG=true
    etcd2

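This starts the server with the debug output I expect, and I can even run etcdctl commands against it. I'm at a complete loss as to how to debug this further. I really don't want to create a new cluster over something this small, but etcd always seems to be the one thing that causes us problems, and they tend to be like this one: obscure and hard to fix.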

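Of course, right after posting the question I figured it out. The first time I ran the command by hand, I ran it as root, so the data directory ended up owned by root rather than by the etcd user. Fixing the permissions so that the etcd user owns the directory solved the problem. It is still pretty alarming that etcd2 logs nothing about this, even in debug mode.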
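For anyone who lands here with the same symptom, this is roughly how the ownership check could be scripted. It is only a sketch: `check_owner` is a helper name made up for this example, it relies on GNU `stat -c`, and the `etcd` group in the commented `chown` is an assumption (the unit file only shows `User=etcd`).

```shell
#!/bin/sh
# Report whether a directory is owned by the expected user.
# Usage: check_owner <dir> <user>
# Returns 0 on a match, 1 on a mismatch, 2 if stat itself fails.
check_owner() {
    dir=$1
    want=$2
    # %U prints the owner's user name (GNU coreutils stat).
    have=$(stat -c '%U' "$dir") || return 2
    if [ "$have" = "$want" ]; then
        echo "ok: $dir is owned by $want"
    else
        echo "mismatch: $dir is owned by $have, expected $want"
        return 1
    fi
}

# On the affected host (path and user taken from etcd2.service;
# the etcd group is an assumption):
#   check_owner /var/lib/etcd2 etcd
#   sudo chown -R etcd:etcd /var/lib/etcd2
#   sudo systemctl restart etcd2
```

When reproducing a unit failure by hand, running the binary as the unit's user (for example with `sudo -u etcd`) rather than as root avoids creating root-owned files in the data directory in the first place.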

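I'm using cloud-config, and I had to add initial-cluster-state: existing to it. Without this, initial-cluster-state defaults to new, and it looks like my re-added node was picking up the old GUID that had been "permanently removed", which prevented etcd2 from starting.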
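If you are not going through cloud-config, the same setting can probably be supplied as a systemd drop-in, since cloud-config ends up as `Environment=` lines in a drop-in anyway (see the 20-cloudinit.conf shown in the question) and etcd2 reads it as `ETCD_INITIAL_CLUSTER_STATE`. A minimal sketch, where the function name, the file name `30-cluster-state.conf`, and the `/etc` drop-in directory are all assumptions of mine:

```shell
#!/bin/sh
# Write a systemd drop-in that sets ETCD_INITIAL_CLUSTER_STATE=existing.
# Usage: write_cluster_state_dropin <drop-in-dir>
write_cluster_state_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    # 30-cluster-state.conf is a hypothetical file name.
    cat > "$dir/30-cluster-state.conf" <<'EOF'
[Service]
Environment="ETCD_INITIAL_CLUSTER_STATE=existing"
EOF
}

# On the re-added node, as root (the /etc path is an assumption):
#   write_cluster_state_dropin /etc/systemd/system/etcd2.service.d
#   systemctl daemon-reload
#   systemctl restart etcd2
```

Either way, the data directory of the removed member has to be empty before the node rejoins, as the "the data-dir used by this member must be removed" message in the journal indicates.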

Here is the journal log without initial-cluster-state: existing in my cloud-config:

    Jun 03 11:10:36 giscoreos2 systemd[1]: Stopped etcd2.
    Jun 03 11:10:36 giscoreos2 systemd[1]: Starting etcd2...
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://giscoreos2.example.com:2379
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_DISCOVERY_SRV=example-corp.us
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://giscoreos2.example.com:2380
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://giscoreos2.example.com:2380
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: recognized and used environment variable ETCD_NAME=giscoreos2
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: etcd Version: 2.3.1
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: Git SHA: 2b67f52
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: Go Version: go1.5.3
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: Go OS/Arch: linux/amd64
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: setting maximum number of CPUs to 16, total number of available CPUs is 16
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: the server is already initialized as member before, starting as etcd member...
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: got bootstrap from DNS for etcd-server at http://giscoreos3.example.com:2380
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: got bootstrap from DNS for etcd-server at http://giscoreos1.example.com:2380
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: got bootstrap from DNS for etcd-server at http://giscoreos2.example.com:2380
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: listening for peers on http://giscoreos2.example.com:2380
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: listening for client requests on http://0.0.0.0:2379
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: name = giscoreos2
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: data dir = /var/lib/etcd2
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: member dir = /var/lib/etcd2/member
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: heartbeat = 100ms
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: election = 1000ms
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: snapshot count = 10000
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: advertise client URLs = http://giscoreos2.example.com:2379
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: restarting member d5f2eb850214f772 in cluster e1013a21e485d6ec at commit index 3
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: d5f2eb850214f772 became follower at term 242
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: newRaft d5f2eb850214f772 [peers: [], term: 242, commit: 3, applied: 0, lastindex: 3, lastterm: 1]
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: starting server... [version: 2.3.1, cluster version: to_be_decided]
    Jun 03 11:10:36 giscoreos2 systemd[1]: Started etcd2.
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: failed to find member c7f5106228d2b8a7 in cluster e1013a21e485d6ec
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: failed to find member c7f5106228d2b8a7 in cluster e1013a21e485d6ec
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: added member 2350c0e41172376a [http://giscoreos1.example.com:2380] to cluster e1013a21e485d6ec
    Jun 03 11:10:36 giscoreos2 systemd[1]: Starting Network fabric for containers...
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: the member has been permanently removed from the cluster
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: the data-dir used by this member must be removed.
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: streaming request ignored (ID mismatch got b584365ea9e04f4d want d5f2eb850214f772)
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: streaming request ignored (ID mismatch got b584365ea9e04f4d want d5f2eb850214f772)
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: added member c7f5106228d2b8a7 [http://giscoreos3.example.com:2380] to cluster e1013a21e485d6ec
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: added local member d5f2eb850214f772 [http://giscoreos2.example.com:2380] to cluster e1013a21e485d6ec
    Jun 03 11:10:36 giscoreos2 etcd2[19970]: aborting publish because server is stopped

Here is the log with initial-cluster-state: existing in my cloud-config:

    Jun 03 11:19:02 giscoreos2 systemd[1]: Starting etcd2...
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://giscoreos2.example.com:2379
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_DISCOVERY_SRV=exammple-corp.us
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://giscoreos2.example.com:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=existing
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster-1
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://giscoreos2.example.com:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: recognized and used environment variable ETCD_NAME=giscoreos2
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: etcd Version: 2.3.1
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: Git SHA: 2b67f52
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: Go Version: go1.5.3
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: Go OS/Arch: linux/amd64
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: setting maximum number of CPUs to 16, total number of available CPUs is 16
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: got bootstrap from DNS for etcd-server at http://giscoreos1.example.com:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: got bootstrap from DNS for etcd-server at http://giscoreos2.example.com:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: got bootstrap from DNS for etcd-server at http://giscoreos3.example.com:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: listening for peers on http://giscoreos2.example.com:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: listening for client requests on http://0.0.0.0:2379
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: resolving giscoreos1.example.com:2380 to 10.240.160.152:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: resolving giscoreos1.example.com:2380 to 10.240.160.152:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: resolving giscoreos2.example.com:2380 to 10.240.160.57:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: resolving giscoreos2.example.com:2380 to 10.240.160.57:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: resolving giscoreos3.example.com:2380 to 10.240.160.6:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: resolving giscoreos3.example.com:2380 to 10.240.160.6:2380
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: name = giscoreos2
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: data dir = /var/lib/etcd2
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: member dir = /var/lib/etcd2/member
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: heartbeat = 100ms
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: election = 1000ms
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: snapshot count = 10000
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: advertise client URLs = http://giscoreos2.example.com:2379
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: starting member b584365ea9e04f4d in cluster e1013a21e485d6ec
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: b584365ea9e04f4d became follower at term 0
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: newRaft b584365ea9e04f4d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: b584365ea9e04f4d became follower at term 1
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: the connection with 2350c0e41172376a became active
    Jun 03 11:19:02 giscoreos2 etcd2[20222]: starting server... [version: 2.3.1, cluster version: to_be_decided]
    Jun 03 11:19:02 giscoreos2 systemd[1]: Started etcd2.