I have a minimal cloud-config that works without problems on DigitalOcean. I added some hardening for SSH, which requires restarting sshd.socket to take effect:

    units:
      - name: sshd.socket
        command: restart

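Adding this unit alone (with no actual sshd configuration changes) causes the setup to fail when I try the same cloud-config on Hetzner: ssh: connect to host xx.xx.xx.xx port 22: Connection refused. It still connects fine on DigitalOcean.
When I remove this unit, connecting to the Hetzner machine works fine; adding it again fails consistently.
The only difference I know of between the two platforms is that on DigitalOcean the variables $public_ipv4 and $private_ipv4 are replaced with the actual IP addresses, which is not the case on bare metal such as Hetzner.
From the CoreOS documentation:
Note: The $private_ipv4 and $public_ipv4 substitution variables referenced in other documents are only supported on Amazon EC2, Google Compute Engine, OpenStack, Rackspace, DigitalOcean, and Vagrant.
So I replaced the variables with static IP addresses. I used the public IP address because, apart from loopback, it is the only interface available.
However, when I provision without replacing these variables with the public IP address, it also connects fine.
Checking the journal reveals some errors related to name resolution: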

    systemd[1]: Starting etcd2...
    etcd2[874]: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://:2379,http://:4001
    etcd2[874]: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd2
    etcd2[874]: recognized and used environment variable ETCD_DISCOVERY=https://discovery.etcd.io/616b3957c5c78e7738207011f9c51841
    etcd2[874]: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://:2380
    etcd2[874]: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379,http://0.0.0.0:4001
    etcd2[874]: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://:2380
    etcd2[874]: recognized and used environment variable ETCD_NAME=39b2a003672546f8a0b648dbc66e8f6f
    etcd2[874]: etcd Version: 2.2.0
    etcd2[874]: Git SHA: e4561dd
    etcd2[874]: Go Version: go1.4.2
    etcd2[874]: Go OS/Arch: linux/amd64
    etcd2[874]: setting maximum number of CPUs to 1, total number of available CPUs is 12
    etcd2[874]: listening for peers on http://:2380
    etcd2[874]: listening for client requests on http://0.0.0.0:2379
    etcd2[874]: listening for client requests on http://0.0.0.0:4001
    etcd2[874]: resolving :2380 to :2380
    etcd2[874]: resolving :2380 to :2380
    etcd2[874]: error #0: dial tcp: lookup discovery.etcd.io: Temporary failure in name resolution
    etcd2[874]: cluster status check: error connecting to https://discovery.etcd.io, retrying in 2s
    etcd2[874]: error #0: dial tcp: lookup discovery.etcd.io: Temporary failure in name resolution
    etcd2[874]: cluster status check: error connecting to https://discovery.etcd.io, retrying in 4s
    etcd2[874]: found self 61dbc8c9c2aca1e8 in the cluster
    etcd2[874]: found 1 needed peer(s)

But they do not appear to be fatal: systemctl status etcd2.service shows the service as active:

    core@localhost ~ $ systemctl status etcd2.service
    ● etcd2.service - etcd2
       Loaded: loaded (/usr/lib64/systemd/system/etcd2.service; disabled; vendor preset: disabled)
      Drop-In: /run/systemd/system/etcd2.service.d
               └─20-cloudinit.conf
       Active: active (running) since Tue 2016-03-22 14:10:33 UTC; 7min ago
     Main PID: 874 (etcd2)
       Memory: 20.3M
          CPU: 1.771s
       CGroup: /system.slice/etcd2.service
               └─874 /usr/bin/etcd2

    etcd2[874]: added local member 61dbc8c9c2aca1e8 [http://:2380] to cluster 216c373aaf11ccfa
    systemd[1]: Started etcd2.
    etcd2[874]: 61dbc8c9c2aca1e8 is starting a new election at term 1
    etcd2[874]: 61dbc8c9c2aca1e8 became candidate at term 2
    etcd2[874]: 61dbc8c9c2aca1e8 received vote from 61dbc8c9c2aca1e8 at term 2
    etcd2[874]: 61dbc8c9c2aca1e8 became leader at term 2
    etcd2[874]: raft.node: 61dbc8c9c2aca1e8 elected leader 61dbc8c9c2aca1e8 at term 2
    etcd2[874]: published {Name:39b2a003672546f8a0b648dbc66e8f6f ClientURLs:[http://:2379 http://:4001]} to cluster 216c373aaf11ccfa
    etcd2[874]: setting up the initial cluster version to 2.2
    etcd2[874]: set the initial cluster version to 2.2

Containers that connect to other services (such as Logstash) fail: the scheme http does not accept registry part: :9200 (or bad hostname?)
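The http://:2379 and :9200 values above have a port but no host, which would be consistent with the IP variables expanding to an empty string. A minimal Python sketch for spotting such URLs before provisioning; the helper name has_host is hypothetical and not part of any CoreOS tooling:

```python
from urllib.parse import urlsplit


def has_host(url: str) -> bool:
    """Return True if the URL carries a non-empty host part."""
    # urlsplit reports hostname as None when the netloc is only ":port".
    return urlsplit(url).hostname is not None


# URLs as they appear in the journal above, plus one well-formed listen URL.
urls = [
    "http://:2379",
    "http://:2380",
    "http://0.0.0.0:2379",
]

for url in urls:
    print(url, "ok" if has_host(url) else "missing host")
```

Running a check like this over the rendered cloud-config would flag the advertise and peer URLs before etcd2 or a container ever sees them.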
Here is a stripped-down cloud-config that still demonstrates the problem (validated).

    #cloud-config
    ssh_authorized_keys:
      - "ssh-rsa A valid SSH key here"
    write_files:
    coreos:
      etcd2:
        # NOTE: replace $discovery_url with a url generated at https://discovery.etcd.io/new?size=X
        discovery: $discovery_url
        listen-client-urls: http://0.0.0.0:2379,http://0.0.0.0:4001
        advertise-client-urls: http://my.public.ip.address:2379,http://my.public.ip.address:4001
        initial-advertise-peer-urls: http://my.public.ip.address:2380
        listen-peer-urls: http://my.public.ip.address:2380 # Remove this flag or use localhost and the connection issue goes away
      units:
        - name: etcd2.service
          command: start
        - name: fleet.service
          command: start
        - name: sshd.socket
          command: restart # Remove this unit and all issues go away (but no SSH hardening in that case)

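One thing I noticed is that when I remove the listen-peer-urls flag, the connection problem also goes away, although Logstash still fails to start for the same reason.
The documentation says the default values for these flags use localhost URLs, but the names of the substitution variables used on platforms like DigitalOcean seem to indicate that this should be an address reachable by peer machines.
When I use localhost for these flags, I can connect, but the other problems remain.
For a bare-metal machine with only a public and a loopback interface (no private networking), what should the proper cloud-config be?
What is the relationship between sshd and etcd that causes this failure?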
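> For a bare-metal machine with only a public and a loopback interface (no private networking), what should the proper cloud-config be?
Insert the machine's public IP in place of these variables.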
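As a rough sketch of that substitution, assuming the cloud-config is kept as a template and rendered before provisioning: the Python below fills in $public_ipv4 and $private_ipv4 with the same address and leaves other placeholders such as $discovery_url alone. The function name render_cloud_config and the address 203.0.113.10 (a documentation-range IP) are illustrative only, not taken from the original setup.

```python
from string import Template

# Trimmed-down template; the etcd2 flags mirror the ones in the question.
TEMPLATE = """\
#cloud-config
coreos:
  etcd2:
    discovery: $discovery_url
    advertise-client-urls: http://$public_ipv4:2379,http://$public_ipv4:4001
    initial-advertise-peer-urls: http://$private_ipv4:2380
    listen-peer-urls: http://$private_ipv4:2380
"""


def render_cloud_config(template: str, public_ip: str) -> str:
    """Substitute the IP variables; with no private network, both get the public IP."""
    # safe_substitute leaves unknown placeholders (e.g. $discovery_url) untouched.
    return Template(template).safe_substitute(
        public_ipv4=public_ip,
        private_ipv4=public_ip,
    )


rendered = render_cloud_config(TEMPLATE, "203.0.113.10")
print(rendered)
```

Using the public address for the peer URLs as well follows from the machine having no private interface; on a host with private networking you would pass a separate value for private_ipv4.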
> What is the relationship between sshd and etcd that causes this failure?
Can you share the sshd logs? Why is it not starting?