debuggingKUBELET不启动

我正在使用kubeadm尝试设置一个开发人员。我遇到了一个问题，即kubelet的健康检查失败。我正在寻找如何debugging的方向。运行build议debugging的命令（ systemctl status kubelet ）没有看到错误的原因：

 kubelet.service - kubelet: The Kubernetes Node Agent Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled) Drop-In: /etc/systemd/system/kubelet.service.d └─10-kubeadm.conf Active: activating (auto-restart) (Result: exit-code) since Thu 2017-10-05 15:04:23 CDT; 4s ago Docs: http://kubernetes.io/docs/ Process: 4786 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_CERTIFICATE_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE) Main PID: 4786 (code=exited, status=1/FAILURE) Oct 05 15:04:23 master.domain..com systemd[1]: Unit kubelet.service entered failed state. Oct 05 15:04:23 master.domain.com systemd[1]: kubelet.service failed.

我在哪里可以find一个特定的错误消息，指出为什么这不运行？

在运行swapoff -a来禁用swap之后，我仍然无法configurationKubernetes。

以下是kubeadm init的完整输出：

 $ kubeadm init [kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters. [init] Using Kubernetes version: v1.8.2 [init] Using Authorization modes: [Node RBAC] [preflight] Running pre-flight checks [preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.09.0-ce. Max validated version: 17.03 [preflight] Starting the kubelet service [kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0) [certificates] Generated ca certificate and key. [certificates] Generated apiserver certificate and key. [certificates] apiserver serving cert is signed for DNS names [master.my-domain.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.xx.xx.xx 10.xx.xx.xx] [certificates] Generated apiserver-kubelet-client certificate and key. [certificates] Generated sa key and public key. [certificates] Generated front-proxy-ca certificate and key. [certificates] Generated front-proxy-client certificate and key. [certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki" [kubeconfig] Wrote KubeConfig file to disk: "admin.conf" [kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf" [kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf" [kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf" [controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml" [controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml" [controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml" [etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml" [init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests" [init] This often takes around a minute; or longer if the control plane images have to be pulled. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz/syncloop' failed with error: Get http://localhost:10255/healthz/syncloop: dial tcp 127.0.0.1:10255: getsockopt: connection refused. [kubelet-check] It seems like the kubelet isn't running or healthy. [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp 127.0.0.1:10255: getsockopt: connection refused. Unfortunately, an error has occurred: timed out waiting for the condition This error is likely caused by that: - The kubelet is not running - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled) - There is no internet connection; so the kubelet can't pull the following control plane images: - gcr.io/google_containers/kube-apiserver-amd64:v1.8.2 - gcr.io/google_containers/kube-controller-manager-amd64:v1.8.2 - gcr.io/google_containers/kube-scheduler-amd64:v1.8.2 You can troubleshoot this for example with the following commands if you're on a systemd-powered system: - 'systemctl status kubelet' - 'journalctl -xeu kubelet' couldn't initialize a Kubernetes cluster

我也尝试删除docker仓库并安装不可运行的Docker 1.12 – Error starting daemon: SELinux is not supported with the overlay graph driver on this kernel. Either boot into a newer kernel or disable selinux ... Error starting daemon: SELinux is not supported with the overlay graph driver on this kernel. Either boot into a newer kernel or disable selinux ...

通过在systemd脚本中设置–fail-swap-on = false来解决问题。只需要修改文件/etc/systemd/system/kubelet.service.d/10-kubeadm.conf

Environment=“KUBELET_SYSTEM_PODS_ARGS = – pod-manifest-path = / etc / kubernetes / manifests –allow-privileged = true –fail-swap-on = false”

然后运行systemctl守护进程重新加载然后systemctl重新启动kubelet

发现这个问题： https ： //github.com/kubernetes/kubernetes/issues/53333

之前的答案为我工作，但不是在关联问题提供的决议。

所以也许，按照他们编辑90-kubeadm.conf（取代10-kubeadm.conf）的build议是可行的

这个问题已经在Atom发布的问题中讨论过了，所以我不觉得我贡献了很多，但是如果打开swap，我可以复制你的问题。所以对我来说，解决方法是禁用交换并重试init：

 sudo -i swapoff -a kubeadm reset kubeadm init

dirtbag发布的答案也适用于我，但为了在systemctl daemon-reload之后安全起见，我做了一个完整的kubeadm reset和kubeadm init ，而不仅仅是systemctl restart kubelet 。

如果这不适合你，你可以在禁用swap之后粘贴kubeadm init的新输出吗？