在我的一台服务器上,我发现真的延迟了SSHlogin。
使用ssh -vvv选项进行连接时,在debug1: Entering interactive session.
发生延迟debug1: Entering interactive session.
连接摘录:
debug1: Authentication succeeded (publickey). Authenticated to IP_REDACTED ([IP_REDACTED]:22). debug1: channel 0: new [client-session] debug3: ssh_session2_open: channel_new: 0 debug2: channel 0: send open debug1: Requesting [email protected] debug1: Entering interactive session. debug2: callback start debug2: fd 3 setting TCP_NODELAY debug3: packet_set_tos: set IP_TOS 0x10 debug2: client_session2_setup: id 0 debug2: channel 0: request pty-req confirm 1
使用这里概述的方法,我生成strace输出,并注意到行14:09:53.676004 ppoll([{fd=5, events=POLLIN}], 1, {24, 999645000}, NULL, 8) = 1 ([{fd=5, revents=POLLIN}], left {0, 0}) <25.020764>
,这需要25秒。
strace输出的摘录:
14:09:53.675567 clock_gettime(CLOCK_MONOTONIC, {4662549, 999741404}) = 0 <0.000024> 14:09:53.675651 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1\n\0\0\0\2\0\0\0\215\0\0\0\1\1o\0\25\0\0\0", 24}], msg_controll en=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24 <0.000024> 14:09:53.675744 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"/org/freedesktop/DBus\0\0\0\2\1s\0\24\0\0\0"..., 146}], msg_controllen =0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 146 <0.000025> 14:09:53.675842 recvmsg(5, 0x7ffe0ff1dfa0, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailab le) <0.000023> 14:09:53.675925 clock_gettime(CLOCK_MONOTONIC, {4662550, 96075}) = 0 <0.000024> 14:09:53.676004 ppoll([{fd=5, events=POLLIN}], 1, {24, 999645000}, NULL, 8) = 1 ([{fd=5, revents=POLLIN}], left {0, 0}) <25.020764> 14:10:18.696865 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"l\3\1\0013\0\0\0\3\0\0\0m\0\0\0\6\1s\0\5\0\0\0", 24}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 24 <0.000017> 14:10:18.696944 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{":1.10\0\0\0\4\1s\0#\0\0\0org.freedesktop."..., 155}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = 155 <0.000018>
我注意到在相关时间在auth日志中有一个条目:
Jul 21 14:10:18 click sshd[8165]: pam_systemd(sshd:session): Failed to create session: Activation of org.freedesktop.login1 timed out
不知道这是什么,它试图进行轮询,为什么现在在这个特定的服务器上需要25秒。
journalctl -u systemd-logind
命令显示
Jul 20 11:33:06 click systemd-logind[19415]: Failed to abandon session scope: Transport endpoint is not connected Jul 21 05:04:54 myhost systemd[1]: Started Login Service. Jul 21 12:15:30 myhost systemd[1]: Started Login Service. Jul 21 12:17:04 myhost systemd[1]: Started Login Service. Jul 21 12:49:55 myhost systemd[1]: Started Login Service. Jul 21 13:57:05 myhost systemd[1]: Started Login Service. Jul 21 13:58:49 myhost systemd[1]: Started Login Service. Jul 21 14:01:55 myhost systemd[1]: Started Login Service. Jul 21 14:08:32 myhost systemd[1]: Started Login Service. Jul 21 14:09:53 myhost systemd[1]: Started Login Service. Jul 21 14:19:08 myhost systemd[1]: Started Login Service. Jul 21 14:21:26 myhost systemd[1]: Started Login Service. Jul 21 14:22:37 myhost systemd[1]: Started Login Service. Jul 21 14:25:20 myhost systemd[1]: Started Login Service. Jul 21 14:30:27 myhost systemd[1]: Started Login Service. Jul 21 15:02:56 myhost systemd[1]: Started Login Service.
发出命令systemctl restart systemd-logind.service
修复它(现在可能)。
它提到Activation of org.freedesktop.login1
的Activation of org.freedesktop.login1
什么? 有没有办法可以防止将来重新启动logind? 我预计随着时间的推移,我将对我pipe理的其他服务器产生这个问题。
只是注意到这开始发生在另一台服务器上。
$ sudo service systemd-logind status ● systemd-logind.service - Login Service Loaded: loaded (/lib/systemd/system/systemd-logind.service; static) Active: active (running) since Tue 2015-06-16 14:10:57 BST; 1 months 12 days ago Docs: man:systemd-logind.service(8) man:logind.conf(5) http://www.freedesktop.org/wiki/Software/systemd/logind http://www.freedesktop.org/wiki/Software/systemd/multiseat Main PID: 1701 (systemd-logind) Status: "Processing requests..." CGroup: /system.slice/systemd-logind.service └─1701 /lib/systemd/systemd-logind Jul 28 13:16:21 myhost systemd[1]: Started Login Service. Jul 28 13:16:47 myhost systemd[1]: Started Login Service. Jul 28 16:09:23 myhost systemd[1]: Started Login Service. Jul 28 16:09:49 myhost systemd[1]: Started Login Service. Jul 28 16:10:15 myhost systemd[1]: Started Login Service. Jul 28 16:10:41 myhost systemd[1]: Started Login Service. Jul 28 22:50:19 myhost systemd[1]: Started Login Service. Jul 29 05:00:15 myhost systemd[1]: Started Login Service. Jul 29 11:00:20 myhost systemd[1]: Started Login Service. Jul 29 11:09:56 myhost systemd[1]: Started Login Service.
编辑 – 扩大journalctl
输出。
编辑2 – 添加systemd-logind状态,如注意到在另一台服务器上注意到这一点时所build议的。
更新 – 这是开始发生在我的杰西服务器的其余部分。 我是唯一一个遇到这个? 除了重新启动systemd-logind之外,还有一些解决方法,有没有人有任何想法?
这个770135有一个Debian的bug报告。
dbus重新启动时会发生这种情况,但systemd-logind不会重新启动。 只要做到以下几点:
systemctl restart systemd-logind
解决scheme是从这里: https : //major.io/2015/07/27/very-slow-ssh-logins-on-fedora-22/
使用:
systemctl restart systemd-logind
只是暂时解决问题。
一种解决方法是从cron作业中删除所有.scope
文件,如此处所述。
* 2,14 * * * root /bin/rm -f /run/systemd/system/*.scope
相关的systemd缺陷报告在这里: 示波器泄漏“systemctl list-unit-files”并延迟login 。
它似乎实际上是一个dbus错误: 在dbus版本1.11.10中解决了 unix fd中的计数中断
为了永久解决这个bug,你只需要等待这个版本的dbus出现在你的发行版中。 目前,Debian Stretch在dbus 1.10.18,Ubuntu 17.04(Zesty)在1.10.10,CentOS 7在dbus 1.6.12。