Is there a way to find out why a service was restarted, and who did it?

  • Ubuntu 14.04
  • ClamAV 0.98.7

The problem is that clamav-daemon is restarted almost every day:

    Sep 1 06:30:00 x-master clamd[6778]: Pid file removed.
    clamd[6778]: --- Stopped at Tue Sep 1 06:30:00 2015
    clamd[5979]: clamd daemon 0.98.7 (OS: linux-gnu, ARCH: x86_64, CPU: x86_64)
    clamd[5979]: Running as user root (UID 0, GID 0)
    clamd[5979]: Log file size limited to 4294967295 bytes.
    clamd[5979]: Reading databases from /var/lib/clamav
    clamd[5979]: Not loading PUA signatures.
    clamd[5979]: Bytecode: Security mode set to "TrustSigned".

If clamdscan happens to be running at that moment, the restart makes it fail:

 /etc/cron.daily/clamav_scan: ERROR: Could not connect to clamd on xxxx: Connection refused 

Note that I said "almost" at the beginning:

    /var/log/syslog:Sep 1 06:30:00 x-master clamd[6778]: Pid file removed.
    /var/log/syslog.1:Aug 31 06:27:54 x-master clamd[20128]: Pid file removed.
    /var/log/syslog.4.gz:Aug 28 06:28:34 x-master clamd[4475]: Pid file removed.
    /var/log/syslog.5.gz:Aug 27 06:27:47 x-master clamd[21466]: Pid file removed.
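A listing like the one above can be produced with a small filter over the current and rotated syslog files. This is only a sketch: the function name `clamd_stops` is made up for illustration, and it assumes the classic syslog layout (month, day, time, host, tag, message) seen in the excerpts.

```shell
# Print the timestamp and process tag of every clamd shutdown found on stdin.
# Assumes syslog lines of the form: "Sep 1 06:30:00 host clamd[PID]: Pid file removed."
clamd_stops() {
    awk '/clamd\[[0-9]+\]: Pid file removed\./ { print $1, $2, $3, $5 }'
}

# Usage (zcat -f passes uncompressed files through unchanged):
#   zcat -f /var/log/syslog* | clamd_stops
```

Comparing the printed times with the cron schedule makes it easier to see whether the restarts follow a scheduled job.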

As you can see:

  • It did not happen on August 29 and 30
  • The restart usually happens around the time cron.daily runs:

     27 6 * * * root nice -n 19 ionice -c3 run-parts --report /etc/cron.daily 

Contents of /etc/cron.daily/clamav_scan:

 find / $exclude_string ! \( -path "/tmp/clamav-*.tmp" -prune \) ! \( -path "/var/lib/elasticsearch" -prune \) ! \( -path "/var/lib/mongodb" -prune \) ! \( -path "/var/lib/graylog-server" -prune \) -mtime -1 -type f -print0 | xargs -0 clamdscan --quiet -l "$status_file" || retval=$? 

There is a logrotate file for clamav-daemon:

    /var/log/clamav/clamav.log {
        rotate 12
        weekly
        compress
        delaycompress
        create 640 clamav adm
        postrotate
            /etc/init.d/clamav-daemon reload-log > /dev/null
        endscript
    }

But it only makes clamd re-open its log file:

 Sep 1 02:30:24 uba-master clamd[6778]: SIGHUP caught: re-opening log file. 

I know we can use auditd to watch the binary. Here is a sample log:

    ausearch -f /usr/sbin/clamd
    ----
    time->Tue Sep 1 07:56:44 2015
    type=PATH msg=audit(1441094204.559:15): item=1 name=(null) inode=2756458 dev=fc:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
    type=PATH msg=audit(1441094204.559:15): item=0 name="/usr/sbin/clamd" inode=3428628 dev=fc:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
    type=CWD msg=audit(1441094204.559:15): cwd="/"
    type=EXECVE msg=audit(1441094204.559:15): argc=1 a0="/usr/sbin/clamd"
    type=SYSCALL msg=audit(1441094204.559:15): arch=c000003e syscall=59 success=yes exit=0 a0=7ffd277e03dc a1=7ffd277dfa78 a2=7ffd277dfa88 a3=7ffd277df570 items=2 ppid=5708 pid=5946 auid=4294967295 uid=109 gid=114 euid=109 suid=109 fsuid=109 egid=114 sgid=114 fsgid=114 tty=pts1 ses=4294967295 comm="clamd" exe="/usr/sbin/clamd" key=(null)

109 is... the UID of the clamav user:

    getent passwd clamav
    clamav:x:109:114::/var/lib/clamav:/bin/false
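Besides the uid, the SYSCALL record also carries a `ppid` field, the PID of the process that executed clamd. A minimal sketch for pulling it out of the `ausearch` output follows; the helper name `audit_ppids` is hypothetical, and the field layout is assumed to match the record shown above.

```shell
# Extract the unique parent PIDs from audit SYSCALL records read on stdin.
# The leading space in the pattern keeps other fields ending in "ppid=" from matching.
audit_ppids() {
    grep -o ' ppid=[0-9]*' | cut -d= -f2 | sort -un
}

# Usage:
#   ausearch -f /usr/sbin/clamd | audit_ppids
#   ps -o pid,user,args -p 5708   # only helps while that parent is still alive
```

If the parent has already exited by the time the log is read, the PID alone says little, which is one limitation of this approach.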

Is there another way to troubleshoot this?


Reply to @HBruijn:

"Maybe freshclam, after it updates the AV definitions?"

I thought about that. Here are the logs:

    Sep 1 05:31:04 x-master freshclam[16197]: Received signal: wake up
    Sep 1 05:31:04 x-master freshclam[16197]: ClamAV update process started at Tue Sep 1 05:31:04 2015
    Sep 1 05:31:04 x-master freshclam[16197]: main.cvd is up to date (version: 55, sigs: 2424225, f-level: 60, builder: neo)
    Sep 1 05:31:05 x-master freshclam[16197]: Downloading daily-20865.cdiff [100%]
    Sep 1 05:31:09 x-master freshclam[16197]: daily.cld updated (version: 20865, sigs: 1555338, f-level: 63, builder: neo)
    Sep 1 05:31:10 x-master freshclam[16197]: bytecode.cvd is up to date (version: 268, sigs: 47, f-level: 63, builder: anvilleg)
    Sep 1 05:31:13 x-master freshclam[16197]: Database updated (3979610 signatures) from db.local.clamav.net (IP: 168.143.19.95)
    Sep 1 05:31:13 x-master freshclam[16197]: Clamd successfully notified about the update.
    Sep 1 05:31:13 x-master freshclam[16197]: --------------------------------------
    Sep 1 04:34:10 x-master clamd[6778]: SelfCheck: Database status OK.
    Sep 1 05:31:13 x-master clamd[6778]: Reading databases from /var/lib/clamav
    Sep 1 05:31:22 x-master clamd[6778]: Database correctly reloaded (3974071 signatures)

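I am not sure about this, but it looks like freshclam has an "internal mechanism" for notifying clamd about the update, after which clamd can reload the database without restarting the process. Can you confirm?

Also, judging by the timestamps, clamav-daemon restarted about an hour after freshclam updated the database. Is that normal?


UPDATE Tue Sep 1 22:10:49 ICT 2015

"But it looks like freshclam has an "internal mechanism" for notifying clamd about the update, after which clamd can reload the database without restarting the process."

I can confirm by testing that this is correct:

  • Edit freshclam.conf to shorten the check interval (Checks 1440, that is, 1440 checks per day or one per minute)
  • Restart clamav-freshclam
  • cd /var/lib/clamav
  • rm daily.cvd
  • Wait a minute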

    Sep 1 14:49:25 p freshclam[7654]: Downloading daily.cvd [100%]
    Sep 1 14:49:28 p freshclam[7654]: daily.cvd updated (version: 19487, sigs: 1191913, f-level: 63, builder: neo)
    Sep 1 14:49:28 p freshclam[7654]: Reading CVD header (bytecode.cvd):
    Sep 1 14:49:28 p freshclam[7654]: OK
    Sep 1 14:49:28 p freshclam[7654]: bytecode.cvd is up to date (version: 245, sigs: 43, f-level: 63, builder: dgoddard)
    Sep 1 14:49:31 p freshclam[7654]: Database updated (3616181 signatures) from clamav.local (IP: 10.0.2.2)
    Sep 1 14:49:31 p freshclam[7654]: Clamd successfully notified about the update.
    Sep 1 14:49:31 p freshclam[7654]: --------------------------------------
    Sep 1 14:49:32 p clamd[6693]: Reading databases from /var/lib/clamav
    Sep 1 14:49:39 p clamd[6693]: Database correctly reloaded (3610621 signatures)

And clamav-daemon was not restarted.
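The first step of that test, changing the check frequency, can be scripted roughly as follows. This is a sketch under assumptions: the helper name `set_checks` is hypothetical, the config path in the usage line is the usual Debian/Ubuntu location rather than something stated above, and `sed -i` requires GNU sed.

```shell
# Set the "Checks" directive (number of database checks per day) in a freshclam.conf.
# Replaces an existing Checks line, or appends one if the file has none.
set_checks() {
    conf=$1
    n=$2
    if grep -q '^Checks[[:space:]]' "$conf"; then
        sed -i "s/^Checks[[:space:]].*/Checks $n/" "$conf"
    else
        printf 'Checks %s\n' "$n" >> "$conf"
    fi
}

# Usage, as root on a test host (path is an assumption):
#   set_checks /etc/clamav/freshclam.conf 1440
#   service clamav-freshclam restart
#   rm /var/lib/clamav/daily.cvd
#   # wait a minute, then look for "Database correctly reloaded" in syslog
```

Deleting daily.cvd forces freshclam to download it again at the next check, which is what triggers the notification to clamd in this test.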

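Check whether you are using a configuration management system such as Puppet, Chef, or CFEngine; such systems may periodically interfere with the service. The exact action needed to correct this depends on how the service is managed in the configuration management system.

Note to self.

Output from the job cache: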

    ----------
              ID: clamav-daemon
        Function: service.running
          Result: True
         Comment: Service restarted
         Started: 06:27:52.736890
        Duration: 12997.632 ms
         Changes:
                  ----------
                  clamav-daemon:
                      True

Looking at the clamav formula:

    clamav-daemon:
      service:
        - running
        - order: 50
        - require:
          - service: clamav-freshclam
        - watch:
          - pkg: clamav-daemon
          - file: clamav-daemon
          - user: clamav

None of the watched states changed:

    ----------
              ID: clamav-daemon
        Function: pkg.latest
          Result: True
         Comment: Package clamav-daemon is already up-to-date.
         Started: 06:27:51.531415
        Duration: 53.224 ms
         Changes:
    ----------
              ID: clamav-daemon
        Function: file.managed
            Name: /etc/clamav/clamd.conf
          Result: True
         Comment: File /etc/clamav/clamd.conf is in the correct state
         Started: 06:27:51.760019
        Duration: 625.075 ms
         Changes:
    ----------
              ID: clamav
        Function: user.present
          Result: True
         Comment: User clamav is present and up to date
         Started: 06:27:51.590214
        Duration: 2.455 ms
         Changes:

So why was the service restarted?

Searching for watch_in, I found a state that manages the pid file; if the pid file changes, the service is restarted:

    {%- macro manage_pid(path, user, group, watch_in_service, mode=644) -%}
      {%- if salt['file.file_exists'](path) %}
    {{ path }}:
      file:
        - managed
        - user: {{ user }}
        - group: {{ group }}
        - mode: {{ mode }}
        - replace: False
        {%- if caller is defined -%}
          {%- for line in caller().split("\n") -%}
            {%- if loop.first %}
        - require:
            {%- endif %}
    {{ line|trim|indent(6, indentfirst=True) }}
          {%- endfor -%}
        {%- endif %}
        - watch_in:
          - service: {{ watch_in_service }}
      {%- else %}
    # {{ path }} does not exist, no need to manage
      {%- endif -%}
    {%- endmacro -%}

    {%- call manage_pid('/var/run/clamav/clamd.pid', 'clamav', 'clamav', 'clamav-daemon', 664) %}
    - pkg: clamav-daemon
    {%- endcall %}

In the output of salt-run jobs.lookup_jid <job id number>, I saw this:

    ----------
              ID: /var/run/clamav/clamd.pid
        Function: file.managed
          Result: True
         Comment:
         Started: 06:27:52.392555
        Duration: 2.364 ms
         Changes:
                  ----------
                  group:
                      clamav
                  user:
                      clamav
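Finding the one state with a non-empty Changes section in a long job output can be scripted. The filter below is a sketch, not part of Salt: the name `changed_states` is made up, and it assumes the default highstate text layout, with each state block introduced by a dashed separator line starting in column 0 and the Changes dictionary indented beneath its label.

```shell
# Print the ID of every state block whose "Changes:" section is not empty.
changed_states() {
    awk '
        /^-+$/                 { id = ""; in_changes = 0; next }   # unindented separator: new state block
        /^ *ID: /              { sub(/^ *ID: /, ""); id = $0; next }
        /^ *Changes:/          { in_changes = 1; next }
        in_changes && /^ +-+$/ { next }                            # indented separator inside Changes
        in_changes && NF && id != "" { print id; id = ""; in_changes = 0 }
    '
}

# Usage:
#   salt-run jobs.lookup_jid <job id number> | changed_states
```

Any ID it prints that is tied to the service through watch or watch_in is a candidate for what triggered the restart.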

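So the owner and group of the pid file had been changed to clamav, and that change triggered the restart through watch_in. In the end, I found the cause: the clamav daemon runs as root in network mode, so the pid file is created as root. The state that manages the pid file therefore has to be changed as follows: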

    {%- call manage_pid('/var/run/clamav/clamd.pid', 'root', 'root', 'clamav-daemon', 664) %}
    - pkg: clamav-daemon
    {%- endcall %}
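To decide which user and group belong in that call, it helps to check what clamd actually creates on the host. A minimal sketch, assuming GNU `stat`; the helper name `pid_owner` is hypothetical:

```shell
# Print "user:group" for the given file, for example a daemon's pid file.
pid_owner() {
    stat -c '%U:%G' "$1"
}

# Usage:
#   pid_owner /var/run/clamav/clamd.pid
```

If the result differs from the user and group declared in the state, each Salt run will report a change on the pid file and restart the service again.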