Monit不会运行

我有两个相同的EC2实例(第二个是第一个副本),运行Gentoo。 一审具有监控单个进程的监控function,一些系统资源和function很好。

在第二种情况下,monit运行,但立即退出。 两个实例的configuration都是类似的,monit的版本也是如此。

monit.log显示:

[GMT Oct 3 08:36:41] info : monit daemon with PID 5 awakened 

strace monit最后一行显示:

 write(2, "monit daemon with PID 5 awakened"..., 33monit daemon with PID 5 awakened ) = 33 time(NULL) = 1349252827 open("/etc/localtime", O_RDONLY) = 4 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb773a000 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 118 _llseek(4, -6, [112], SEEK_CUR) = 0 read(4, "\nGMT0\n", 4096) = 6 close(4) = 0 munmap(0xb773a000, 4096) = 0 write(3, "[GMT Oct 3 08:27:07] info :"..., 33) = 33 write(3, "monit daemon with PID 5 awakened"..., 33) = 33 waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes) close(3) = 0 exit_group(0) = ? 

没有核心转储( ulimit -c显示unlimited

monit -v显示:

 monit: Debug: Adding host allow 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Adding credentials for user 'xxxx'. Runtime constants: Control file = /etc/monitrc Log file = /var/log/monit/monit.log Pid file = /var/run/monit.pid Id file = /var/run/monit.pid Debug = True Log = True Use syslog = False Is Daemon = True Use process engine = True Poll time = 30 seconds with start delay 0 seconds Expect buffer = 256 bytes Event queue = base directory /var/monit with 100 slots Mail server(s) = xx.xxx.xx.xxx with timeout 30 seconds Mail from = (not defined) Mail subject = (not defined) Mail message = (not defined) Start monit httpd = True httpd bind address = Any/All httpd portnumber = 2812 httpd signature = True Use ssl encryption = False httpd auth. style = Basic Authentication and Host/Net allow list Alert mail to = [email protected] Alert on = All events The service list contains the following entries: System Name = xxxx Monitoring mode = active CPU wait limit = if greater than 20.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU system limit = if greater than 30.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU user limit = if greater than 70.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Swap usage limit = if greater than 25.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Memory usage limit = if greater than 75.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (5min) = if greater than 2.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (1min) = if greater than 4.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Process Name = xxxx Group = server Pid file = /var/run/xxxx.pid Monitoring mode = active Start program = '/etc/init.d/xxxx restart' timeout 20 second(s) Stop program = '/etc/init.d/xxxx stop' timeout 30 second(s) Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert Pid = if changed 1 times within 1 cycle(s) then alert Ppid = if changed 1 times within 1 cycle(s) then alert Timeout = If restarted 3 times within 5 cycle(s) then unmonitor Alert mail to = [email protected] Alert on = All events Alert mail to = [email protected] Alert on = All events ------------------------------------------------------------------------------- monit daemon with PID 5 awakened 

Ran emerge --sync之前的emerge --sync emerge -va monit monit安装了monit v5.3.2。 当这不起作用,我已经从他们的网站下载v5.5,并从源代码编译也没有工作。

找出问题所在:

文件/var/run/monit.pid包含一些奇怪的信息,由于某种原因,它看起来像一个md5散列。

一旦我删除这个文件,并重新启动监视,一切正常。 是什么提醒了我的怀疑是monit daemon with PID 5 awakened线monit daemon with PID 5 awakened ,这是非常奇怪的,因为新运行的进程应该有一个更高的PID,所以我去检查.pid文件。

我已经在EBS支持的EC2实例(使用AWS的标准Linux版本)上加载了Monit。

由于我的configuration错误,Monit在启动/重启时对我来说是不可靠的。 在configuration文件/etc/monit.conf中,我有以下行,这是不正确的:

 set idfile /var/run/monit.pid 

应该是这样的

 set idfile /var/run/.monit.id 

有人可能会发现这对未来有帮助。