So, on a RHEL5 server running Plesk 10, for no apparent reason, I woke up one morning to find every site hosted on this box offline. I SSH'd in and restarted httpd:
Stopping httpd: [ OK ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
Okay, so I ran:
ps ax | grep http
kill (the pid)
service httpd start
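Before killing by PID, it may help to confirm which process actually owns port 80, since "Address already in use" means some listener survived the stop. The following is only a sketch: it assumes the usual `netstat -tlnp` column layout from net-tools, and `port_holders` is a hypothetical helper name, not something from Plesk or Apache.

```shell
#!/bin/sh
# Hypothetical helper: print "PID/program" for every listener on a TCP port,
# reading `netstat -tlnp` style output from stdin.
port_holders() {
    port="$1"
    awk -v port="$port" '
        # Assumed layout: field 4 is the local address (0.0.0.0:80 or :::80),
        # field 6 is the state, field 7 is "PID/Program name".
        $6 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) print $7
        }
    ' | sort -u
}

# Typical use on the server (run as root so netstat can show PIDs):
#   netstat -tlnp | port_holders 80
```

If this prints an httpd PID after `service httpd stop` has reported OK, the old process did not really exit, which would match the bind error above.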
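And everything started up fine. Less than 24 hours later, it happened again at the same time of day. After I "fixed" it that time, it ran for 13 days and then crashed again. So to reiterate: the httpd service (I assume) restarted and failed at the following times: 4/27/2012 04:13:52, 4/14/2012 04:14:18, 4/13/2012 04:12:48
I checked my cron logs and found the following entries, which I don't understand, and Google has failed me. They occur around the time of the crashes, and only on the days the server crashed... too much of a coincidence, I think.
Suspicious/confusing cron log entries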
[me@www httpd]# cat ../cro* | grep RELOAD
Apr 13 09:40:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
Apr 14 04:13:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
Apr 14 14:27:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)

[me@www httpd]# cat ../cro* | grep LIST
Apr 27 04:13:14 www crontab[12973]: (root) LIST (myusername1)
Apr 13 04:12:09 www crontab[30867]: (root) LIST (myusername2)
Apr 13 09:39:57 www crontab[8274]: (root) LIST (myusername2)
Apr 14 04:13:01 www crontab[12193]: (root) LIST (myusername2)
Apr 14 14:26:09 www crontab[27898]: (root) LIST (myusername2)

[me@www httpd]# cat ../cro* | grep REPLACE
Apr 27 04:13:14 www crontab[12974]: (root) REPLACE (myusername1)
Apr 13 04:12:09 www crontab[30868]: (root) REPLACE (myusername2)
Apr 13 09:39:57 www crontab[8275]: (root) REPLACE (myusername2)
Apr 14 04:13:01 www crontab[12194]: (root) REPLACE (myusername2)
Apr 14 14:26:09 www crontab[27899]: (root) REPLACE (myusername2)
Cron log from 5 minutes before each crash
4/27/2012 crash, 04:13:52
Apr 27 04:10:01 www crond[5189]: (root) CMD (/usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log)
Apr 27 04:10:01 www crond[5192]: (root) CMD (/usr/lib/sa/sa1 1 1)
Apr 27 04:10:01 www crond[5193]: (psaadm) CMD (/usr/local/psa/admin/bin/php /opt/plesk-billing/admin/sbin/runevents.php > /dev/null 2>&1)
Apr 27 04:10:01 www crond[5195]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:10:01 www crond[5196]: (root) CMD (php /path/to/pimcore/cli/maintenance.php)
Apr 27 04:10:01 www crond[5198]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:10:01 www crond[5200]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:11:01 www crond[5711]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:12:01 www crond[6152]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 27 04:12:01 www crond[6154]: (root) CMD (lynx -dump http://www.domain.com/script
Apr 27 04:13:14 www crontab[12973]: (root) LIST (myusername1)
Apr 27 04:13:14 www crontab[12974]: (root) REPLACE (myusername1)
4/14/2012 crash, 04:14:18
Apr 14 04:10:01 www crond[4712]: (root) CMD (/usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log)
Apr 14 04:10:01 www crond[4716]: (root) CMD (/usr/lib/sa/sa1 1 1)
Apr 14 04:10:01 www crond[4718]: (psaadm) CMD (/usr/local/psa/admin/bin/php /opt/plesk-billing/admin/sbin/runevents.php > /dev/null 2>&1)
Apr 14 04:10:01 www crond[4720]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:10:01 www crond[4721]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:10:01 www crond[4722]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:10:01 www crond[4724]: (root) CMD (php /path/to/pimcore)
Apr 14 04:11:01 www crond[5190]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:12:01 www crond[5543]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:12:01 www crond[5545]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:13:01 www crontab[12193]: (root) LIST (user)
Apr 14 04:13:01 www crontab[12194]: (root) REPLACE (myusername2)
Apr 14 04:13:01 www crond[5322]: (myusername2) RELOAD (cron/myusername2)
Apr 14 04:14:01 www crond[13896]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 14 04:14:01 www crond[13897]: (root) CMD (lynx -dump http://www.domain.com/script)
4/13/2012 crash, 04:12:48
Apr 13 04:10:01 www crond[23751]: (root) CMD (/usr/share/spamassassin/sa-update.cron 2>&1 | tee -a /var/log/sa-update.log)
Apr 13 04:10:01 www crond[23754]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:10:01 www crond[23755]: (psaadm) CMD (/usr/local/psa/admin/bin/php /opt/plesk-billing/admin/sbin/runevents.php > /dev/null 2>&1)
Apr 13 04:10:01 www crond[23756]: (root) CMD (/usr/lib/sa/sa1 1 1)
Apr 13 04:10:01 www crond[23758]: (root) CMD (php /path/to/pimcore)
Apr 13 04:10:01 www crond[23760]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:10:01 www crond[23761]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:11:01 www crond[24126]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:12:01 www crond[26995]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:12:01 www crond[26996]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:12:09 www crontab[30867]: (root) LIST (myusername2)
Apr 13 04:12:09 www crontab[30868]: (root) REPLACE (myusername2)
Apr 13 04:14:01 www crond[799]: (root) CMD (lynx -dump http://www.domain.com/script)
Apr 13 04:14:01 www crond[800]: (root) CMD (lynx -dump http://www.domain.com/script)
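Pulling the minutes around each crash out of the logs by hand gets tedious, so here is a rough sketch of a filter that does it. It assumes standard syslog timestamps ("Mon DD HH:MM:SS") at the start of each line; `log_window` is a hypothetical name chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print syslog-style lines from stdin that fall on the
# given day and whose HH:MM:SS lies within [start, end].
# Usage: log_window "Apr 27" 04:10:00 04:15:00 < /var/log/cron
log_window() {
    day="$1"
    start="$2"
    end="$3"
    awk -v day="$day" -v start="$start" -v end="$end" '
        # Fields 1-3 are month, day of month, and time. Zero-padded
        # HH:MM:SS strings sort correctly under plain string comparison.
        ($1 " " $2) == day && $3 >= start && $3 <= end
    '
}
```

Because /var/log/messages uses the same timestamp layout, running the same window over it should put the named warnings next to the crontab LIST/REPLACE entries for each crash.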
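So, as a low-tech stopgap, I've set an alarm for 4:15 AM every morning to check whether my sites are down. Please help, I'm genuinely losing sleep. Thanks.
Edit 1:
/var/spool/cron/myusername1 and /var/spool/cron/myusername2 are both empty, in case it matters. Both belong to (my standard) users that I trust.
Edit 2:
Just noticed the following in /var/log/messages*: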
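Until the root cause turns up, the 4:15 AM alarm could probably be handed to cron. This is a sketch under assumptions, not a tested fix: it assumes `curl` is installed, and the URL, the `ON_DOWN` hook, and the function names are all hypothetical. Note that a plain `service httpd start` hit "Address already in use" above, so whatever goes in `ON_DOWN` would have to deal with the leftover process first.

```shell
#!/bin/sh
# Hypothetical watchdog. URL and ON_DOWN are placeholders to adapt.
URL="${URL:-http://localhost/}"
ON_DOWN="${ON_DOWN:-true}"   # command to run when the check fails (default: do nothing)

# Succeeds only if the URL answers within 10 seconds with a status below 400.
site_up() {
    curl --silent --fail --max-time 10 --output /dev/null "$1"
}

# Report the site state on stdout and run the ON_DOWN hook on failure.
check_site() {
    if site_up "$URL"; then
        echo "up"
    else
        echo "down"
        $ON_DOWN
    fi
}

# When installed as a script, end the file with a call to check_site and
# schedule it, for example with a crontab line such as:
#   */5 * * * * /path/to/this/script
```

Cron mails any output of a job to its owner by default, so printing the state only when the site is down would turn this into a cheap alert.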
Apr 27 04:13:11 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 27 04:13:14 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 13 04:12:08 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 13 04:12:08 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 14 04:12:59 www named[3541]: max open files (1024) is smaller than max sockets (4096)
Apr 14 04:13:00 www named[3541]: max open files (1024) is smaller than max sockets (4096)
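I wish I were a real admin; maybe then I could understand this better. I don't know whether it's related, but these are the only other messages I could find in my logs that line up closely with the crash dates/times.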