我在1GB的单核VPS上运行Nginx 1.5.1和PHP-FPM(PHP 5.3.26)的Drupal 6,SSD存储上有3GB的交换空间。 我刚刚从共享主机切换到这个非托pipe的VPS,因为我的网站变得太重了,所以我仍然在学习绳索。 我有很高的stream量,我没有真正注意到,但谷歌Adsense通常每天接近3万页的浏览量。 我通常有50到80个经过身份validation的用户login,另外还有几百个匿名用户在任何特定时间点击Boost静态HTMLcaching。 我最多configuration了10个PHP-FPMsubprocess。 我正在使用“ondemand”PHP-FPM进程pipe理器。
我偶尔遇到一个很难debugging的错误,因为它看起来是随机的。 用户可能有30多个故意的post,其中1个被复制。 我configuration了它,所以后button被禁用后,第一次点击,所以这不是双击的结果。 实际上,当我发布的时候,它甚至发生在我身上。 双职位发生在几秒钟内。 我检查了日志文件,并且重复的post似乎总是对应于nginx的POST错误: recv() failed (104: Connection reset by peer) while reading response header from upstream 。 而这个事件似乎涉及index.php和PHP-FPM工作进程的后续SIGTERM执行超时错误的PHP-FPM错误。
以下是nginx访问和错误日志以及PHP-FPM错误日志:
nginx_acess.log文件:
1.2.3.4 - - [02/Jul/2013:12:34:34 -0500] "POST /comment/reply/22802/420734?quote=1 HTTP/1.1" 302 5 "http://example.com/comment/reply/22802/420734?quote=1" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:34 -0500] "GET /node/22802 HTTP/1.1" 200 18775 "http://example.com/comment/reply/22802/420734?quote=1" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:35 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21333 HTTP/1.1" 200 707 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:35 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21121 HTTP/1.1" 200 707 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:35 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21122 HTTP/1.1" 200 707 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:43 -0500] "GET /comment/delete/420748 HTTP/1.1" 200 5262 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:44 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=20342 HTTP/1.1" 200 707 "http://example.com/comment/delete/420748" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:44 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21333 HTTP/1.1" 200 707 "http://example.com/comment/delete/420748" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:44 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21121 HTTP/1.1" 200 707 "http://example.com/comment/delete/420748" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:45 -0500] "POST /comment/delete/420748 HTTP/1.1" 302 5 "http://example.com/comment/delete/420748" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:46 -0500] "GET /node/22802 HTTP/1.1" 200 18533 "http://example.com/comment/delete/420748" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:47 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21406 HTTP/1.1" 200 707 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:47 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=21121 HTTP/1.1" 200 707 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0" 1.2.3.4 - - [02/Jul/2013:12:34:47 -0500] "GET /sites/all/modules/ad/serve.php?o=image&a=20343 HTTP/1.1" 200 707 "http://example.com/node/22802" "Mozilla/5.0 (X11; Linux i686; rv:21.0) Gecko/20100101 Firefox/21.0"
nginx_error.log文件:
2013/07/02 11:12:52 [error] 1821#0: *2140 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/node/22802" 2013/07/02 11:16:23 [error] 1821#0: *3020 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/node/22802" 2013/07/02 11:18:13 [error] 1821#0: *3375 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /node/22763 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/node/22763" 2013/07/02 11:18:43 [error] 1821#0: *3576 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /comment/edit/420694 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/node/22763" 2013/07/02 11:19:33 [error] 1821#0: *3576 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /node/22763 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/node/22763/edit" 2013/07/02 11:22:33 [error] 1821#0: *4397 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /forum HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/" 2013/07/02 11:29:23 [error] 1821#0: *5811 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /node/22470 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/" 2013/07/02 11:34:43 [error] 1821#0: *6794 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /recent-posts HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/forum" 2013/07/02 11:41:33 [error] 1821#0: *8082 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /sites/all/modules/ad/serve.php?o=image&a=20343 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/node/22802" 2013/07/02 11:50:03 [error] 1821#0: *9435 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /forum HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/" 2013/07/02 11:55:21 [error] 1821#0: *10378 open() "/var/www/drupal6/sites/all/modules/smileys/packs/Roving/no-swear.png" failed (2: No such file or directory), client: 1.2.3.4, server: example.com, request: "GET /sites/all/modules/smileys/packs/Roving/no-swear.png HTTP/1.1", host: "example.com", referrer: "http://example.com/node/22802/edit" 2013/07/02 12:02:33 [error] 1821#0: *11677 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /user/5170/track/navigation HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/user/5170" 2013/07/02 12:03:03 [error] 1821#0: *11736 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET /node/15888 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/user/5170/track/navigation" 2013/07/02 12:15:23 [error] 1821#0: *13882 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/admin/reports/access/44258972" 2013/07/02 12:34:33 [error] 1821#0: *17088 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.2.3.4, server: example.com, request: "POST /comment/reply/22802/420734?quote=1 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9532", host: "example.com", referrer: "http://example.com/comment/reply/22802/420734?quote=1"
php-fpm_error.log文件:
[02-Jul-2013 12:34:13] WARNING: [pool www] child 5768, script '/var/www/drupal6/index.php' (request: "GET /index.php") execution timed out (39.990074 sec), terminating [02-Jul-2013 12:34:13] WARNING: [pool www] child 5767, script '/var/www/drupal6/index.php' (request: "GET /index.php") execution timed out (40.002037 sec), terminating [02-Jul-2013 12:34:13] WARNING: [pool www] child 5767 exited on signal 15 (SIGTERM) after 50.005181 seconds from start [02-Jul-2013 12:34:13] NOTICE: [pool www] child 5796 started [02-Jul-2013 12:34:13] WARNING: [pool www] child 5768 exited on signal 15 (SIGTERM) after 40.019244 seconds from start [02-Jul-2013 12:34:13] NOTICE: [pool www] child 5797 started [02-Jul-2013 12:34:33] WARNING: [pool www] child 5769, script '/var/www/drupal6/index.php' (request: "POST /index.php") execution timed out (59.990557 sec), terminating [02-Jul-2013 12:34:33] WARNING: [pool www] child 5769 exited on signal 15 (SIGTERM) after 60.014359 seconds from start [02-Jul-2013 12:34:33] NOTICE: [pool www] child 5801 started
他们被修剪和稍微混淆,以显示一个重复的post错误,发生在今天12:34我的IP地址为“1.2.3.4”。 发生该事件的节点是22802。
当我在我以前的共享虚拟主机上运行Apache / FastCGI时,没有发生这个问题。 我还应该提到,我正在使用Redis进行caching,并使用Zend Optimizer + opcache。 但我试图禁用这两种机制,以避免愚蠢的错误,没有任何区别。
感谢您的任何帮助,您可以提供!
那么,我要回答我自己的问题。 这个问题显然与我使用的request_terminate_timeout = 30s值有关,可能与ondemand FPM进程pipe理器结合使用。 双重post总是与PHP-FPM超时错误相结合,之后立即杀死一个subprocess。 所以我禁用了request_terminate_timeout ,它似乎是多余的,因为php.ini文件已经指定了30秒的超时时间。 我也意识到,我并不需要ondemand进程pipe理器,因为我是这个盒子上唯一具有相当稳定负载的用户,所以我切换到static并将pm.max_requests设置为100,这样可以防止内存泄漏。
其中一个或两个这些变化已经有效地消除了重复的post。