Segfault error 4 in libmysqlclient.so

For a few days now, I have been seeing messages like this in my syslog:

    Sep 23 14:28:42 server kernel: [138926.637593] php5-fpm[6455]: segfault at 7f9ade735018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:28:44 server kernel: [138928.314016] php5-fpm[22742]: segfault at 7f9ade3db018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:32:11 server kernel: [139135.318287] php5-fpm[16887]: segfault at 7f9ade4b3018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:32:49 server kernel: [139173.050377] php5-fpm[668]: segfault at 7f9ade61a018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:33:19 server kernel: [139203.396935] php5-fpm[26277]: segfault at 7f9ade6c0018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:35:06 server kernel: [139310.048740] php5-fpm[27017]: segfault at 7f9ade46c018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:35:19 server kernel: [139323.494188] php5-fpm[31263]: segfault at 7f9ade5e2018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:36:10 server kernel: [139374.904308] php5-fpm[26422]: segfault at 7f9ade6cf018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 14:37:25 server kernel: [139449.360384] php5-fpm[20806]: segfault at 7f9ade644018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
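
For anyone reading these lines, the fields can be decoded mechanically. The following is only a minimal sketch: it assumes the line format shown above and the usual x86 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name `decode_segfault` is made up for this example.

```python
import re

# Regex for the kernel segfault line format shown in the log above.
# Assumption: other kernel versions may print the line differently.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of one segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)   # the kernel prints the error code in hex
    ip = int(m.group("ip"), 16)     # instruction pointer at the time of the fault
    base = int(m.group("base"), 16) # load address of the mapped library
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "library": m.group("lib"),
        # Offset of the faulting instruction inside the library mapping.
        "offset_in_library": hex(ip - base),
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


sample = (
    "Sep 23 14:28:42 server kernel: [138926.637593] php5-fpm[6455]: "
    "segfault at 7f9ade735018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 "
    "error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]"
)
print(decode_segfault(sample))
```

With the first line of the log, this reports a user-mode read of a page that is not mapped, at offset 0x2d772 inside libmysqlclient.so.18.0.0. Every line above has the same `ip`, so the crash is always at the same instruction in the library.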

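I am running Debian 8 with MariaDB. At first this happened only once every two or three hours, but now it happens several times an hour. After some research, I understand it is probably a memory problem, but I have not found a way to solve it.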

This is what I see in mysqltuner:

    -------- Storage Engine Statistics -------------------------------------------
    [--] Status: +ARCHIVE +Aria +BLACKHOLE +CSV +FEDERATED +InnoDB +MRG_MyISAM
    [--] Data in InnoDB tables: 2G (Tables: 79)
    [--] Data in MyISAM tables: 96M (Tables: 146)
    [--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 52)
    [!!] Total fragmented tables: 34

    -------- Security Recommendations -------------------------------------------
    [OK] All database users have passwords assigned

    -------- Performance Metrics -------------------------------------------------
    [--] Up for: 1d 16h 44m 38s (73M q [502.853 qps], 196K conn, TX: 572B, RX: 14B)
    [--] Reads / Writes: 97% / 3%
    [--] Total buffers: 17.3G global + 56.2M per thread (500 max threads)
    [!!] Maximum possible memory usage: 44.8G (142% of installed RAM)
    [OK] Slow queries: 0% (2K/73M)
    [OK] Highest usage of available connections: 28% (141/500)
    [OK] Key buffer size / total MyISAM indexes: 1.0G/32.6M
    [OK] Key buffer hit rate: 100.0% (132M cached / 53K reads)
    [OK] Query cache efficiency: 44.9% (50M cached / 113M selects)
    [!!] Query cache prunes per day: 260596
    [OK] Sorts requiring temporary tables: 0% (2K temp sorts / 2M sorts)
    [OK] Temporary tables created on disk: 21% (6K on disk / 28K total)
    [OK] Thread cache hit rate: 99% (141 created / 196K connections)
    [OK] Table cache hit rate: 72% (500 open / 692 opened)
    [OK] Open file limit used: 17% (429/2K)
    [OK] Table locks acquired immediately: 99% (25M immediate / 25M locks)
    [OK] InnoDB buffer pool / data size: 16.0G/2.4G
    [!!] InnoDB log waits: 30

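So the maximum possible memory usage is too high, but I set my InnoDB buffer pool size to 16 GB, which should be fine with 32 GB of RAM, and I don't know how to optimize this.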
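
To make the arithmetic behind that warning explicit, here is a minimal sketch. It assumes mysqltuner computes the worst case as global buffers plus per-thread buffers times the connection limit. The three input figures are copied from the report above; the function name is made up for this example.

```python
# Figures copied from the mysqltuner "Performance Metrics" section above.
GLOBAL_BUFFERS_GIB = 17.3   # "Total buffers: 17.3G global"
PER_THREAD_MIB = 56.2       # "+ 56.2M per thread"
MAX_THREADS = 500           # "(500 max threads)"
PEAK_THREADS = 141          # "Highest usage of available connections: 141/500"


def worst_case_memory_gib(global_gib, per_thread_mib, threads):
    """Global buffers plus per-thread buffers for the given thread count, in GiB."""
    return global_gib + per_thread_mib * threads / 1024


at_limit = worst_case_memory_gib(GLOBAL_BUFFERS_GIB, PER_THREAD_MIB, MAX_THREADS)
at_peak = worst_case_memory_gib(GLOBAL_BUFFERS_GIB, PER_THREAD_MIB, PEAK_THREADS)

print(f"all {MAX_THREADS} connections busy: {at_limit:.1f}G")
print(f"{PEAK_THREADS} connections (observed peak): {at_peak:.1f}G")
```

The first figure comes out at roughly 44.7G, which matches the reported 44.8G up to rounding of the inputs. The second is only about 25G, so the 142% figure is a theoretical ceiling that is reached only if every allowed connection allocates all of its buffers at once.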

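The thing is, overall memory usage on the server always stays below 89% (plus caching). MySQL actually uses 50.6% of the RAM. I don't know whether all of this is related, but I prefer to mention it here. Otherwise, everything seems fine on the MySQL side…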

Finally, here are the main variables I tuned in my.cnf that may have an effect on this:

    max_connections = 100
    max_heap_table_size = 64M
    read_buffer_size = 4M
    read_rnd_buffer_size = 32M
    sort_buffer_size = 8M
    query_cache_size = 256M
    query_cache_limit = 4M
    query_cache_type = 1
    query_cache_strip_comments = 1
    thread_stack = 192K
    transaction_isolation = READ-COMMITTED
    tmp_table_size = 64M
    innodb_additional_mem_pool_size = 16M
    innodb_buffer_pool_size = 16G
    thread_cache_size = 4M
    max_connections = 500
    join_buffer_size = 12M
    interactive_timeout = 30
    wait_timeout = 30
    open_files_limit = 800
    innodb_file_per_table
    key_buffer_size = 1G
    table_open_cache = 500
    innodb_log_file_size = 256M
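
As a cross-check, the "56.2M per thread" that mysqltuner reports can be reproduced from these settings. This is a sketch under one assumption of mine: that the per-thread figure is the sum of the read, random-read, sort and join buffers plus the thread stack. The helper `parse_size` is made up for this example.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a my.cnf size such as '4M' or '192K' to bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


# Per-connection buffers, values copied from the my.cnf excerpt above.
# Assumption: these five settings are what mysqltuner sums per thread.
per_thread_settings = {
    "read_buffer_size": "4M",
    "read_rnd_buffer_size": "32M",
    "sort_buffer_size": "8M",
    "join_buffer_size": "12M",
    "thread_stack": "192K",
}

per_thread_bytes = sum(parse_size(v) for v in per_thread_settings.values())
per_thread_mib = per_thread_bytes / 1024 ** 2

# max_connections appears twice in the excerpt; mysqltuner reports 500.
max_connections = 500
total_gib = per_thread_bytes * max_connections / 1024 ** 3

print(f"per thread: {per_thread_mib:.4f} MiB")
print(f"for {max_connections} connections: {total_gib:.2f} GiB")
```

The sum is 56.1875 MiB, which rounds to the reported 56.2M. Most of it comes from read_rnd_buffer_size (32M) and join_buffer_size (12M), so those two settings and max_connections dominate the worst-case estimate.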

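Two days ago, the server crashed for no apparent reason, and the last thing in syslog was one of these segfaults. Can a segfault crash the whole system? Any idea what is causing the segfaults? Are there ways to track down the root of the problem?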

OK, so these segfaults correspond to errors on my client's website. For these segfaults:

    Sep 23 21:39:07 server kernel: [164772.739464] php5-fpm[10375]: segfault at 7f9ade4be018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 22:08:47 server kernel: [166554.093566] php5-fpm[12582]: segfault at 7f9ade5d9018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 22:09:17 server kernel: [166584.252833] php5-fpm[31132]: segfault at 7f9ade669018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 22:10:30 server kernel: [166656.875041] php5-fpm[24992]: segfault at 7f9ade6e9018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 22:39:19 server kernel: [168387.935728] php5-fpm[27380]: segfault at 7f9ade654018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 22:48:55 server kernel: [168963.750727] php5-fpm[6309]: segfault at 7f9ade371018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 22:49:33 server kernel: [169001.961155] php5-fpm[15416]: segfault at 7f9ade40f018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 23:04:22 server kernel: [169892.025852] php5-fpm[5763]: segfault at 7f9ade422018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 23:04:50 server kernel: [169919.986130] php5-fpm[16016]: segfault at 7f9ade5cf018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]
    Sep 23 23:09:11 server kernel: [170181.423023] php5-fpm[5129]: segfault at 7f9ade6cc018 ip 00007f9ae4026772 sp 00007ffd69b4fad0 error 4 in libmysqlclient.so.18.0.0[7f9ae3ff9000+2f1000]

I get errors in my nginx log at exactly the same times, so it is not a coincidence:

    2017/09/23 21:39:07 [error] 808#0: *7346034 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /page/a-page HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com", referrer: "https://example.com/page/another-page"
    2017/09/23 22:08:47 [error] 808#0: *7443193 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /engine/search?q=query", upstream: "fastcgi://127.0.0.1:9000", host: "example.com"
    2017/09/23 22:09:17 [error] 808#0: *7450102 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com"
    2017/09/23 22:10:30 [error] 808#0: *7450989 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /page/a-page HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com", referrer: "https://example.com/page/another-page"
    2017/09/23 22:39:19 [error] 808#0: *7553713 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 21.1.1.1, server: example.com, request: "GET /page/a-page HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com", referrer: "https://example.com/engine/search?q=query"
    2017/09/23 22:48:55 [error] 7800#0: *7578116 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /engine/search?q=query HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com"
    2017/09/23 22:49:33 [error] 7800#0: *7582962 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /page/a-page HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com"
    2017/09/23 23:04:22 [error] 7800#0: *7639594 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /page/a-page HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com"
    2017/09/23 23:04:50 [error] 7800#0: *7636486 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /page/a-page HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com", referrer: "https://example.com/engine/search?q=query"
    2017/09/23 23:09:11 [error] 7800#0: *7654327 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 1.1.1.1, server: example.com, request: "GET /engine/search?q=query3 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "example.com"

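The timestamp match between the two logs can also be checked with a script rather than by eye. Below is a minimal sketch, assuming the syslog and nginx timestamp formats shown above, an English/C locale for month names, and the year 2017 for the syslog lines (syslog does not record the year). The function names are made up for this example, and the sample strings are only the leading parts of lines from the two logs.

```python
from datetime import datetime


def syslog_time(line, year):
    """Parse the 'Sep 23 21:39:07' prefix; syslog has no year, so pass one in."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def nginx_time(line):
    """Parse the '2017/09/23 21:39:07' prefix of an nginx error log line."""
    return datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")


def correlate(segfault_lines, nginx_lines, year, tolerance_s=1):
    """For each segfault, list the nginx errors within tolerance_s seconds."""
    nginx_times = [nginx_time(line) for line in nginx_lines]
    result = []
    for line in segfault_lines:
        t = syslog_time(line, year)
        hits = [n for n in nginx_times
                if abs((n - t).total_seconds()) <= tolerance_s]
        result.append((t, hits))
    return result


# Leading parts of three lines from each log above.
segfaults = [
    "Sep 23 21:39:07 server kernel: [164772.739464] php5-fpm[10375]: segfault",
    "Sep 23 22:08:47 server kernel: [166554.093566] php5-fpm[12582]: segfault",
    "Sep 23 23:09:11 server kernel: [170181.423023] php5-fpm[5129]: segfault",
]
nginx_errors = [
    "2017/09/23 21:39:07 [error] 808#0: *7346034 recv() failed (104: Connection reset by peer)",
    "2017/09/23 22:08:47 [error] 808#0: *7443193 recv() failed (104: Connection reset by peer)",
    "2017/09/23 23:09:11 [error] 7800#0: *7654327 recv() failed (104: Connection reset by peer)",
]

for when, hits in correlate(segfaults, nginx_errors, year=2017):
    print(when, "->", len(hits), "nginx error(s)")
```

Each segfault in the sample has exactly one nginx error in the same second, which is consistent with the php5-fpm worker dying mid-request and nginx seeing its upstream connection reset.
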
So here is my nginx virtual host for this website:

    server {
        listen 80;
        listen 443;
        server_name example.com;

        if ($http_x_forwarded_proto = "http") {
            return 301 https://$server_name$request_uri;
        }

        ssl on;
        ssl_certificate /etc/nginx/cloudflare_cert/fullchain.pem;
        ssl_certificate_key /etc/nginx/cloudflare_cert/privkey.pem;
        include /etc/nginx/conf.d/cloudflare;

        root /var/www/website;
        index index.php;

        location / {
            # main rewrite rule
            try_files $uri $uri/ /index.php?/$request_uri;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forward-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forward-Proto http;
            proxy_set_header X-Nginx-Proxy true;
        }

        # php parsing
        location ~ \.php$ {
            include fastcgi_params;
            fastcgi_pass 127.0.0.1:9000;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        }

        error_log /var/log/nginx/example_error.log;
        access_log /var/log/nginx/example_access.log;
    }

    server {
        listen 80;
        listen 443;
        server_name www.example.com;
        return 301 https://example.com$request_uri;
    }

So the problem comes from the client's website. Now I have to understand when "recv() failed (104: Connection reset by peer) while reading response header from upstream" occurs and how to fix it.