Identifying bottlenecks when load testing nginx on a VPS

I'm trying to optimize a DigitalOcean droplet (512 MB), using loader.io for testing.

I'm testing my home page, which is HTTPS / PHP. I set up FastCGI page caching, which took me from 100 req/sec to 2000 req/sec.

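But anything above 2000 req/sec results in a lot of timeouts and slow responses (the average goes from 20 ms to 1500 ms). I'm trying to identify the bottleneck. It isn't CPU or memory, because the load only reaches 0.30 and memory usage is about half. I tried resizing to a larger droplet, and the timeouts still occur.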

It isn't FastCGI, because load testing a basic .html file gives almost identical performance.

During the timeouts, error.log is empty. Nothing seems to be throwing errors (that I can find). kern.log has these entries:

    TCP: Possible SYN flooding on port 80. Sending cookies. Check SNMP counters.
    TCP: Possible SYN flooding on port 443. Sending cookies. Check SNMP counters.

I tried disabling syncookies, which stopped these messages, but the timeouts persisted.
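The kernel message points at the SNMP counters, which on Linux are exposed in /proc/net/netstat (the same data that `netstat -s` summarizes). As a minimal sketch, assuming a Linux host, the Python below reads the TcpExt counters so they can be compared before and after a test run. The names `parse_netstat`, `listen_counters`, and `WATCHED` are my own, and which counters exist depends on the kernel version, so missing ones are skipped.

```python
from pathlib import Path

# Counters related to SYN cookies and accept-queue pressure.
# Availability depends on the kernel version, so absent ones are skipped.
WATCHED = ("SyncookiesSent", "SyncookiesRecv", "SyncookiesFailed",
           "ListenOverflows", "ListenDrops")


def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {section: {counter: value}}.

    The file comes in pairs of lines: a header line with counter names
    and a value line, both starting with the same "Section:" prefix.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    sections = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        name, _, keys = header.partition(":")
        value_name, _, vals = values.partition(":")
        if name != value_name:
            continue  # unexpected layout; skip this pair
        sections[name] = dict(zip(keys.split(), map(int, vals.split())))
    return sections


def listen_counters(text):
    """Return only the watched TcpExt counters that are present."""
    tcp_ext = parse_netstat(text).get("TcpExt", {})
    return {k: tcp_ext[k] for k in WATCHED if k in tcp_ext}


if __name__ == "__main__":
    path = Path("/proc/net/netstat")
    if path.exists():  # Linux only
        for name, value in listen_counters(path.read_text()).items():
            print(f"{name:18} {value}")
```

If ListenOverflows or ListenDrops rises between two readings taken around a load test, that would suggest the listen queue is filling up. I have not confirmed that this is what is happening here.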

During the timeouts, I start to see TIME_WAIT connections build up:

    netstat -ntla | awk '{print $6}' | sort | uniq -c | sort -rn
       6268 ESTABLISHED
        831 TIME_WAIT
          6 LISTEN
          2 FIN_WAIT1
          1 Foreign
          1 established)
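The `Foreign` and `established)` rows in that tally are just the sixth field of the two netstat header lines. As a rough sketch of the same count with the headers filtered out, the Python function below takes captured `netstat -ntla` output as a string. The function name `count_tcp_states` and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count connections per TCP state from `netstat -ntla` output.

    Only rows whose first column starts with "tcp" are counted, so the
    two header lines do not end up in the tally.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample only; real input would be captured netstat output.
    sample = (
        "Active Internet connections (servers and established)\n"
        "Proto Recv-Q Send-Q Local Address Foreign Address State\n"
        "tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN\n"
        "tcp 0 0 10.0.0.1:443 203.0.113.5:51000 ESTABLISHED\n"
        "tcp 0 0 10.0.0.1:443 203.0.113.7:51002 TIME_WAIT\n"
    )
    for state, n in count_tcp_states(sample).most_common():
        print(n, state)
```

Running this at intervals during a test would show how the ESTABLISHED and TIME_WAIT counts change over time instead of a single snapshot.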

My question is: where else can I look to identify the bottleneck here? Are there other error logs or commands I can use to monitor this?

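Here is my nginx.conf (FastCGI and regular browser caching are in my default file). I tried multi_accept, which seemed to make the timeouts worse. I know worker_connections is ridiculous, but it doesn't seem to matter how much I raise or lower it: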

    user www-data;
    worker_processes auto;
    worker_rlimit_nofile 200000;
    pid /run/nginx.pid;

    events {
        worker_connections 200000;
        # multi_accept on;
        use epoll;
    }

    http {

        ##
        # Basic Settings
        ##

        open_file_cache max=200000 inactive=20s;
        open_file_cache_valid 30s;
        open_file_cache_min_uses 2;
        open_file_cache_errors on;

        server_tokens off;
        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        keepalive_timeout 30;
        types_hash_max_size 2048;

        include /etc/nginx/mime.types;
        default_type application/octet-stream;

        ##
        # Logging Settings
        ##

        access_log off;
        # access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;

        ##
        # Gzip Settings
        ##

        gzip on;
        gzip_disable "msie6";

        gzip_vary on;
        # gzip_proxied any;
        # gzip_comp_level 6;
        # gzip_buffers 16 8k;
        # gzip_http_version 1.1;
        gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;

        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
    }

Here is my sysctl.conf:

    ### IMPROVE SYSTEM MEMORY MANAGEMENT ###

    # Increase size of file handles and inode cache
    fs.file-max = 2097152

    # Do less swapping
    vm.swappiness = 10
    vm.dirty_ratio = 60
    vm.dirty_background_ratio = 2

    ### GENERAL NETWORK SECURITY OPTIONS ###

    # Number of times SYNACKs for passive TCP connection.
    net.ipv4.tcp_synack_retries = 2

    # Allowed local port range
    net.ipv4.ip_local_port_range = 2000 65535

    # Protect Against TCP Time-Wait
    net.ipv4.tcp_rfc1337 = 1

    # Decrease the time default value for tcp_fin_timeout connection
    net.ipv4.tcp_fin_timeout = 15

    # Decrease the time default value for connections to keep alive
    net.ipv4.tcp_keepalive_time = 300
    net.ipv4.tcp_keepalive_probes = 5
    net.ipv4.tcp_keepalive_intvl = 15

    net.ipv4.tcp_syncookies = 1

    ### TUNING NETWORK PERFORMANCE ###

    # Default Socket Receive Buffer
    net.core.rmem_default = 31457280

    # Maximum Socket Receive Buffer
    net.core.rmem_max = 12582912

    # Default Socket Send Buffer
    net.core.wmem_default = 31457280

    # Maximum Socket Send Buffer
    net.core.wmem_max = 12582912

    # Increase number of incoming connections
    net.core.somaxconn = 4096

I put these in limits.conf:

    * hard nofile 500000
    * soft nofile 500000
    root hard nofile 500000
    root soft nofile 500000