Tracking down high Linux load: slow hard disk or too many interrupts? (ksoftirqd TIME+ 437:44.13)

Server stats:

`cat /proc/version` output:

Linux version 2.6.18-308.24.1.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)) #1 SMP Tue Dec 4 17:43:34 EST 2012 

`ethtool eth0` output:

    Settings for eth0:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supports auto-negotiation: Yes
            Advertised link modes:  10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Advertised auto-negotiation: Yes
            Speed: 1000Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 1
            Transceiver: internal
            Auto-negotiation: on
            Supports Wake-on: pumbg
            Wake-on: g
            Current message level: 0x00000001 (1)
            Link detected: yes

`cat /proc/cpuinfo | grep MHz` output:

    cpu MHz         : 3201.000
    cpu MHz         : 3201.000
    cpu MHz         : 3201.000
    cpu MHz         : 3201.000
    cpu MHz         : 3201.000
    cpu MHz         : 3201.000
    cpu MHz         : 3201.000
    cpu MHz         : 3201.000

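I'm not very good with Linux, and I've been trying to track down why this server has such a high load. I believe it's either because the hard disk isn't fast enough or because there are too many interrupts, since the "ksoftirqd" process sometimes uses a lot of CPU and has accumulated a very long run time.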

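I've been searching online for how to diagnose this properly. I believe I've found how to pull the relevant information, but unfortunately the results still leave me confused.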

`top` output:

    top - 08:40:31 up 132 days,  2:06,  2 users,  load average: 84.25, 63.29, 63.02
    Tasks: 3214 total,   8 running, 3206 sleeping,   0 stopped,   0 zombie
    Cpu(s): 18.6%us,  3.2%sy,  0.0%ni, 41.1%id, 26.8%wa,  0.3%hi,  9.9%si,  0.0%st
    Mem:  32934596k total, 25811556k used,  7123040k free,   329988k buffers
    Swap:  4194296k total,      128k used,  4194168k free, 10888060k cached

      PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
     1846 nobody   16  0  125m  12m 4944 S  3.9  0.0   0:00.47 httpd
    13490 nobody   15  0  126m  13m 5064 S  2.9  0.0   0:02.37 httpd
    20137 everprox 16  0  127m  13m 4908 D  2.6  0.0   0:00.76 httpd
     1827 everprox 15  0  127m  13m 4924 S  2.0  0.0   0:00.50 httpd
    16574 root     15  0 15120 3480  812 R  2.0  0.0   0:00.15 top
    16894 nobody   15  0  126m  84m  944 S  2.0  0.3 946:10.13 nginx
     6347 root     16  0 15112 3552  816 S  1.6  0.0   3:16.46 top
     7115 named    25  0  422m  52m 2084 S  1.6  0.2   4089:16 named
    16575 everprox 16  0  126m  11m 3992 D  1.6  0.0   0:00.05 httpd
    16891 nobody   15  0  149m  89m  944 S  1.6  0.3 939:49.39 nginx
    16892 nobody   15  0  126m  84m  944 S  1.6  0.3 940:41.47 nginx
    26041 everprox 15  0  126m  13m 5076 S  1.6  0.0   0:01.55 httpd
    26113 nobody   15  0  126m  13m 5024 S  1.6  0.0   0:02.46 httpd
     4345 everprox 15  0  126m  13m 5040 S  1.3  0.0   0:01.82 httpd
    13131 everprox 15  0  125m  12m 5072 S  1.3  0.0   0:01.82 httpd
    14058 everprox 15  0  127m  13m 5132 D  1.3  0.0   0:01.57 httpd
    14554 nobody   15  0  126m  13m 4896 S  1.3  0.0   0:00.74 httpd
    26209 everprox 15  0  126m  13m 5044 S  1.3  0.0   0:03.08 httpd
    26283 everprox 16  0  125m  12m 5108 D  1.3  0.0   0:02.06 httpd
     4360 everprox 15  0  126m  13m 5088 S  1.0  0.0   0:01.93 httpd
    12997 everprox 15  0  126m  13m 5052 S  1.0  0.0   0:03.33 httpd
    13351 nobody   15  0  127m  13m 5168 S  1.0  0.0   0:02.43 httpd
    13705 everprox 15  0  126m  13m 5076 D  1.0  0.0   0:01.55 httpd
    13870 nobody   16  0  126m  13m 5088 S  1.0  0.0   0:02.73 httpd
    13931 nobody   15  0  126m  13m 5064 S  1.0  0.0   0:02.57 httpd
    14008 everprox 15  0  127m  13m 5156 D  1.0  0.0   0:03.39 httpd
    14009 everprox 15  0  126m  13m 5064 D  1.0  0.0   0:01.94 httpd
    14215 everprox 15  0  126m  13m 5044 S  1.0  0.0   0:01.68 httpd
    14550 everprox 16  0  126m  12m 5088 D  1.0  0.0   0:02.73 httpd
    14556 nobody   15  0  126m  13m 5096 S  1.0  0.0   0:03.57 httpd
    14587 everprox 15  0  126m  12m 5072 S  1.0  0.0   0:03.74 httpd
    14625 nobody   15  0  126m  13m 5108 S  1.0  0.0   0:02.93 httpd
    14671 everprox 15  0  126m  13m 5048 S  1.0  0.0   0:02.92 httpd
    16893 nobody   15  0  125m  81m  944 R  1.0  0.3 936:15.00 nginx
    16896 nobody   15  0  127m  87m  944 S  1.0  0.3 939:30.33 nginx
    16897 nobody   15  0  122m  84m  944 R  1.0  0.3 939:11.18 nginx
    20121 nobody   16  0  125m  11m 4752 S  1.0  0.0   0:00.63 httpd
    20122 everprox 16  0  126m  13m 5036 D  1.0  0.0   0:00.60 httpd
    25391 everprox 16  0  126m  13m 5108 D  1.0  0.0   0:02.74 httpd
    25463 everprox 15  0  126m  13m 5036 D  1.0  0.0   0:02.45 httpd
    25514 everprox 16  0  126m  13m 5096 D  1.0  0.0   0:01.03 httpd
    26130 everprox 15  0  126m  13m 5048 D  1.0  0.0   0:01.42 httpd
    26220 nobody   15  0  126m  13m 5068 S  1.0  0.0   0:03.15 httpd
     1833 nobody   16  0  126m  12m 4976 S  0.7  0.0   0:00.40 httpd
     4364 everprox 15  0  125m  12m 5020 S  0.7  0.0   0:02.01 httpd
     4370 nobody   16  0  126m  13m 5076 S  0.7  0.0   0:02.02 httpd
     5499 everprox 15  0  126m  12m 4972 S  0.7  0.0   0:00.54 httpd
     5507 everprox 16  0  126m  13m 5004 D  0.7  0.0   0:00.50 httpd
    12984 everprox 16  0  127m  13m 5064 D  0.7  0.0   0:01.84 httpd
    13004 everprox 15  0  126m  13m 5056 S  0.7  0.0   0:02.81 httpd
    13029 everprox 16  0  126m  13m 5048 D  0.7  0.0   0:02.65 httpd

`free -mt` output:

    root@echo [~]# free -mt
                 total       used       free     shared    buffers     cached
    Mem:         32162      25219       6943          0        322      10690
    -/+ buffers/cache:      14206      17956
    Swap:         4095          0       4095
    Total:       36258      25219      11039

`iostat` output:

    root@echo [~]# iostat
    Linux 2.6.18-308.24.1.el5 (echo.uk7.org)    10/17/2013

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
              26.95    0.08   12.17    3.42    0.00   57.38

    Device:            tps   Blk_read/s   Blk_wrtn/s    Blk_read     Blk_wrtn
    sda             111.64        19.88      2038.06   226836250  23259888204
    sda1              0.00         0.00         0.00        2688         1076
    sda2            111.64        19.88      2038.06   226833282  23259887128
    dm-0            255.26        19.88      2038.06   226831554  23259887880
    dm-1              0.00         0.00         0.00        1160          344

`sar -I SUM` output:

    Linux 2.6.18-308.24.1.el5 (echo.uk7.org)    10/17/2013

    12:00:01 AM      INTR    intr/s
    12:10:01 AM       sum  17315.21
    12:20:01 AM       sum  23640.63
    12:30:05 AM       sum  26005.42
    12:40:05 AM       sum  27051.29
    12:50:01 AM       sum  25887.09
    01:00:01 AM       sum  25915.91
    01:10:02 AM       sum  25643.99
    01:20:01 AM       sum  25590.73
    01:30:01 AM       sum  25843.38
    01:40:01 AM       sum  25817.66
    01:50:01 AM       sum  25937.93
    02:00:03 AM       sum  25836.42
    02:10:01 AM       sum  25850.17
    02:20:01 AM       sum  25788.77
    02:30:01 AM       sum  25680.55
    02:40:01 AM       sum  25871.60
    02:50:01 AM       sum  27089.20
    03:00:01 AM       sum  26069.86
    03:10:01 AM       sum  26368.91
    03:20:01 AM       sum  25977.64
    03:30:04 AM       sum  26038.12
    03:40:05 AM       sum  26278.10
    03:50:02 AM       sum  25988.70
    04:00:04 AM       sum  26723.36
    04:10:05 AM       sum  26150.12
    04:20:03 AM       sum  25904.27
    04:30:01 AM       sum  26030.90
    04:40:09 AM       sum  25714.96
    04:50:10 AM       sum  25732.73
    05:00:01 AM       sum  24374.81
    05:10:01 AM       sum  21990.37
    05:20:01 AM       sum  22917.79
    05:30:03 AM       sum  22847.98
    05:40:03 AM       sum  24926.45
    05:50:01 AM       sum  24986.11
    06:00:01 AM       sum  24935.01
    06:10:04 AM       sum  25438.65
    06:20:01 AM       sum  25430.91
    06:30:03 AM       sum  26959.88
    06:40:01 AM       sum  26723.60
    06:50:01 AM       sum  26422.57
    07:00:01 AM       sum  26052.94
    07:10:07 AM       sum  27915.00
    07:20:01 AM       sum  25868.20
    07:30:06 AM       sum  25811.18
    07:40:05 AM       sum  25843.82
    07:50:01 AM       sum  25814.03
    08:00:01 AM       sum  25554.51
    08:10:01 AM       sum  24948.75
    08:20:01 AM       sum  25413.89
    08:30:06 AM       sum  25860.78
    08:40:01 AM       sum  25819.49
    Average:          sum  25512.26

`sar -w` output:

    Linux 2.6.18-308.24.1.el5 (echo.uk7.org)    10/17/2013

    12:00:01 AM    cswch/s
    12:10:01 AM  150959.09
    12:20:01 AM  108496.38
    12:30:05 AM   32508.30
    12:40:05 AM   17555.99
    12:50:01 AM   21667.90
    01:00:01 AM   89007.13
    01:10:02 AM   95902.66
    01:20:01 AM   83193.93
    01:30:01 AM   76984.23
    01:40:01 AM   82111.94
    01:50:01 AM   77520.72
    02:00:03 AM   39197.94
    02:10:01 AM   22047.28
    02:20:01 AM   21469.65
    02:30:01 AM   26522.87
    02:40:01 AM   63104.71
    02:50:01 AM   85472.19
    03:00:01 AM   40869.59
    03:10:01 AM   34278.48
    03:20:01 AM   15844.37
    03:30:04 AM   16504.44
    03:40:05 AM   25177.02
    03:50:02 AM   18018.24
    04:00:04 AM   27187.20
    04:10:05 AM   29010.02
    04:20:03 AM   40022.62
    04:30:01 AM   69535.67
    04:40:09 AM   96043.34
    04:50:10 AM   82239.90
    05:00:01 AM  128834.10
    05:10:01 AM  167916.98
    05:20:01 AM  130773.27
    05:30:03 AM  125977.75
    05:40:03 AM  112561.88
    05:50:01 AM   94872.38
    06:00:01 AM   98417.10
    06:10:04 AM   91611.66
    06:20:01 AM   94804.15
    06:30:03 AM   75834.69
    06:40:01 AM   54488.51
    06:50:01 AM   24460.81
    07:00:01 AM   16950.60
    07:10:07 AM   24471.96
    07:20:01 AM   16379.81
    07:30:06 AM   15711.76
    07:40:05 AM   15708.03
    07:50:01 AM   16305.04
    08:00:01 AM   18454.64
    08:10:01 AM   73621.10
    08:20:01 AM   57868.75
    08:30:06 AM   15440.36
    08:40:01 AM   14954.61
    08:50:01 AM   14906.57
    Average:      58290.70

`sar -d 5 0` output:

    root@echo [~]# sar -d 5 0
    Linux 2.6.18-308.24.1.el5 (echo.uk7.org)    10/17/2013

    08:52:50 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
    08:52:55 AM    dev8-0    104.40      0.00   1760.00     16.86     19.86    190.26      1.64     17.12
    08:52:55 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
    08:52:55 AM    dev8-2    104.40      0.00   1760.00     16.86     19.86    190.26      1.64     17.12
    08:52:55 AM  dev253-0    220.00      0.00   1760.00      8.00     40.12    182.36      0.78     17.12
    08:52:55 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

    08:52:55 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
    08:53:00 AM    dev8-0     98.40      0.00   1771.20     18.00     17.44    177.22      1.62     15.92
    08:53:00 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
    08:53:00 AM    dev8-2     98.40      0.00   1771.20     18.00     17.44    177.22      1.62     15.92
    08:53:00 AM  dev253-0    221.40      0.00   1771.20      8.00     36.61    165.36      0.72     15.92
    08:53:00 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

    08:53:00 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
    08:53:05 AM    dev8-0    109.20      0.00   1916.80     17.55     18.26    167.25      1.75     19.14
    08:53:05 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
    08:53:05 AM    dev8-2    109.20      0.00   1916.80     17.55     18.26    167.25      1.75     19.14
    08:53:05 AM  dev253-0    239.60      0.00   1916.80      8.00     26.30    109.78      0.80     19.14
    08:53:05 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

    08:53:05 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
    08:53:10 AM    dev8-0    104.79      0.00   2000.80     19.09     18.60    177.46      1.68     17.62
    08:53:10 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
    08:53:10 AM    dev8-2    104.79      0.00   2000.80     19.09     18.60    177.46      1.68     17.62
    08:53:10 AM  dev253-0    250.10      0.00   2000.80      8.00     38.19    152.70      0.70     17.62
    08:53:10 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

    08:53:10 AM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
    08:53:15 AM    dev8-0    174.35      0.00   3148.70     18.06     21.08    120.73      1.63     28.36
    08:53:15 AM    dev8-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
    08:53:15 AM    dev8-2    174.35      0.00   3148.70     18.06     21.08    120.73      1.63     28.36
    08:53:15 AM  dev253-0    393.59      0.00   3148.70      8.00     39.29     99.81      0.72     28.36
    08:53:15 AM  dev253-1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

`sar -W` output:

    12:00:01 AM  pswpin/s pswpout/s
    12:10:01 AM      0.00      0.00
    12:20:01 AM      0.00      0.00
    12:30:05 AM      0.00      0.00
    12:40:05 AM      0.00      0.00
    12:50:01 AM      0.00      0.00
    01:00:01 AM      0.00      0.00
    01:10:02 AM      0.00      0.00
    01:20:01 AM      0.00      0.00
    01:30:01 AM      0.00      0.00
    01:40:01 AM      0.00      0.00
    01:50:01 AM      0.00      0.00
    02:00:03 AM      0.00      0.00
    02:10:01 AM      0.00      0.00
    02:20:01 AM      0.00      0.00
    02:30:01 AM      0.00      0.00
    02:40:01 AM      0.00      0.00
    02:50:01 AM      0.00      0.00
    03:00:01 AM      0.00      0.00
    03:10:01 AM      0.00      0.00
    03:20:01 AM      0.00      0.00
    03:30:04 AM      0.00      0.00
    03:40:05 AM      0.00      0.00
    03:50:02 AM      0.00      0.00
    04:00:04 AM      0.00      0.00
    04:10:05 AM      0.00      0.00
    04:20:03 AM      0.00      0.00
    04:30:01 AM      0.00      0.00
    04:40:09 AM      0.00      0.00
    04:50:10 AM      0.00      0.00
    05:00:01 AM      0.00      0.00
    05:10:01 AM      0.00      0.00
    05:20:01 AM      0.00      0.00
    05:30:03 AM      0.00      0.00
    05:40:03 AM      0.00      0.00
    05:50:01 AM      0.00      0.00
    06:00:01 AM      0.00      0.00
    06:10:04 AM      0.00      0.00
    06:20:01 AM      0.00      0.00
    06:30:03 AM      0.00      0.00
    06:40:01 AM      0.00      0.00
    06:50:01 AM      0.00      0.00
    07:00:01 AM      0.00      0.00
    07:10:07 AM      0.00      0.00
    07:20:01 AM      0.00      0.00
    07:30:06 AM      0.01      0.00
    07:40:05 AM      0.00      0.00
    07:50:01 AM      0.00      0.00
    08:00:01 AM      0.00      0.00
    08:10:01 AM      0.00      0.00
    08:20:01 AM      0.00      0.00
    08:30:06 AM      0.00      0.00
    08:40:01 AM      0.00      0.00
    08:50:01 AM      0.00      0.00
    Average:         0.00      0.00

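I'm just wondering whether anything really stands out to someone who knows more about this than I do. As I said above, I think it's either a slow hard disk (an SSD might do better) or too many interrupts.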

The server is mainly a web hosting server that hosts web-based proxies. It runs Apache 2.2.23 with mod_ruid2 and nginxcp (a cPanel add-on).

Thanks.

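It looks to me like you're I/O bound. In the top output you can see a lot of tasks in the D state. That means they are blocked in uninterruptible sleep waiting on I/O, here most likely waiting for the disk to respond. "Load average" basically means "x tasks waiting over x amount of time": on Linux it counts tasks in the D state as well as runnable ones, so tasks stuck on disk I/O push the load up even while the CPUs are 41.1% idle.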
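One way to check this is to tally the state column of top's process rows. The following is a minimal Python sketch rather than a finished tool. It assumes the 12-column row layout shown in the question's top output, with the state in the eighth column, and `count_states` is a hypothetical helper name. The sample rows are copied from that output.

```python
from collections import Counter


def count_states(top_rows):
    """Count process states (the S column) in top-style process rows."""
    counts = Counter()
    for row in top_rows.splitlines():
        fields = row.split()
        # Assumption: a process row has 12 columns and starts with a numeric PID;
        # header and summary lines are skipped by this check.
        if len(fields) >= 12 and fields[0].isdigit():
            counts[fields[7]] += 1
    return counts


# Four rows taken from the top output in the question.
sample = """\
20137 everprox 16 0 127m 13m 4908 D 2.6 0.0 0:00.76 httpd
1827 everprox 15 0 127m 13m 4924 S 2.0 0.0 0:00.50 httpd
16574 root 15 0 15120 3480 812 R 2.0 0.0 0:00.15 top
16575 everprox 16 0 126m 11m 3992 D 1.6 0.0 0:00.05 httpd
"""

states = count_states(sample)
for state, n in sorted(states.items()):
    print(state, n)
```

On the server itself, the input could come from `top -b -n 1` (batch mode, one iteration). A large D count that persists across several samples, together with a high %wa, would support the I/O-bound reading.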

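You also have a ton of workers (3214 tasks in total, possibly far too many) if they are all Apache. Look at tuning your server or getting faster hardware.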
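To see whether those workers really are all Apache, one option is to group the process rows by user and command. This is again only a sketch under the same assumption about top's 12-column row layout, and `count_workers` is a hypothetical helper name. The sample rows come from the question's top output.

```python
from collections import Counter


def count_workers(top_rows):
    """Count top-style process rows per (user, command) pair."""
    counts = Counter()
    for row in top_rows.splitlines():
        fields = row.split()
        # Assumption: USER is the 2nd column and COMMAND the 12th.
        if len(fields) >= 12 and fields[0].isdigit():
            counts[(fields[1], fields[11])] += 1
    return counts


# Four rows taken from the top output in the question.
sample = """\
1846 nobody 16 0 125m 12m 4944 S 3.9 0.0 0:00.47 httpd
20137 everprox 16 0 127m 13m 4908 D 2.6 0.0 0:00.76 httpd
16894 nobody 15 0 126m 84m 944 S 2.0 0.3 946:10.13 nginx
1827 everprox 15 0 127m 13m 4924 S 2.0 0.0 0:00.50 httpd
"""

workers = count_workers(sample)
for (user, command), n in workers.most_common():
    print(user, command, n)
```

Run against a full process listing, the largest groups show which user and server account for most of the tasks, which is a reasonable starting point for deciding what to tune.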