strace没有显示任何线程在100%'系统'的CPU

我很难与我的Python应用程序跨多个线程100%的“系统”CPU时间达10秒以上。 我(有限)的理解是'系统'CPU时间是花在内核上的时间。 strace显示正在进行的内核调用和(可选)多久。 在这些暂停之一中,我设法在一个线程上运行strace,但是strace输出在那段时间内只有一个缺口,而且在这个缺口的任何一边都会显示很短的时间。

如何正确运行strace来找出发生什么内核调用?

在我的Python应用程序中,我使用单独的线程和psutil来检查高系统CPU的任何线程,然后将其注销。 一旦我看到一个我使用下面的命令行来启动strace:

 strace -tT -o ~/strace.log -p <PID of thread> 

从我的应用程序中logging(使用psutil下的psutil ):

 2015-05-19 08:05:47,025 WARNING:log.cpu Logging paused for 0.34s to 08:05:46.985768 - longer than expected - other threads probably delayed as well 2015-05-19 08:05:47,026 WARNING:log.cpu This process utilization (CPUs) user: 0.1, system: 52.4 | OS (%) scputimes(user=0.4, nice=0.0, system=80.2, idle=19.3, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0, guest=0.0, guest_nice=0.0), memory used: 16.6, threads: 101, fds: 70 2015-05-19 08:05:47,027 WARNING:log.cpu Thread 47441 utilization (CPUs) user: 0.0, system: 1.0 2015-05-19 08:05:47,512 WARNING:log.cpu This process utilization (CPUs) user: 0.1, system: 61.4 | OS (%) scputimes(user=0.5, nice=0.0, system=99.4, idle=0.1, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0, guest=0.0, guest_nice=0.0), memory used: 16.6, threads: 101, fds: 70 2015-05-19 08:05:47,516 WARNING:log.cpu Thread 47441 utilization (CPUs) user: 0.0, system: 0.9 2015-05-19 08:05:47,995 WARNING:log.cpu This process utilization (CPUs) user: 0.3, system: 63.7 | OS (%) scputimes(user=0.8, nice=0.0, system=99.2, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0, guest=0.0, guest_nice=0.0), memory used: 16.6, threads: 101, fds: 70 2015-05-19 08:05:47,995 WARNING:log.cpu Thread 47441 utilization (CPUs) user: 0.0, system: 0.8 2015-05-19 08:05:48,495 WARNING:log.cpu This process utilization (CPUs) user: 0.3, system: 63.8 | OS (%) scputimes(user=0.7, nice=0.0, system=99.3, idle=0.0, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0, guest=0.0, guest_nice=0.0), memory used: 16.6, threads: 101, fds: 70 2015-05-19 08:05:48,496 WARNING:log.cpu Thread 47441 utilization (CPUs) user: 0.0, system: 0.8 2015-05-19 08:05:52,140 WARNING:log.cpu Logging paused for 3.12s to 08:05:52.103363 - longer than expected - other threads probably delayed as well 2015-05-19 08:05:52,153 WARNING:log.cpu This process utilization (CPUs) user: 0.0, system: 63.6 | OS (%) scputimes(user=0.3, nice=0.0, system=98.5, idle=1.2, iowait=0.0, irq=0.0, softirq=0.0, steal=0.0, guest=0.0, guest_nice=0.0), memory used: 16.6, threads: 101, fds: 70 2015-05-19 08:05:52,180 WARNING:log.cpu Thread 47441 utilization (CPUs) user: 0.0, system: 1.0 

strace日志在该线程上同一时期ID 47441:

 08:05:46 futex(0x7f7b46b305ac, FUTEX_WAKE, 1) = 1 <0.000030> 08:05:46 sched_yield() = 0 <0.000016> 08:05:46 sched_yield() = 0 <0.000017> 08:05:46 sched_yield() = 0 <0.000021> 08:05:46 sched_yield() = 0 <0.000053> 08:05:46 sched_yield() = 0 <0.000023> 08:05:46 sched_yield() = 0 <0.000020> 08:05:46 sched_yield() = 0 <0.000019> 08:05:46 sched_yield() = 0 <0.000021> 08:05:46 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.000574> 08:05:46 futex(0x7f7b46b30f2c, FUTEX_WAKE, 1) = 1 <0.000019> 08:05:46 futex(0x7f7b46b3072c, FUTEX_WAKE, 1) = 1 <0.000028> 08:05:46 sched_yield() = 0 <0.000022> 08:05:46 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.000305> 08:05:46 futex(0x7f7b46b31fac, FUTEX_WAKE, 1) = 1 <0.000119> 08:05:46 futex(0x7f7b46b319ac, FUTEX_WAKE, 1) = 1 <0.000073> 08:05:46 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.000474> 08:05:46 futex(0x7f7b46b31f2c, FUTEX_WAKE, 1) = 1 <0.000094> 08:05:46 futex(0x7f7b46b30dac, FUTEX_WAKE, 1) = 1 <0.000096> 08:05:46 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.001139> 08:05:46 futex(0x7f7b46b31eac, FUTEX_WAKE, 1) = 1 <0.000156> 08:05:46 futex(0x7f7b46b303ac, FUTEX_WAKE, 1) = 1 <0.000059> 08:05:52 sched_yield() = 0 <0.000081> 08:05:52 sched_yield() = 0 <0.000050> 08:05:52 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.001603> 08:05:52 futex(0x7f7b46b31d2c, FUTEX_WAKE, 1) = 1 <0.000318> 08:05:52 futex(0x7f7b46b30f2c, FUTEX_WAKE, 1) = 1 <0.000259> 08:05:52 sched_yield() = 0 <0.000048> 08:05:52 sched_yield() = 0 <0.000060> 08:05:52 sched_yield() = 0 <0.000203> 08:05:52 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.000148> 08:05:52 sched_yield() = 0 <0.000036> 08:05:52 sched_yield() = 0 <0.000164> 08:05:52 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.003125> 08:05:52 sched_yield() = 0 <0.000053> 08:05:52 sched_yield() = 0 <0.000052> 08:05:52 sched_yield() = 0 <0.000023> 08:05:52 futex(0x7f7b46b3202c, FUTEX_WAIT, 2, NULL) = 0 <0.002264> 08:05:52 futex(0x7f7b46b3122c, FUTEX_WAKE, 1) = 1 <0.000119> 

注意strace是如何logging每秒100个条目的,然后是没有任何东西的6秒。 但是没有任何电话显示的时间超过了几个毫秒。 那么这个线程如何显示100%的“系统”CPU时间呢?

我已经用top来确认psutil正在报告正确的数字。 而且我确定我已经跟踪了正确的线程,因为6秒的间隙恰好对应于psutillogging的高CPU时间。

还有什么其他的工具可以用来确定是什么导致这个内核/系统时间?

环境: – Python 3.4.3(在Python 2.7中也出现问题) – CentOS 7内核3.10.0-229.1.2.el7.x86_64(在Ubuntu 12 LTS上也出现问题) – 64核心AMD机器