rcu_sched self-detected stall on CPU on a Debian server

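My server has been online since October 2014.

OS: Debian GNU/Linux 7.6; uname -r: 3.10.23-xxxx-std-ipv6-64

It worked fine until last night, when it became unresponsive. I rebooted it from the data center control panel, and after that everything was OK. Tonight it froze again, at a different time.

I have not installed any new software, only the regular updates.

The server log file shows: rcu_sched self-detected stall on CPU

Any ideas?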

    Mar 4 01:51:01 server4 kernel: INFO: rcu_sched self-detected stall on CPU { 6} (t=15001 jiffies g=78281006 c=78281005 q=5678)
    Mar 4 01:51:01 server4 kernel: sending NMI to all CPUs:
    Mar 4 01:51:01 server4 kernel: NMI backtrace for cpu 6
    Mar 4 01:51:01 server4 kernel: CPU: 6 PID: 2057 Comm: ps Not tainted 3.10.23-xxxx-std-ipv6-64 #1
    Mar 4 01:51:01 server4 kernel: Hardware name: /DH67BL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
    Mar 4 01:51:01 server4 kernel: task: ffff8803a9dbd620 ti: ffff8807bac0c000 task.ti: ffff8807bac0c000
    Mar 4 01:51:01 server4 kernel: RIP: 0010:[<ffffffff81607f80>] [<ffffffff81607f80>] delay_loop+0x30/0x30
    Mar 4 01:51:01 server4 kernel: RSP: 0018:ffff88081f383de0 EFLAGS: 00000887
    Mar 4 01:51:01 server4 kernel: RAX: 00000000833e8900 RBX: 0000000000002710 RCX: 00000000019e1c28
    Mar 4 01:51:01 server4 kernel: RDX: 000000000033599b RSI: 0000000000000060 RDI: 000000000033599c
    Mar 4 01:51:01 server4 kernel: RBP: ffff88081f383de8 R08: 0000000000000400 R09: 000000000003d73d
    Mar 4 01:51:01 server4 kernel: R10: 0000000000000002 R11: 000000000003d73c R12: 0000000000000006
    Mar 4 01:51:01 server4 kernel: R13: ffffffff82168340 R14: ffff88081f38d5e0 R15: 000000000000162e
    Mar 4 01:51:01 server4 kernel: FS: 00007fae36658700(0000) GS:ffff88081f380000(0000) knlGS:0000000000000000
    Mar 4 01:51:01 server4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 4 01:51:01 server4 kernel: CR2: 00007fae3665e000 CR3: 00000007eddd3000 CR4: 00000000001407e0
    Mar 4 01:51:01 server4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Mar 4 01:51:01 server4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Mar 4 01:51:01 server4 kernel: Stack:
    Mar 4 01:51:01 server4 kernel: ffffffff8160806c ffff88081f383e08 ffffffff8106338a 0000000000000000
    Mar 4 01:51:01 server4 kernel: ffffffff82168340 ffff88081f383e78 ffffffff81115cdd ffff88081f383e28
    Mar 4 01:51:01 server4 kernel: ffffffff81117c17 ffff88081f383e68 ffffffff810e97df ffff8807bac0c000
    Mar 4 01:51:01 server4 kernel: Call Trace:
    Mar 4 01:51:01 server4 kernel: <IRQ>
    Mar 4 01:51:01 server4 kernel: [<ffffffff8160806c>] ? __const_udelay+0x2c/0x30
    Mar 4 01:51:01 server4 kernel: [<ffffffff8106338a>] arch_trigger_all_cpu_backtrace+0x6a/0xa0
    Mar 4 01:51:01 server4 kernel: [<ffffffff81115cdd>] rcu_check_callbacks+0x2ed/0x560
    Mar 4 01:51:01 server4 kernel: [<ffffffff81117c17>] ? acct_account_cputime+0x17/0x20
    Mar 4 01:51:01 server4 kernel: [<ffffffff810e97df>] ? account_system_time+0xcf/0x180
    Mar 4 01:51:01 server4 kernel: [<ffffffff810ca2c3>] update_process_times+0x43/0x80
    Mar 4 01:51:01 server4 kernel: [<ffffffff810f95c1>] tick_sched_handle.isra.12+0x31/0x40
    Mar 4 01:51:01 server4 kernel: [<ffffffff810f9704>] tick_sched_timer+0x44/0x70
    Mar 4 01:51:01 server4 kernel: [<ffffffff810df07a>] __run_hrtimer.isra.29+0x4a/0xd0
    Mar 4 01:51:01 server4 kernel: [<ffffffff810df9b5>] hrtimer_interrupt+0xf5/0x230
    Mar 4 01:51:01 server4 kernel: [<ffffffff810626c4>] smp_apic_timer_interrupt+0x64/0xa0
    Mar 4 01:51:01 server4 kernel: [<ffffffff81d421ca>] apic_timer_interrupt+0x6a/0x70
    Mar 4 01:51:01 server4 kernel: <EOI>
    Mar 4 01:51:01 server4 kernel: [<ffffffff81606c0a>] ? vsnprintf+0x3ea/0x640
    Mar 4 01:51:01 server4 kernel: [<ffffffff81d409fd>] ? _raw_spin_lock+0x1d/0x30
    Mar 4 01:51:01 server4 kernel: [<ffffffff8118632a>] __d_lookup+0x7a/0x160
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117ad56>] ? path_get+0x26/0x40
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117b771>] lookup_fast+0x161/0x2e0
    Mar 4 01:51:01 server4 kernel: [<ffffffff811cf2fb>] ? proc_pid_permission+0xcb/0xe0
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117d581>] do_last.isra.62+0x171/0xc20
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117aab3>] ? inode_permission+0x13/0x50
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117bf35>] ? link_path_walk+0x245/0x810
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117e0de>] path_openat.isra.63+0xae/0x460
    Mar 4 01:51:01 server4 kernel: [<ffffffff8117e4cc>] do_filp_open+0x3c/0x90
    Mar 4 01:51:01 server4 kernel: [<ffffffff8118b212>] ? __alloc_fd+0x42/0x100
    Mar 4 01:51:01 server4 kernel: [<ffffffff811701ff>] do_sys_open+0xef/0x1d0
    Mar 4 01:51:01 server4 kernel: [<ffffffff811702fd>] SyS_open+0x1d/0x20
    Mar 4 01:51:01 server4 kernel: [<ffffffff81d41692>] system_call_fastpath+0x16/0x1b
    Mar 4 01:51:01 server4 kernel: Code: 89 e5 48 85 c0 74 19 eb 02 66 90 eb 0e 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 ff c8 75 fb 48 ff c8 5d c3 66 0f 1f 44 00 00 <55> 48 89 e5 65 44 8b 04 25 34$
    Mar 4 01:51:01 server4 kernel: NMI backtrace for cpu 0
    Mar 4 01:51:01 server4 kernel: CPU: 0 PID: 2091 Comm: php Not tainted 3.10.23-xxxx-std-ipv6-64 #1
    Mar 4 01:51:01 server4 kernel: Hardware name: /DH67BL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
    Mar 4 01:51:01 server4 kernel: task: ffff8807ebd8bba0 ti: ffff88066503e000 task.ti: ffff88066503e000
    Mar 4 01:51:01 server4 kernel: RIP: 0010:[<ffffffff81151243>] [<ffffffff81151243>] get_vmalloc_info+0x63/0xe0
    Mar 4 01:51:01 server4 kernel: RSP: 0018:ffff88066503fb88 EFLAGS: 00000287
    Mar 4 01:51:01 server4 kernel: RAX: ffff8807eddcda80 RBX: ffff88066503fd80 RCX: ffffc8ffffffffff
    Mar 4 01:51:01 server4 kernel: RDX: ffffc90008934000 RSI: 0000000000021000 RDI: ffffc90007cc2000
    Mar 4 01:51:01 server4 kernel: RBP: ffff88066503fb98 R08: ffffe8fffffffffe R09: 0000000000018000
    Mar 4 01:51:01 server4 kernel: R10: 0000000000000001 R11: 0000000000000202 R12: 000000000042a14e
    Mar 4 01:51:01 server4 kernel: R13: 000000000035b072 R14: 000000000004712c R15: ffff880785132800
    Mar 4 01:51:01 server4 kernel: FS: 00007f7e85629720(0000) GS:ffff88081f200000(0000) knlGS:0000000000000000
    Mar 4 01:51:01 server4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 4 01:51:01 server4 kernel: CR2: 00007f7e824fdc40 CR3: 00000007e7b15000 CR4: 00000000001407f0
    Mar 4 01:51:01 server4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Mar 4 01:51:01 server4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Mar 4 01:51:01 server4 kernel: Stack:
    Mar 4 01:51:01 server4 kernel: 0000000000000000 00000000003366b3 ffff88066503fe58 ffffffff811d45dc
    Mar 4 01:51:01 server4 kernel: ffff88066503fbd8 ffffffff81190164 ffff8807f28b2780 ffff880632895900
    Mar 4 01:51:01 server4 kernel: ffff8807e7a6d600 0000000180400028 ffff88066503fc18 ffffffff81190cf0
    Mar 4 01:51:01 server4 kernel: Call Trace:
    Mar 4 01:51:01 server4 kernel: [<ffffffff811d45dc>] meminfo_proc_show+0xac/0x530
    Mar 4 01:51:01 server4 kernel: [<ffffffff81190164>] ? seq_open+0x84/0x160
    Mar 4 01:51:01 server4 kernel: [<ffffffff81190cf0>] ? single_open+0x60/0xb0
    Mar 4 01:51:01 server4 kernel: [<ffffffff811615dc>] ? kmem_cache_free+0xec/0x100
    Mar 4 01:51:01 server4 kernel: [<ffffffff8114bae3>] ? anon_vma_chain_free+0x13/0x20
    Mar 4 01:51:01 server4 kernel: [<ffffffff8114d07e>] ? unlink_anon_vmas+0xce/0x1a0
    Mar 4 01:51:01 server4 kernel: [<ffffffff81146906>] ? vma_gap_update+0x26/0x30
    Mar 4 01:51:01 server4 kernel: [<ffffffff8114726d>] ? vma_adjust+0x3ad/0x660
    Mar 4 01:51:01 server4 kernel: [<ffffffff811479fa>] ? vma_merge+0x2fa/0x320
    Mar 4 01:51:01 server4 kernel: [<ffffffff81146b8f>] ? __vm_enough_memory+0x2f/0x180
    Mar 4 01:51:01 server4 kernel: [<ffffffff81148eaa>] ? mmap_region+0x14a/0x5c0
    Mar 4 01:51:01 server4 kernel: [<ffffffff8119037e>] seq_read+0x13e/0x380
    Mar 4 01:51:01 server4 kernel: [<ffffffff8119037e>] seq_read+0x13e/0x380
    Mar 4 01:51:01 server4 kernel: [<ffffffff811cd3b8>] proc_reg_read+0x38/0x70
    Mar 4 01:51:01 server4 kernel: [<ffffffff811712c4>] vfs_read+0xa4/0x180
    Mar 4 01:51:01 server4 kernel: [<ffffffff811717cd>] SyS_read+0x4d/0x90
    Mar 4 01:51:01 server4 kernel: [<ffffffff81d41692>] system_call_fastpath+0x16/0x1b
    Mar 4 01:51:01 server4 kernel: Code: 00 00 4c 8b 4b 08 48 83 e8 30 48 bf 00 00 00 00 00 c9 ff ff 48 b9 ff ff ff ff ff c8 ff ff 49 b8 fe ff ff ff ff e8 ff ff 48 8b 10 <48> 39 ca 76 28 4c 39 c2 77 34$
    Mar 4 01:51:01 server4 kernel: NMI backtrace for cpu 2
    Mar 4 01:51:01 server4 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.23-xxxx-std-ipv6-64 #1
    Mar 4 01:51:01 server4 kernel: Hardware name: /DH67BL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
    Mar 4 01:51:01 server4 kernel: task: ffff8807f34bc240 ti: ffff8807f34f0000 task.ti: ffff8807f34f0000
    Mar 4 01:51:01 server4 kernel: RIP: 0010:[<ffffffff81649d59>] [<ffffffff81649d59>] intel_idle+0xa9/0x100
    Mar 4 01:51:01 server4 kernel: RSP: 0018:ffff8807f34f1dd8 EFLAGS: 00000046
    Mar 4 01:51:01 server4 kernel: RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
    Mar 4 01:51:01 server4 kernel: RDX: 0000000000000000 RSI: ffff8807f34f1fd8 RDI: 0000000000000002
    Mar 4 01:51:01 server4 kernel: RBP: ffff8807f34f1e08 R08: 0000000000000981 R09: 0000000000000010
    Mar 4 01:51:01 server4 kernel: R10: 0000000000000f9c R11: 0000000000000000 R12: 0000000000000004
    Mar 4 01:51:01 server4 kernel: R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff821890b8
    Mar 4 01:51:01 server4 kernel: FS: 0000000000000000(0000) GS:ffff88081f280000(0000) knlGS:0000000000000000
    Mar 4 01:51:01 server4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 4 01:51:01 server4 kernel: CR2: ffffffffff600400 CR3: 0000000002139000 CR4: 00000000001407e0
    Mar 4 01:51:01 server4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Mar 4 01:51:01 server4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Mar 4 01:51:01 server4 kernel: Stack:

Today the syslog file starts with this:

    Mar 5 06:25:08 server4 kernel: BUG: unable to handle kernel paging request at ffff800788fd1c30
    Mar 5 06:25:08 server4 kernel: IP: [<ffffffff8119b22f>] __find_get_block_slow+0x9f/0x180
    Mar 5 06:25:08 server4 kernel: PGD 0
    Mar 5 06:25:08 server4 kernel: Oops: 0000 [#1] SMP
    Mar 5 06:25:08 server4 kernel: CPU: 2 PID: 32054 Comm: updatedb.mlocat Tainted: GB 3.10.23-xxxx-std-ipv6-64 #1
    Mar 5 06:25:08 server4 kernel: Hardware name: /DH67BL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
    Mar 5 06:25:08 server4 kernel: task: ffff8807ebf0b500 ti: ffff8807ee3ce000 task.ti: ffff8807ee3ce000
    Mar 5 06:25:08 server4 kernel: RIP: 0010:[<ffffffff8119b22f>] [<ffffffff8119b22f>] __find_get_block_slow+0x9f/0x180
    Mar 5 06:25:08 server4 kernel: RSP: 0018:ffff8807ee3cfa78 EFLAGS: 00010202
    Mar 5 06:25:08 server4 kernel: RAX: 0000000000000001 RBX: 0000000008e02393 RCX: 0000000000000002
    Mar 5 06:25:08 server4 kernel: RDX: ffff800788fd1c30 RSI: ffff880788d01b50 RDI: ffff8807eead82b8
    Mar 5 06:25:08 server4 kernel: RBP: ffff8807ee3cfae8 R08: 0000000000000002 R09: ffffea001d94cd9c
    Mar 5 06:25:08 server4 kernel: R10: ffff880762a20020 R11: ffff8807fe803a00 R12: ffff8807eead80f0
    Mar 5 06:25:08 server4 kernel: R13: ffff8807eead8230 R14: ffff800788fd1c30 R15: ffffea001d94cd80
    Mar 5 06:25:08 server4 kernel: FS: 00007f7934b31700(0000) GS:ffff88081f280000(0000) knlGS:0000000000000000
    Mar 5 06:25:08 server4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 5 06:25:08 server4 kernel: CR2: ffff800788fd1c30 CR3: 00000007ece8c000 CR4: 00000000001407e0
    Mar 5 06:25:08 server4 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Mar 5 06:25:08 server4 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Mar 5 06:25:08 server4 kernel: Stack:
    Mar 5 06:25:08 server4 kernel: ffff8807ee3cfa98 ffff8807eead8000 0000000008e02032 ffff8807eead80f0
    Mar 5 06:25:08 server4 kernel: ffff8807ee3cfb18 ffffffff8119b258 ffffea0000000000 00000000de831424
    Mar 5 06:25:08 server4 kernel: 3534000000215000 0000000000000000 ffff8807eead8000 0000000000001000
    Mar 5 06:25:08 server4 kernel: Call Trace:

After that, the log continues with more self-detected stall on CPU messages.
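
To see how often the stalls recur and on which CPUs, the stall header lines can be pulled out of the log. This is only a rough sketch: the regular expression is written against the single header line quoted above, and the names `STALL_RE`, `find_stalls` and `sample` are made up for illustration. Other kernel versions may format the message differently.

```python
import re

# Assumption: stall headers look like the one in the log above, e.g.
# "rcu_sched self-detected stall on CPU { 6} (t=15001 jiffies ...)".
STALL_RE = re.compile(
    r"rcu_sched self-detected stall on CPU \{\s*(?P<cpu>\d+)\}\s+"
    r"\(t=(?P<jiffies>\d+) jiffies"
)


def find_stalls(lines):
    """Return a (cpu, jiffies) tuple for every stall header line found."""
    stalls = []
    for line in lines:
        match = STALL_RE.search(line)
        if match:
            stalls.append((int(match.group("cpu")), int(match.group("jiffies"))))
    return stalls


# Two lines copied from the log above; in practice the lines would be
# read from the syslog file instead.
sample = [
    "Mar 4 01:51:01 server4 kernel: INFO: rcu_sched self-detected stall "
    "on CPU { 6} (t=15001 jiffies g=78281006 c=78281005 q=5678)",
    "Mar 4 01:51:01 server4 kernel: sending NMI to all CPUs:",
]

for cpu, jiffies in find_stalls(sample):
    print(f"stall on CPU {cpu} after {jiffies} jiffies")
```

Running it over the whole syslog file (passing the open file object as `lines`) would show whether the same CPU is always the one reporting the stall.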