Recurring kernel EIP errors

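I get the following error from time to time. The process is different every time. What do you make of this error? Do you have any suggestions?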

Kernel

Linux version 3.5.3 (developer@devel) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP

Aug 18 04:24:06 2013 kernel: [6349586.289190] Modules linked in: xt_iprange xt_pkttype xt_length xt_state xt_addrtype xt_set xt_LOG xt_tcpudp xt_connlimit xt_hashlimit xt_NFQUEUE xt_connmark xt_mark xt_multiport iptable_raw iptable_mangle iptable_nat nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack xt_recent iptable_filter ip_tables nfnetlink_queue rmd160 sha1_generic crypto_null camellia_generic lzo cast6 cast5 deflate zlib_deflate cts ctr gcm ccm serpent_generic blowfish_generic blowfish_common twofish_generic twofish_i586 twofish_common xcbc sha256_generic sha512_generic des_generic geode_aes aesni_intel cryptd aes_i586 xfrm_user ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key xfrm_algo tun ipt_ULOG ip_set_hash_net ip_set nfnetlink x_tables cls_route cls_u32 cls_fw sch_sfq sch_htb bonding binfmt_misc raid1 video lp nvram evbug ixgbe mdio e1000e pcspkr serio_raw i7core_edac parport_pc edac_core parport lpc_ich mfd_core ioatdma tpm_tis tpm tpm_bios i2c_i801 dca microcode usb_storage [last unloaded: nf_conntrack]
Aug 18 04:24:06 2013 kernel: [6349587.948936]
Aug 18 04:24:06 2013 kernel: [6349587.950082] Pid: 25938, comm: rateup Tainted: GW 3.5.3 #1 Intel Thurley/Greencity
Aug 18 04:24:06 2013 kernel: [6349588.915284] EIP: 0060:[<c04d3bab>] EFLAGS: 00010246 CPU: 2
Aug 18 04:24:06 2013 kernel: [6349588.949140] EIP is at bdi_position_ratio+0x15b/0x1e0
Aug 18 04:24:06 2013 kernel: [6349588.979872] EAX: 00236415 EBX: 00000000 ECX: 25448580 EDX: 00000000
Aug 18 04:24:06 2013 kernel: [6349589.018397] ESI: 00000000 EDI: 00236415 EBP: d2743cfc ESP: d2743cc4
Aug 18 04:24:06 2013 kernel: [6349589.056924] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 18 04:24:06 2013 kernel: [6349589.090255] CR0: 80050033 CR2: 0967c000 CR3: 1205b000 CR4: 000007f0
Aug 18 04:24:06 2013 kernel: [6349589.128781] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Aug 18 04:24:06 2013 kernel: [6349589.167305] DR6: ffff0ff0 DR7: 00000400
Aug 18 04:24:06 2013 kernel: [6349589.191295] Process rateup (pid: 25938, ti=d2742000 task=e3b4f110 task.ti=d2742000)
Aug 18 04:24:06 2013 kernel: [6349589.238126] Stack:
Aug 18 04:24:06 2013 kernel: [6349589.251209] d2743cec 0e38e200 00000000 00000000 00000000 00000021 00000010 00000001
Aug 18 04:24:06 2013 kernel: [6349589.539680] 0c6c2c80 000bcc07 00000000 00000000 00000000 00000000 d2743da4 c04d421e
Aug 18 04:24:06 2013 kernel: [6349589.699231] 00000017 00000000 0000000c d2743d54 c05d9291 00000000 eb7f9614 ec140798
Aug 18 04:24:06 2013 kernel: [6349589.747052] Call Trace:
Aug 18 04:24:06 2013 kernel: [6349589.762739] [<c04d421e>] balance_dirty_pages_ratelimited_nr+0x15e/0x6f0
Aug 18 04:24:06 2013 kernel: [6349589.803856] [<c05d9291>] ? journal_stop+0x121/0x290
Aug 18 04:24:06 2013 kernel: [6349589.834593] [<c05353e9>] ? __mark_inode_dirty+0x29/0x1c0
Aug 18 04:24:06 2013 kernel: [6349589.977634] [<c04cb3b8>] ? unlock_page+0x18/0x20
Aug 18 04:24:06 2013 kernel: [6349590.006812] [<c04cb0a5>] generic_file_buffered_write+0x165/0x210
Aug 18 04:24:06 2013 kernel: [6349590.044300] [<c04cd546>] __generic_file_aio_write+0x236/0x530
Aug 18 04:24:06 2013 kernel: [6349590.080230] [<c050f424>] ? __mem_cgroup_commit_charge+0x74/0x230
Aug 18 04:24:06 2013 kernel: [6349590.117716] [<c04d66d6>] ? lru_cache_add_lru+0x16/0x30
Aug 18 04:24:06 2013 kernel: [6349590.150009] [<c04f2a46>] ? page_add_new_anon_rmap+0x56/0x70
Aug 18 04:24:06 2013 kernel: [6349590.184901] [<c04cd892>] generic_file_aio_write+0x52/0xb0
Aug 18 04:24:06 2013 kernel: [6349590.218754] [<c05130cb>] do_sync_write+0xbb/0x100
Aug 18 04:24:06 2013 kernel: [6349590.248453] [<c0609649>] ? security_file_permission+0x19/0x90
Aug 18 04:24:06 2013 kernel: [6349590.284381] [<c051326d>] ? rw_verify_area+0x5d/0x110
Aug 18 04:24:06 2013 kernel: [6349590.315636] [<c05135b6>] vfs_write+0x96/0x160
Aug 18 04:24:06 2013 kernel: [6349590.343259] [<c0513010>] ? do_sync_readv_writev+0xd0/0xd0
Aug 18 04:24:06 2013 kernel: [6349590.377111] [<c0513dbd>] sys_write+0x3d/0x70
Aug 18 04:24:06 2013 kernel: [6349590.404215] [<c08bd48c>] sysenter_do_call+0x12/0x22
89 fa 77 08 89 f8 31 d2 <f7> f6 89 c3 89 c8 f7 f6 89 de 89 c3 8b 45 e4 d1 e8 39 45 10 73
Aug 18 04:24:06 2013 kernel: [6349590.552236] EIP: [<c04d3bab>] bdi_position_ratio+0x15b/0x1e0 SS:ESP 0068:d2743cc4
Aug 18 04:24:06 2013 kernel: [6349590.598936] ---[ end trace 4ea20832b85a6756 ]---

I would suspect some kind of hardware fault: bad RAM, a bad CPU, or something else. Have you tried running memtest86?

This is a kernel panic. It is the Linux counterpart of a BSOD in Windows-land.

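If you notice this happening consistently but completely unpredictably, and differently each time, you almost certainly have a hardware fault. Use something like memtest86 to test your RAM, since that is the most likely cause. If you have a warranty or a support contract, you should open a call with your vendor. It could be the motherboard or the CPU, but it could also be any other component.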
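One rough way to check whether the crashes really do differ each time is to gather the oops reports from your logs and tally which function each "EIP is at" line names. The sketch below is only an illustration: the helper names `faulting_symbols` and `looks_random` are made up for this post, and the rule "more than one distinct function hints at hardware" is a simplifying assumption, not a diagnosis.

```python
import re
from collections import Counter

# Matches lines such as "EIP is at bdi_position_ratio+0x15b/0x1e0".
EIP_RE = re.compile(r"EIP is at (?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def faulting_symbols(log_text):
    """Count how often each function appears in an 'EIP is at' line."""
    return Counter(m.group("symbol") for m in EIP_RE.finditer(log_text))


def looks_random(counts):
    """Assumed heuristic: crashes spread over several functions."""
    return len(counts) > 1


if __name__ == "__main__":
    # Sample taken from the trace in the question; in practice you would
    # read your own log file here (its path depends on your syslog setup).
    sample = "kernel: [6349588.949140] EIP is at bdi_position_ratio+0x15b/0x1e0"
    counts = faulting_symbols(sample)
    print(counts.most_common())
    print("varies between crashes:", looks_random(counts))
```

If every report names the same function, that points more toward a software problem in that code path than toward flaky hardware, so it is worth checking before you swap parts.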

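If you recently updated your kernel and there is no hardware fault, revert to the previous one. Kernel bugs can cause this too, but this code is usually well tested.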

It is remotely possible that you have a corrupt kernel, a corrupt kernel module, or a BIOS that needs to be reflashed.