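How can I use kdump/crash to investigate an OOM problem?

Question

A server crashed after multiple "Out of memory" messages, and I am trying to find the culprit. If it is in userland: which process? If it is in the kernel: which kernel module?

Details

I would like to understand how to use the crash utility to investigate what triggered the OOM on the server.

As part of installing a pair of new servers, I started the initialization of a 14 TB DRBD device. Around that time, while I was playing with the DRBD syncer rate configuration and taking some bonded network interfaces up and down, one of the servers crashed. Over a period of about 30 seconds it produced 39 Out of memory: Kill process #### messages. Then it crashed: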

 Kernel panic - not syncing: Out of memory and no killable processes... 

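The system crash triggered kdump. Now I have a nice vmcore.flat file that should let me investigate the problem directly, but I am having a hard time finding out where all the memory went.

The only resources I know of are Dedoimedo's website, which has good instructions, and the kernel crash book. These also happen to be the only resources suggested in the answers, so I assume crash is the only way to investigate.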

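If there is another way to do a post-mortem of the event, I am open to it. It is just that crash is the only utility I am aware of. All I have now is the vmcore.flat file, and I just need to know which component ate all the memory. I suspect a problem in a kernel module, more specifically one of the following: the bonding module (because the crash was triggered when I took an interface down), the DRBD module (version 8.3.15, built out of tree on CentOS 6.3), or one of the 10G Ethernet modules (mlnx_en, built out of tree, and the in-tree bnx2x, which drove the interface that stayed up). I just need to know whether there is a way to validate my suspicions.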

So far I have only managed to extract the following information using the crash utility:

Checked how much memory was used

 $ crash /usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux vmcore.flat
 ....
 crash> kmem -i
               PAGES        TOTAL      PERCENTAGE
  TOTAL MEM  16482587      62.9 GB         ----
       FREE     54610     213.3 MB    0% of TOTAL MEM
       USED  16427977      62.7 GB   99% of TOTAL MEM
     SHARED      4683      18.3 MB    0% of TOTAL MEM
    BUFFERS       118       472 KB    0% of TOTAL MEM
     CACHED        82       328 KB    0% of TOTAL MEM
       SLAB     46635     182.2 MB    0% of TOTAL MEM

 TOTAL SWAP         0            0         ----
  SWAP USED         0            0  100% of TOTAL SWAP
  SWAP FREE         0            0    0% of TOTAL SWAP

Clearly, it ran out of memory. All 64 GB is gone... but where?

Tried to see whether any process was leaking memory

The only relevant command seems to be ps (the crash ps subcommand). It shows nothing unusual, but it also reports no memory usage for kernel threads.

 crash> ps
     PID    PPID  CPU        TASK        ST  %MEM     VSZ    RSS  COMM
       0       0    0  ffffffff81a8d020  RU   0.0       0      0  [swapper]
 >     0       0    1  ffff88102c456040  RU   0.0       0      0  [swapper]
 >     0       0    2  ffff88082c772aa0  RU   0.0       0      0  [swapper]
 >     0       0    3  ffff88102c456aa0  RU   0.0       0      0  [swapper]
       0       0    4  ffff88082c7b8ae0  RU   0.0       0      0  [swapper]
 >     0       0    5  ffff88102c457500  RU   0.0       0      0  [swapper]
 >     0       0    6  ffff88082c7d6aa0  RU   0.0       0      0  [swapper]
 >     0       0    7  ffff88102c506080  RU   0.0       0      0  [swapper]
 >     0       0    8  ffff88082c016ae0  RU   0.0       0      0  [swapper]
 >     0       0    9  ffff88102c506ae0  RU   0.0       0      0  [swapper]
 >     0       0   10  ffff88082c05caa0  RU   0.0       0      0  [swapper]
 >     0       0   11  ffff88102c507540  RU   0.0       0      0  [swapper]
 >     0       0   12  ffff88082c09cae0  RU   0.0       0      0  [swapper]
 .....
    4926       1    5  ffff880828a38ae0  ??   0.0       0      0  mingetty
    4928       1    1  ffff88102a4e8040  ??   0.0       0      0  mingetty
    4930       1   19  ffff880827af4080  ??   0.0       0      0  mingetty
    4932       1    2  ffff88100f122040  ??   0.0       0      0  mingetty
    4934       1   18  ffff8810296ea080  ??   0.0       0      0  mingetty
    4936    1047    4  ffff880ff342d540  IN   0.0   11184    948  udevd
    4937    1047    5  ffff88082a240080  IN   0.0   11184    948  udevd
    5060    3772    2  ffff88082881d540  ??   0.0       0      0  sshd
    5078       1    1  ffff88100f060ae0  ??   0.0       0      0  sshd
    5079       1    1  ffff88082b882ae0  ??   0.0       0      0  bash

If I leave out the kernel threads (which show zero for %MEM anyway), we can see that I had almost nothing running:

 crash> ps -u
     PID    PPID  CPU        TASK        ST  %MEM     VSZ    RSS  COMM
       1       0    1  ffff88082c41b500  ??   0.0   19348    348  init
    1047       1    2  ffff881029524040  IN   0.0   11188    948  udevd
    3171       1    3  ffff880826ccaaa0  IN   0.0   27636    240  auditd
    3172       1   17  ffff881029d1b500  IN   0.0   27636    240  auditd
 >  3772       1    0  ffff88102b257500  RU   0.0   64072    668  sshd
    4800       1    0  ffff88100f061540  ??   0.0       0      0  dsm_om_shrsvcd
    4842       1   16  ffff88100f012ae0  ??   0.0       0      0  cmcld
    4854       1   17  ffff88082a241540  ??   0.0       0      0  cmlogd
    4855       1    3  ffff88082796cae0  ??   0.0       0      0  cmfileassistd
    4856       1   18  ffff88082809d500  ??   0.0       0      0  cmnetd
    4860       1    0  ffff88082705aae0  ??   0.0       0      0  cmresourced
    4924       1    9  ffff88102a4e8aa0  ??   0.0       0      0  mingetty
    4926       1    5  ffff880828a38ae0  ??   0.0       0      0  mingetty
    4928       1    1  ffff88102a4e8040  ??   0.0       0      0  mingetty
    4930       1   19  ffff880827af4080  ??   0.0       0      0  mingetty
    4932       1    2  ffff88100f122040  ??   0.0       0      0  mingetty
    4934       1   18  ffff8810296ea080  ??   0.0       0      0  mingetty
    4936    1047    4  ffff880ff342d540  IN   0.0   11184    948  udevd
    4937    1047    5  ffff88082a240080  IN   0.0   11184    948  udevd
    5060    3772    2  ffff88082881d540  ??   0.0       0      0  sshd
    5078       1    1  ffff88100f060ae0  ??   0.0       0      0  sshd
    5079       1    1  ffff88082b882ae0  ??   0.0       0      0  bash
    5257       1    1  ffff8808279e6aa0  ??   0.0       0      0  jnx_mlnxsnmp_da

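Update:

Including some more output, as Soham suggested. Unfortunately, I cannot draw any further conclusions from it. The best I can do is suspect that something in the kernel leaked memory, since the userland processes are almost all dead.

The (almost complete) output of log -m is here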

 crash> ps -G | tail -n +2 | cut -b2- | gawk '{mem += $8} END {print "total " mem/1048576 "GB"}'
 total 0.00391006GB

Note that by this point almost all userland processes were dead, which is why the reported usage is so low.
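The gawk one-liner above sums the RSS column (field 8, in KB) and divides by 1048576 to convert to GB. As a rough sketch, the same sum can be done offline in Python on saved ps output. The function name total_rss_kb is made up for this illustration, and the sample rows are copied from the ps -u listing above.

```python
def total_rss_kb(ps_output: str) -> int:
    """Sum the RSS column (KB) of crash `ps` output.

    Assumes the column layout shown above:
    PID PPID CPU TASK ST %MEM VSZ RSS COMM
    """
    total = 0
    for line in ps_output.splitlines():
        # A leading '>' marks a task that was active on a CPU; drop it.
        fields = line.strip().lstrip(">").split()
        # Skip the header and anything that is not a full task row.
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        total += int(fields[7])
    return total


# Sample rows taken from the `ps -u` output in the question.
sample = """\
    PID    PPID  CPU        TASK        ST  %MEM     VSZ    RSS  COMM
      1       0    1  ffff88082c41b500  ??   0.0   19348    348  init
   1047       1    2  ffff881029524040  IN   0.0   11188    948  udevd
>  3772       1    0  ffff88102b257500  RU   0.0   64072    668  sshd
   4800       1    0  ffff88100f061540  ??   0.0       0      0  dsm_om_shrsvcd
"""

rss_kb = total_rss_kb(sample)
print(f"total {rss_kb / 1048576:.8f} GB")
```

Whatever the exact total, a sum in the range of a few MB on a 64 GB machine points away from user-space processes as the consumer.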

Out of memory messages:

As I mentioned above, there were 39 "Out of memory" messages; here they are:

 crash> log -m | grep Out
 <3>[ 223.556616] Out of memory: Kill process 3189 (portreserve) score 1 or sacrifice child
 <3>[ 223.787234] Out of memory: Kill process 3196 (rsyslogd) score 1 or sacrifice child
 <3>[ 224.237119] Out of memory: Kill process 3728 (dbus-daemon) score 1 or sacrifice child
 <3>[ 228.771770] Out of memory: Kill process 3758 (snmpd) score 1 or sacrifice child
 <3>[ 229.033466] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
 <3>[ 229.257710] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
 <3>[ 229.484321] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
 <3>[ 229.711169] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
 <3>[ 229.934955] Out of memory: Kill process 3801 (cmproxyd) score 1 or sacrifice child
 <3>[ 230.159542] Out of memory: Kill process 3812 (ntpd) score 1 or sacrifice child
 <3>[ 230.382083] Out of memory: Kill process 3953 (master) score 1 or sacrifice child
 <3>[ 230.606613] Out of memory: Kill process 3953 (master) score 1 or sacrifice child
 <3>[ 230.829515] Out of memory: Kill process 3953 (master) score 1 or sacrifice child
 <3>[ 230.832105] Out of memory: Kill process 3961 (crond) score 1 or sacrifice child
 <3>[ 236.749746] Out of memory: Kill process 3974 (atd) score 1 or sacrifice child
 <3>[ 236.969421] Out of memory: Kill process 4272 (dsm_sa_datamgrd) score 1 or sacrifice child
 <3>[ 237.192102] Out of memory: Kill process 4492 (dsm_sa_datamgrd) score 1 or sacrifice child
 <3>[ 237.746301] Out of memory: Kill process 4552 (dsm_sa_eventmgr) score 1 or sacrifice child
 <3>[ 237.968308] Out of memory: Kill process 4613 (dsm_sa_snmpd) score 1 or sacrifice child
 <3>[ 238.190550] Out of memory: Kill process 4614 (dsm_sa_snmpd) score 1 or sacrifice child
 <3>[ 238.644020] Out of memory: Kill process 4643 (dsm_om_connsvcd) score 1 or sacrifice child
 <3>[ 238.865658] Out of memory: Kill process 4643 (dsm_om_connsvcd) score 1 or sacrifice child
 <3>[ 251.285450] Out of memory: Kill process 4643 (dsm_om_connsvcd) score 1 or sacrifice child
 <3>[ 251.506601] Out of memory: Kill process 4800 (dsm_om_shrsvcd) score 1 or sacrifice child
 <3>[ 251.727570] Out of memory: Kill process 4842 (cmcld) score 1 or sacrifice child
 <3>[ 251.947085] Out of memory: Kill process 4842 (cmcld) score 1 or sacrifice child
 <3>[ 252.167096] Out of memory: Kill process 4854 (cmlogd) score 1 or sacrifice child
 <3>[ 252.384090] Out of memory: Kill process 4855 (cmfileassistd) score 1 or sacrifice child
 <3>[ 252.603324] Out of memory: Kill process 4924 (mingetty) score 1 or sacrifice child
 <3>[ 252.820757] Out of memory: Kill process 4926 (mingetty) score 1 or sacrifice child
 <3>[ 253.037558] Out of memory: Kill process 4928 (mingetty) score 1 or sacrifice child
 <3>[ 253.254908] Out of memory: Kill process 4930 (mingetty) score 1 or sacrifice child
 <3>[ 253.257391] Out of memory: Kill process 4932 (mingetty) score 1 or sacrifice child
 <3>[ 253.259357] Out of memory: Kill process 4934 (mingetty) score 1 or sacrifice child
 <3>[ 253.261353] Out of memory: Kill process 5060 (sshd) score 1 or sacrifice child
 <3>[ 253.263365] Out of memory: Kill process 5060 (sshd) score 1 or sacrifice child
 <3>[ 253.264392] Out of memory: Kill process 5079 (bash) score 1 or sacrifice child
 <3>[ 253.266352] Out of memory: Kill process 5257 (jnx_mlnxsnmp_da) score 1 or sacrifice child
 <0>[ 253.529344] Kernel panic - not syncing: Out of memory and no killable processes...
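To get a quick feel for a log like this, the kill messages can be summarized with a small script. The sketch below is only an illustration: summarize_oom is a made-up helper, the regular expression assumes the exact message format shown above, and the sample is a five-line excerpt of that output rather than the full log.

```python
import re
from collections import Counter

# Matches e.g. "[ 223.556616] Out of memory: Kill process 3189 (portreserve)".
# The final panic line does not contain "Kill process", so it is not counted.
OOM_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+Out of memory: Kill process (\d+) \(([^)]+)\)"
)


def summarize_oom(log_text):
    """Return the kill count, the time span, and victims selected more than once."""
    events = [
        (float(ts), int(pid), comm) for ts, pid, comm in OOM_RE.findall(log_text)
    ]
    if not events:
        return {"count": 0, "span_seconds": 0.0, "repeated": {}}
    per_victim = Counter((pid, comm) for _, pid, comm in events)
    repeated = {victim: n for victim, n in per_victim.items() if n > 1}
    return {
        "count": len(events),
        "span_seconds": events[-1][0] - events[0][0],
        "repeated": repeated,
    }


# Excerpt of the `log -m | grep Out` output above.
sample_log = """\
<3>[ 223.556616] Out of memory: Kill process 3189 (portreserve) score 1 or sacrifice child
<3>[ 229.033466] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
<3>[ 229.257710] Out of memory: Kill process 3782 (xinetd) score 1 or sacrifice child
<3>[ 253.266352] Out of memory: Kill process 5257 (jnx_mlnxsnmp_da) score 1 or sacrifice child
<0>[ 253.529344] Kernel panic - not syncing: Out of memory and no killable processes...
"""

summary = summarize_oom(sample_log)
print(summary)
```

Two things stand out in the real log: every victim has score 1, and some PIDs are selected several times in a row. Both are consistent with the OOM killer finding nothing worth killing, because the memory was not held by processes in the first place.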

sys output:

 crash> sys
       KERNEL: /usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux
     DUMPFILE: pcdata03.vmcore.flat  [PARTIAL DUMP]
         CPUS: 32
         DATE: Wed Feb 6 02:11:52 2013
       UPTIME: 00:04:12
 LOAD AVERAGE: 3.03, 0.95, 0.34
        TASKS: 578
     NODENAME: ....
      RELEASE: 2.6.32-279.5.2.el6.x86_64
      VERSION: #1 SMP Fri Aug 24 01:07:11 UTC 2012
      MACHINE: x86_64 (2700 Mhz)
       MEMORY: 64 GB
        PANIC: "[ 253.529344] Kernel panic - not syncing: Out of memory and no killable processes..."

kmem -z

 crash> kmem -z NODE: 0 ZONE: 0 ADDR: ffff88000000a0c0 NAME: "DMA" SIZE: 4095 PRESENT: 3839 MIN/LOW/HIGH: 5/6/7 VM_STAT: NR_FREE_PAGES: 3936 NR_INACTIVE_ANON: 0 NR_ACTIVE_ANON: 0 NR_INACTIVE_FILE: 0 NR_ACTIVE_FILE: 0 NR_UNEVICTABLE: 0 NR_MLOCK: 0 NR_ANON_PAGES: 0 NR_FILE_MAPPED: 0 NR_FILE_PAGES: 0 NR_FILE_DIRTY: 0 NR_WRITEBACK: 0 NR_SLAB_RECLAIMABLE: 0 NR_SLAB_UNRECLAIMABLE: 0 NR_PAGETABLE: 0 NR_KERNEL_STACK: 0 NR_UNSTABLE_NFS: 0 NR_BOUNCE: 0 NR_VMSCAN_WRITE: 0 NR_VMSCAN_IMMEDIATE: 0 NR_WRITEBACK_TEMP: 0 NR_ISOLATED_ANON: 0 NR_ISOLATED_FILE: 0 NR_SHMEM: 0 NUMA_HIT: 0 NUMA_MISS: 0 NUMA_FOREIGN: 0 NUMA_INTERLEAVE_HIT: 0 NUMA_LOCAL: 0 NUMA_OTHER: 0 NR_ANON_TRANSPARENT_HUGEPAGES: 0 NODE: 0 ZONE: 1 ADDR: ffff880000012780 NAME: "DMA32" SIZE: 1044480 PRESENT: 756520 MIN/LOW/HIGH: 1030/1287/1545 VM_STAT: NR_FREE_PAGES: 30117 NR_INACTIVE_ANON: 0 NR_ACTIVE_ANON: 0 NR_INACTIVE_FILE: 1 NR_ACTIVE_FILE: 0 NR_UNEVICTABLE: 0 NR_MLOCK: 0 NR_ANON_PAGES: 0 NR_FILE_MAPPED: 0 NR_FILE_PAGES: 1 NR_FILE_DIRTY: 0 NR_WRITEBACK: 0 NR_SLAB_RECLAIMABLE: 4 NR_SLAB_UNRECLAIMABLE: 4150 NR_PAGETABLE: 0 NR_KERNEL_STACK: 0 NR_UNSTABLE_NFS: 0 NR_BOUNCE: 0 NR_VMSCAN_WRITE: 0 NR_VMSCAN_IMMEDIATE: 0 NR_WRITEBACK_TEMP: 0 NR_ISOLATED_ANON: 0 NR_ISOLATED_FILE: 0 NR_SHMEM: 0 NUMA_HIT: 575606 NUMA_MISS: 3 NUMA_FOREIGN: 0 NUMA_INTERLEAVE_HIT: 0 NUMA_LOCAL: 575598 NUMA_OTHER: 11 NR_ANON_TRANSPARENT_HUGEPAGES: 0 NODE: 0 ZONE: 2 ADDR: ffff88000001ae40 NAME: "Normal" SIZE: 7602176 PRESENT: 7498240 MIN/LOW/HIGH: 10217/12771/15325 VM_STAT: NR_FREE_PAGES: 10443 NR_INACTIVE_ANON: 134 NR_ACTIVE_ANON: 197 NR_INACTIVE_FILE: -47 NR_ACTIVE_FILE: 42 NR_UNEVICTABLE: 0 NR_MLOCK: 0 NR_ANON_PAGES: 219 NR_FILE_MAPPED: 115 NR_FILE_PAGES: 45 NR_FILE_DIRTY: 0 NR_WRITEBACK: 0 NR_SLAB_RECLAIMABLE: 908 NR_SLAB_UNRECLAIMABLE: 18771 NR_PAGETABLE: 91 NR_KERNEL_STACK: 556 NR_UNSTABLE_NFS: 0 NR_BOUNCE: 0 NR_VMSCAN_WRITE: 0 NR_VMSCAN_IMMEDIATE: 0 NR_WRITEBACK_TEMP: 0 NR_ISOLATED_ANON: 0 NR_ISOLATED_FILE: 0 NR_SHMEM: 34 NUMA_HIT: 8243991 
NUMA_MISS: 648 NUMA_FOREIGN: 4593726 NUMA_INTERLEAVE_HIT: 20066 NUMA_LOCAL: 8243829 NUMA_OTHER: 810 NR_ANON_TRANSPARENT_HUGEPAGES: 0 NODE: 0 ZONE: 3 ADDR: ffff880000023500 NAME: "Movable" [unpopulated] NODE: 1 ZONE: 0 ADDR: ffff880840000040 NAME: "DMA" [unpopulated] NODE: 1 ZONE: 1 ADDR: ffff880840008700 NAME: "DMA32" [unpopulated] NODE: 1 ZONE: 2 ADDR: ffff880840010dc0 NAME: "Normal" SIZE: 8388608 PRESENT: 8273920 MIN/LOW/HIGH: 11274/14092/16911 VM_STAT: NR_FREE_PAGES: 10114 NR_INACTIVE_ANON: 417 NR_ACTIVE_ANON: 83 NR_INACTIVE_FILE: 47 NR_ACTIVE_FILE: 32 NR_UNEVICTABLE: 0 NR_MLOCK: 0 NR_ANON_PAGES: 436 NR_FILE_MAPPED: 22 NR_FILE_PAGES: 154 NR_FILE_DIRTY: 0 NR_WRITEBACK: 0 NR_SLAB_RECLAIMABLE: 863 NR_SLAB_UNRECLAIMABLE: 21939 NR_PAGETABLE: 134 NR_KERNEL_STACK: 27 NR_UNSTABLE_NFS: 0 NR_BOUNCE: 0 NR_VMSCAN_WRITE: 3 NR_VMSCAN_IMMEDIATE: 5 NR_WRITEBACK_TEMP: 0 NR_ISOLATED_ANON: 0 NR_ISOLATED_FILE: 23 NR_SHMEM: 20 NUMA_HIT: 4332488 NUMA_MISS: 4593726 NUMA_FOREIGN: 665 NUMA_INTERLEAVE_HIT: 20007 NUMA_LOCAL: 4309300 NUMA_OTHER: 4616914 NR_ANON_TRANSPARENT_HUGEPAGES: 0 NODE: 1 ZONE: 3 ADDR: ffff880840019480 NAME: "Movable" [unpopulated] 

kmem -f

 crash> kmem -f NODE 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 0 DMA 4095 3936 ffffea0000000038 1000 0 AREA SIZE FREE_AREA_STRUCT BLOCKS PAGES 0 4k ffff880000012128 2 2 0 4k ffff880000012138 0 0 0 4k ffff880000012148 0 0 0 4k ffff880000012158 0 0 0 4k ffff880000012168 0 0 1 8k ffff880000012180 1 2 1 8k ffff880000012190 0 0 1 8k ffff8800000121a0 0 0 1 8k ffff8800000121b0 0 0 1 8k ffff8800000121c0 0 0 2 16k ffff8800000121d8 1 4 2 16k ffff8800000121e8 0 0 2 16k ffff8800000121f8 0 0 2 16k ffff880000012208 0 0 2 16k ffff880000012218 0 0 3 32k ffff880000012230 1 8 3 32k ffff880000012240 0 0 3 32k ffff880000012250 0 0 3 32k ffff880000012260 0 0 3 32k ffff880000012270 0 0 4 64k ffff880000012288 1 16 4 64k ffff880000012298 0 0 4 64k ffff8800000122a8 0 0 4 64k ffff8800000122b8 0 0 4 64k ffff8800000122c8 0 0 5 128k ffff8800000122e0 0 0 5 128k ffff8800000122f0 0 0 5 128k ffff880000012300 0 0 5 128k ffff880000012310 0 0 5 128k ffff880000012320 0 0 6 256k ffff880000012338 1 64 6 256k ffff880000012348 0 0 6 256k ffff880000012358 0 0 6 256k ffff880000012368 0 0 6 256k ffff880000012378 0 0 7 512k ffff880000012390 0 0 7 512k ffff8800000123a0 0 0 7 512k ffff8800000123b0 0 0 7 512k ffff8800000123c0 0 0 7 512k ffff8800000123d0 0 0 8 1024k ffff8800000123e8 1 256 8 1024k ffff8800000123f8 0 0 8 1024k ffff880000012408 0 0 8 1024k ffff880000012418 0 0 8 1024k ffff880000012428 0 0 9 2048k ffff880000012440 0 0 9 2048k ffff880000012450 0 0 9 2048k ffff880000012460 0 0 9 2048k ffff880000012470 1 512 9 2048k ffff880000012480 0 0 10 4096k ffff880000012498 0 0 10 4096k ffff8800000124a8 0 0 10 4096k ffff8800000124b8 3 3072 10 4096k ffff8800000124c8 0 0 10 4096k ffff8800000124d8 0 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 1 DMA32 1044480 30117 ffffea0000038000 1000000 4095 AREA SIZE FREE_AREA_STRUCT BLOCKS PAGES 0 4k ffff88000001a7e8 24 24 0 4k ffff88000001a7f8 4 4 0 4k ffff88000001a808 13 13 0 4k ffff88000001a818 0 0 0 4k ffff88000001a828 0 0 1 8k ffff88000001a840 2 4 1 8k 
ffff88000001a850 2 4 1 8k ffff88000001a860 4 8 1 8k ffff88000001a870 0 0 1 8k ffff88000001a880 0 0 2 16k ffff88000001a898 0 0 2 16k ffff88000001a8a8 3 12 2 16k ffff88000001a8b8 4 16 2 16k ffff88000001a8c8 0 0 2 16k ffff88000001a8d8 0 0 3 32k ffff88000001a8f0 0 0 3 32k ffff88000001a900 3 24 3 32k ffff88000001a910 3 24 3 32k ffff88000001a920 0 0 3 32k ffff88000001a930 0 0 4 64k ffff88000001a948 1 16 4 64k ffff88000001a958 3 48 4 64k ffff88000001a968 6 96 4 64k ffff88000001a978 0 0 4 64k ffff88000001a988 0 0 5 128k ffff88000001a9a0 0 0 5 128k ffff88000001a9b0 3 96 5 128k ffff88000001a9c0 7 224 5 128k ffff88000001a9d0 0 0 5 128k ffff88000001a9e0 0 0 6 256k ffff88000001a9f8 0 0 6 256k ffff88000001aa08 1 64 6 256k ffff88000001aa18 6 384 6 256k ffff88000001aa28 0 0 6 256k ffff88000001aa38 0 0 7 512k ffff88000001aa50 1 128 7 512k ffff88000001aa60 0 0 7 512k ffff88000001aa70 8 1024 7 512k ffff88000001aa80 0 0 7 512k ffff88000001aa90 0 0 8 1024k ffff88000001aaa8 1 256 8 1024k ffff88000001aab8 1 256 8 1024k ffff88000001aac8 5 1280 8 1024k ffff88000001aad8 0 0 8 1024k ffff88000001aae8 0 0 9 2048k ffff88000001ab00 0 0 9 2048k ffff88000001ab10 1 512 9 2048k ffff88000001ab20 3 1536 9 2048k ffff88000001ab30 1 512 9 2048k ffff88000001ab40 0 0 10 4096k ffff88000001ab58 0 0 10 4096k ffff88000001ab68 0 0 10 4096k ffff88000001ab78 22 22528 10 4096k ffff88000001ab88 1 1024 10 4096k ffff88000001ab98 0 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 2 Normal 7602176 10443 ffffea0003800000 100000000 1048575 AREA SIZE FREE_AREA_STRUCT BLOCKS PAGES 0 4k ffff880000022ea8 365 365 0 4k ffff880000022eb8 274 274 0 4k ffff880000022ec8 274 274 0 4k ffff880000022ed8 0 0 0 4k ffff880000022ee8 0 0 1 8k ffff880000022f00 99 198 1 8k ffff880000022f10 94 188 1 8k ffff880000022f20 360 720 1 8k ffff880000022f30 0 0 1 8k ffff880000022f40 0 0 2 16k ffff880000022f58 30 120 2 16k ffff880000022f68 41 164 2 16k ffff880000022f78 204 816 2 16k ffff880000022f88 0 0 2 16k ffff880000022f98 0 0 3 32k 
ffff880000022fb0 9 72 3 32k ffff880000022fc0 19 152 3 32k ffff880000022fd0 138 1104 3 32k ffff880000022fe0 0 0 3 32k ffff880000022ff0 0 0 4 64k ffff880000023008 7 112 4 64k ffff880000023018 4 64 4 64k ffff880000023028 77 1232 4 64k ffff880000023038 0 0 4 64k ffff880000023048 0 0 5 128k ffff880000023060 3 96 5 128k ffff880000023070 3 96 5 128k ffff880000023080 43 1376 5 128k ffff880000023090 0 0 5 128k ffff8800000230a0 0 0 6 256k ffff8800000230b8 0 0 6 256k ffff8800000230c8 0 0 6 256k ffff8800000230d8 13 832 6 256k ffff8800000230e8 0 0 6 256k ffff8800000230f8 0 0 7 512k ffff880000023110 0 0 7 512k ffff880000023120 0 0 7 512k ffff880000023130 5 640 7 512k ffff880000023140 0 0 7 512k ffff880000023150 0 0 8 1024k ffff880000023168 0 0 8 1024k ffff880000023178 0 0 8 1024k ffff880000023188 0 0 8 1024k ffff880000023198 0 0 8 1024k ffff8800000231a8 0 0 9 2048k ffff8800000231c0 0 0 9 2048k ffff8800000231d0 0 0 9 2048k ffff8800000231e0 1 512 9 2048k ffff8800000231f0 0 0 9 2048k ffff880000023200 0 0 10 4096k ffff880000023218 0 0 10 4096k ffff880000023228 0 0 10 4096k ffff880000023238 0 0 10 4096k ffff880000023248 1 1024 10 4096k ffff880000023258 0 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 3 Movable 0 0 0 0 0 -------------------------------------------------------------------------- NODE 1 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 0 DMA 0 0 0 0 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 1 DMA32 0 0 0 0 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 2 Normal 8388608 10114 ffffea001ce00000 840000000 0 AREA SIZE FREE_AREA_STRUCT BLOCKS PAGES 0 4k ffff880840018e28 405 405 0 4k ffff880840018e38 162 162 0 4k ffff880840018e48 317 317 0 4k ffff880840018e58 0 0 0 4k ffff880840018e68 0 0 1 8k ffff880840018e80 106 212 1 8k ffff880840018e90 70 140 1 8k ffff880840018ea0 269 538 1 8k ffff880840018eb0 0 0 1 8k ffff880840018ec0 0 0 2 16k ffff880840018ed8 24 96 2 16k ffff880840018ee8 18 72 2 16k ffff880840018ef8 207 828 2 16k ffff880840018f08 0 0 2 16k 
ffff880840018f18 0 0 3 32k ffff880840018f30 20 160 3 32k ffff880840018f40 4 32 3 32k ffff880840018f50 148 1184 3 32k ffff880840018f60 0 0 3 32k ffff880840018f70 0 0 4 64k ffff880840018f88 17 272 4 64k ffff880840018f98 2 32 4 64k ffff880840018fa8 95 1520 4 64k ffff880840018fb8 0 0 4 64k ffff880840018fc8 0 0 5 128k ffff880840018fe0 4 128 5 128k ffff880840018ff0 1 32 5 128k ffff880840019000 37 1184 5 128k ffff880840019010 0 0 5 128k ffff880840019020 0 0 6 256k ffff880840019038 0 0 6 256k ffff880840019048 0 0 6 256k ffff880840019058 8 512 6 256k ffff880840019068 0 0 6 256k ffff880840019078 0 0 7 512k ffff880840019090 0 0 7 512k ffff8808400190a0 0 0 7 512k ffff8808400190b0 1 128 7 512k ffff8808400190c0 0 0 7 512k ffff8808400190d0 0 0 8 1024k ffff8808400190e8 0 0 8 1024k ffff8808400190f8 0 0 8 1024k ffff880840019108 1 256 8 1024k ffff880840019118 0 0 8 1024k ffff880840019128 0 0 9 2048k ffff880840019140 0 0 9 2048k ffff880840019150 0 0 9 2048k ffff880840019160 1 512 9 2048k ffff880840019170 1 512 9 2048k ffff880840019180 0 0 10 4096k ffff880840019198 0 0 10 4096k ffff8808400191a8 0 0 10 4096k ffff8808400191b8 0 0 10 4096k ffff8808400191c8 1 1024 10 4096k ffff8808400191d8 0 0 ZONE NAME SIZE FREE MEM_MAP START_PADDR START_MAPNR 3 Movable 0 0 0 0 0 nr_free_pages: 54610 (found 54742) 

A little exercise with foreach bt

 crash> foreach bt | awk '$1 == "#0" { $2 = ""; print }' | sort | uniq -c
      31 #0 crash_nmi_callback at ffffffff81029df6
       1 #0 machine_kexec at ffffffff8103281b
     546 #0 schedule at ffffffff814fda62

So in fact the tasks were either in the crash path or waiting for memory (or I am not reading this correctly).
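The awk/sort/uniq pipeline above counts tasks by their innermost stack frame. A minimal Python sketch of the same idea follows, assuming the usual bt layout of one "PID: ... COMMAND: ..." header per task followed by numbered frame lines. The helper name top_frames is made up, and the stack addresses and the #1 frame in the sample are placeholders, not values from the dump.

```python
from collections import Counter


def top_frames(bt_output: str) -> Counter:
    """Count tasks by the function name in frame #0 of each backtrace."""
    counts = Counter()
    for line in bt_output.splitlines():
        fields = line.split()
        # Frame lines look like: #0 [<stack addr>] <function> at <text addr>
        if len(fields) >= 3 and fields[0] == "#0":
            counts[fields[2]] += 1
    return counts


# Illustrative input: function names and text addresses come from the
# output above; the bracketed stack addresses and the #1 frame are invented.
sample_bt = """\
PID: 0      TASK: ffffffff81a8d020  CPU: 0   COMMAND: "swapper"
 #0 [ffff880000000000] crash_nmi_callback at ffffffff81029df6
 #1 [ffff880000000010] hypothetical_caller at ffffffff00000000
PID: 1      TASK: ffff88082c41b500  CPU: 1   COMMAND: "init"
 #0 [ffff880000000020] schedule at ffffffff814fda62
PID: 1047   TASK: ffff881029524040  CPU: 2   COMMAND: "udevd"
 #0 [ffff880000000030] schedule at ffffffff814fda62
"""

for func, n in top_frames(sample_bt).most_common():
    print(f"{n:7d} #0 {func}")
```

Counting only frame #0 says where each task stopped, not why. A task sitting in schedule may be sleeping for many reasons, so the deeper frames are what would show whether it was blocked in an allocation path.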

    To check the top 20 consumers of physical memory (resident set size):

     crash> ps -G | sed 's/>//g' | sort -k 8,8 -n | awk '$8 ~ /[0-9]/{ $8 = $8/1024" MB"; print }' | tail -20 

    To check the number of huge pages:

     crash> p -d nr_huge_pages 

    Update:

    A) The crash dump was captured from the kernel version below.

     $ crash --osrelease vmcore.flat
     2.6.32-279.5.2.el6.x86_64

    B) Let's extract the vmlinux file from the kernel-debug-debuginfo package.

     $ rpm2cpio kernel-debug-debuginfo-2.6.32-279.5.2.el6.x86_64.rpm | \
         cpio -idv ./usr/lib/debug/lib/modules/*/vmlinux

    C) Open the vmcore file with the crash tool.

     $ bunzip2 vmcore.flat.bz2
     $ crash vmcore.flat ./usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux

    D) System information.

     crash> sys
           KERNEL: ./usr/lib/debug/lib/modules/2.6.32-279.5.2.el6.x86_64/vmlinux
         DUMPFILE: vmcore.flat  [PARTIAL DUMP]
             CPUS: 32
             DATE: Tue Feb  5 12:11:52 2013
           UPTIME: 00:04:12
     LOAD AVERAGE: 3.03, 0.95, 0.34
            TASKS: 578
         NODENAME: ...
          RELEASE: 2.6.32-279.5.2.el6.x86_64
          VERSION: #1 SMP Fri Aug 24 01:07:11 UTC 2012
          MACHINE: x86_64  (2700 Mhz)
           MEMORY: 64 GB
            PANIC: "[  253.529344] Kernel panic - not syncing: Out of memory and no killable processes..."

    a) The panic occurred because of an out-of-memory condition, yet the "panic_on_oom" parameter is disabled on this system.

     crash> p -d sysctl_panic_on_oom
     sysctl_panic_on_oom = $6 = 0

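    This parameter enables or disables the panic-on-out-of-memory feature. If it is set to 0, the kernel invokes the mechanism called oom_killer to kill some rogue process. Usually oom_killer can kill the rogue process and the system survives. If it is set to 1, the kernel panics when an out-of-memory condition occurs.

    b) So how did we capture a vmcore at the time of the OOM event?

    Let's look at the source code of mm/oom_kill.c. It says that if nothing on the system can be killed, the kernel either hangs or panics.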

     ++++++
     /* Found nothing?!?! Either we hang forever, or we panic. */
     if (!p) {
             read_unlock(&tasklist_lock);
             cpuset_unlock();
             panic("Out of memory and no killable processes...\n");  <<<------
     }
     ++++++

    So we hit the panic, and since the kdump service was enabled on this system, the vmcore was captured.

    E) Let's check the kernel ring buffer.

     crash> log
     [..]
     [  253.351427] Node 0 DMA free:15744kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15356kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
     [  253.352234] lowmem_reserve[]: 0 2955 32245 32245
     [  253.352812] Node 0 DMA32 free:120436kB min:4120kB low:5148kB high:6180kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:32kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3026080kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:20kB slab_unreclaimable:16600kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:1 all_unreclaimable? no
     [  253.353637] lowmem_reserve[]: 0 0 29290 29290
     [  253.354216] Node 0 Normal free:40580kB min:40868kB low:51084kB high:61300kB active_anon:956kB inactive_anon:536kB active_file:260kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:29992960kB mlocked:0kB dirty:0kB writeback:0kB mapped:460kB shmem:136kB slab_reclaimable:3640kB slab_unreclaimable:75128kB kernel_stack:4448kB pagetables:428kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
     [  253.355047] lowmem_reserve[]: 0 0 0 0
     [  253.355624] Node 1 Normal free:39896kB min:45096kB low:56368kB high:67644kB active_anon:412kB inactive_anon:1668kB active_file:288kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):220kB present:33095680kB mlocked:0kB dirty:0kB writeback:0kB mapped:92kB shmem:80kB slab_reclaimable:3496kB slab_unreclaimable:87864kB kernel_stack:216kB pagetables:564kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
     [  253.356457] lowmem_reserve[]: 0 0 0 0
     [  253.357034] Node 0 DMA: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15744kB
     [  253.358351] Node 0 DMA32: 41*4kB 8*8kB 7*16kB 6*32kB 10*64kB 10*128kB 7*256kB 9*512kB 7*1024kB 5*2048kB 23*4096kB = 120468kB
     [  253.359674] Node 0 Normal: 718*4kB 558*8kB 278*16kB 169*32kB 88*64kB 47*128kB 13*256kB 5*512kB 0*1024kB 1*2048kB 1*4096kB = 40872kB
     [  253.360995] Node 1 Normal: 876*4kB 447*8kB 249*16kB 174*32kB 116*64kB 40*128kB 8*256kB 1*512kB 1*1024kB 2*2048kB 1*4096kB = 40952kB
     [  253.362319] 154 total pagecache pages
     [  253.362502] 0 pages in swap cache
     [  253.362684] Swap cache stats: add 0, delete 0, find 0/0
     [  253.362869] Free swap  = 0kB
     [  253.363050] Total swap = 0kB
     [  253.526814] 16777215 pages RAM
     [  253.526999] 294628 pages reserved
     [  253.527190] 114911 pages shared
     [  253.527372] 16392561 pages non-shared
     [..]
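One thing worth reading off the per-zone lines is whether free memory had dropped below the min watermark. Here is a small sketch that does this mechanically. The helper zones_below_min is hypothetical, and the sample lines are the zone lines above truncated after the high: field.

```python
import re

# Matches the per-zone summary lines, e.g. "Node 0 Normal free:40580kB min:40868kB".
# The buddy-list lines ("Node 0 Normal: 718*4kB ...") do not match.
ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB")


def zones_below_min(log_text):
    """Return (zone name, free kB, min kB) for zones whose free < min."""
    below = []
    for node, zone, free_kb, min_kb in ZONE_RE.findall(log_text):
        if int(free_kb) < int(min_kb):
            below.append((f"Node {node} {zone}", int(free_kb), int(min_kb)))
    return below


# Zone lines from the ring buffer above, truncated for brevity.
sample_log = """\
[  253.351427] Node 0 DMA free:15744kB min:20kB low:24kB high:28kB
[  253.352812] Node 0 DMA32 free:120436kB min:4120kB low:5148kB high:6180kB
[  253.354216] Node 0 Normal free:40580kB min:40868kB low:51084kB high:61300kB
[  253.355624] Node 1 Normal free:39896kB min:45096kB low:56368kB high:67644kB
[  253.357034] Node 0 DMA: 2*4kB 1*8kB 1*16kB = 15744kB
"""

for name, free_kb, min_kb in zones_below_min(sample_log):
    print(f"{name}: free {free_kb} kB is below min {min_kb} kB")
```

On this dump, both Normal zones come out below their min watermark, while DMA and DMA32 still have free pages. That matches an allocation failure in the Normal zones with almost no reclaimable page cache or anonymous memory left in them.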

    F) Let's check the memory state of the system at the time of the crash.

     crash> kmem -i
                   PAGES        TOTAL      PERCENTAGE
      TOTAL MEM  16482587      62.9 GB         ----           <<<--- (1)
           FREE     54610     213.3 MB    0% of TOTAL MEM
           USED  16427977      62.7 GB   99% of TOTAL MEM
         SHARED      4683      18.3 MB    0% of TOTAL MEM
        BUFFERS       118       472 KB    0% of TOTAL MEM
         CACHED        82       328 KB    0% of TOTAL MEM
           SLAB     46635     182.2 MB    0% of TOTAL MEM

     TOTAL SWAP         0            0         ----           <<<--- (2)
      SWAP USED         0            0  100% of TOTAL SWAP
      SWAP FREE         0            0    0% of TOTAL SWAP

     crash> p -d totalram_pages
     totalram_pages = $5 = 16482587

     crash> !echo "scale=5;(16482587*4096)/2^30"|bc -q
     62.87607          <<<-----[ Total physical memory is 62.9 GB ]   (1)

     crash> p -d total_swap_pages
     total_swap_pages = $6 = 0 <<<------[ No Swap on the system ]     (2)

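    • We have ~63 GiB of physical memory in total.
    • No swap partition or swap file was created on the system, so this server has no swap.
    • Very little memory is used for cache (328 KB) and buffers (472 KB).
    • Slab usage is also small, only 182.2 MB.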
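The figures above can be turned into a rough accounting of what is left unexplained. This is only a back-of-the-envelope sketch: the categories reported by kmem -i can overlap, the process RSS is reconstructed from the 0.00391006 GB total reported by the gawk one-liner, and the names used here are made up for the illustration.

```python
PAGE_SIZE = 4096  # bytes per page, as in the bc calculation above

# Page counts copied from the kmem -i output.
kmem_i = {
    "TOTAL MEM": 16482587,
    "FREE": 54610,
    "USED": 16427977,
    "BUFFERS": 118,
    "CACHED": 82,
    "SLAB": 46635,
}

# Process RSS: 0.00391006 GB (as printed by the gawk sum) converted to 4 KB pages.
process_rss_pages = round(0.00391006 * 1048576 / 4)


def pages_to_gib(pages: int) -> float:
    """Convert a page count to GiB."""
    return pages * PAGE_SIZE / 2**30


explained = (
    kmem_i["BUFFERS"] + kmem_i["CACHED"] + kmem_i["SLAB"] + process_rss_pages
)
unexplained = kmem_i["USED"] - explained

print(f"total       : {pages_to_gib(kmem_i['TOTAL MEM']):.2f} GiB")
print(f"explained   : {pages_to_gib(explained):.2f} GiB")
print(f"unexplained : {pages_to_gib(unexplained):.2f} GiB")
```

Even allowing for overlap between categories, well over 62 GiB of used memory belongs to none of the usual consumers (page cache, buffers, slab, process memory). That is what points toward pages allocated directly by kernel code, such as a driver, rather than anything visible in ps.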

    G) The total memory allocated to processes is 0.00391006 GiB.

     crash> ps -G | tail -n +2 | cut -b2- | gawk '{mem += $8} END {print "total " mem/1048576 "GB"}'
     total 0.00391006GB

    H) Applications are not using the memory on the system.

     crash> ps -G | sed 's/>//g' | sort -k 8,8 -n | awk '$8 ~ /[0-9]/{ $8 = $8/1024" MB"; print }' | tail -20
     965 2 21 ffff8808292f1500 IN 0.0 0 0 MB [ext4-dio-unwrit]
     966 2 22 ffff8808292d4080 IN 0.0 0 0 MB [ext4-dio-unwrit]
     967 2 23 ffff8808292ce040 IN 0.0 0 0 MB [ext4-dio-unwrit]
     968 2 24 ffff8808299b5540 IN 0.0 0 0 MB [ext4-dio-unwrit]
     969 2 25 ffff880829aa6040 IN 0.0 0 0 MB [ext4-dio-unwrit]
     970 2 26 ffff880827367500 IN 0.0 0 0 MB [ext4-dio-unwrit]
     971 2 27 ffff880827366aa0 IN 0.0 0 0 MB [ext4-dio-unwrit]
     972 2 28 ffff880827366040 IN 0.0 0 0 MB [ext4-dio-unwrit]
     97 2 23 ffff88082c1ac080 IN 0.0 0 0 MB [ksoftirqd/23]
     973 2 29 ffff880827371540 IN 0.0 0 0 MB [ext4-dio-unwrit]
     974 2 30 ffff880827370ae0 IN 0.0 0 0 MB [ext4-dio-unwrit]
     975 2 31 ffff880827370080 IN 0.0 0 0 MB [ext4-dio-unwrit]
     98 2 23 ffff88082c1bb500 IN 0.0 0 0 MB [watchdog/23]
     99 2 24 ffff88082c1baaa0 IN 0.0 0 0 MB [migration/24]
     3171 1 3 ffff880826ccaaa0 IN 0.0 27636 0.234375 MB auditd
     1 0 1 ffff88082c41b500 UN 0.0 19348 0.339844 MB init
     3772 1 0 ffff88102b257500 RU 0.0 64072 0.652344 MB sshd
     1047 1 2 ffff881029524040 IN 0.0 11188 0.925781 MB udevd
     4936 1047 4 ffff880ff342d540 IN 0.0 11184 0.925781 MB udevd
     4937 1047 5 ffff88082a240080 IN 0.0 11184 0.925781 MB udevd

    I) Let's verify the memory tuning parameters on the system.

     crash> p -d sysctl_overcommit_memory
     sysctl_overcommit_memory = $7 = 0

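    This value is a flag that controls memory overcommit. When the flag is 0, the kernel tries to estimate how much free memory is left whenever userspace requests more memory.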

     crash> p -d sysctl_overcommit_ratio
     sysctl_overcommit_ratio = $8 = 50

    When overcommit_memory is set to 2, the committed address space is not allowed to exceed swap plus this percentage of physical RAM.
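To make that rule concrete, here is a small worked example with the values from this dump. It is a simplified sketch: the function name commit_limit_pages is made up, and huge pages, which the kernel excludes from the RAM part, are ignored. Because overcommit_memory is 0 on this system, the limit is not actually enforced here; the number only shows what mode 2 would allow.

```python
PAGE_SIZE = 4096  # bytes per page


def commit_limit_pages(totalram_pages: int, total_swap_pages: int,
                       overcommit_ratio: int) -> int:
    """Commit limit in pages: swap plus overcommit_ratio percent of RAM.

    Simplification: huge pages are not subtracted from totalram_pages.
    """
    return totalram_pages * overcommit_ratio // 100 + total_swap_pages


# Values read from the vmcore above.
limit = commit_limit_pages(
    totalram_pages=16482587,
    total_swap_pages=0,
    overcommit_ratio=50,
)

print(f"commit limit: {limit} pages = {limit * PAGE_SIZE / 2**30:.2f} GiB")
```

With no swap and a ratio of 50, strict overcommit would cap user-space commitments at roughly half of RAM. That setting governs user-space address space only, so it would not have stopped a kernel-side consumer either way.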

     crash> p -d zone_reclaim_mode
     zone_reclaim_mode = $4 = 0

    zone_reclaim_mode lets you choose a more or less aggressive approach to reclaiming memory when a zone runs out of memory. If it is set to zero, no zone reclaim occurs.

     crash> p -d min_free_kbytes
     min_free_kbytes = $3 = 90112  <<<--------[ 88 MB ]

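    This is the minimum amount of memory, in kilobytes, to keep free in the system. The value is used to compute a watermark for each low-memory zone, and each zone is then assigned a number of reserved free pages in proportion to its size. Be careful when setting this parameter, since values that are too low or too high can both be damaging.

    In other words, setting min_free_kbytes too low prevents the system from reclaiming memory. This can result in system hangs and the OOM killer killing multiple processes. However, setting this parameter too high (5-10% of total system memory) will cause the system to run out of memory immediately. Linux is designed to use all available RAM to cache file system data. Setting a high min_free_kbytes value causes the system to spend too much time reclaiming memory.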
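The proportional split can be checked against this dump. The sketch below assumes the calculation used by 2.6.32-era kernels for non-highmem zones: each zone's min watermark is its share of min_free_kbytes by present pages, low adds a quarter of that, and high adds a half. The function name zone_watermarks is made up; the present-page counts are the PRESENT values from the kmem -z output in the question.

```python
PAGE_KB = 4  # KB per page


def zone_watermarks(min_free_kbytes: int, present_pages: dict) -> dict:
    """Return {zone: (min, low, high)} in pages.

    Assumed formula (non-highmem zones):
    min  = pages_min * zone_present / total_present (integer division),
    low  = min + min / 4,
    high = min + min / 2.
    """
    pages_min = min_free_kbytes // PAGE_KB
    total_present = sum(present_pages.values())
    marks = {}
    for zone, present in present_pages.items():
        wmark_min = pages_min * present // total_present
        marks[zone] = (
            wmark_min,
            wmark_min + wmark_min // 4,
            wmark_min + wmark_min // 2,
        )
    return marks


# PRESENT page counts per populated zone, from kmem -z.
present = {
    "Node 0 DMA": 3839,
    "Node 0 DMA32": 756520,
    "Node 0 Normal": 7498240,
    "Node 1 Normal": 8273920,
}

marks = zone_watermarks(90112, present)
for zone, (wmin, wlow, whigh) in marks.items():
    print(f"{zone:14s} MIN/LOW/HIGH: {wmin}/{wlow}/{whigh}")
```

The results agree with the MIN/LOW/HIGH values that kmem -z reports for each zone, so the watermarks on this machine are simply what min_free_kbytes = 90112 produces, and nothing looks mistuned.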

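    The values of the parameters above look fine, so where did my memory go?

    Assumptions:

    1. The main culprit is not in user space. In my experience, unaccounted memory is usually due to the Mellanox and DRBD modules, but I am not sure about your case.
    2. Most pages were dropped from the vmcore file to reduce its size (core_collector makedumpfile -d 31 -c), so I could not check the huge page size.

    Can you run this command?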

     ps -G | tail -n +2 | cut -b2- | gawk '{mem += $8} END {print "total " mem/1048576 "GB"}'

    Also, kmem -z and kmem -f might help.

    Note, though, that the whole swap space is shown as consumed.

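    You should find some OOM messages in the output of the log command. Could you put them on pastebin? I may be able to spot a pattern or a race condition.

    As a side note, the sys output would be very helpful. You know, memory leaks and kernel upgrades are almost synonymous ;)

    Edit: try this yourself: foreach bt

    It shows the backtrace of every PID. Look for any common pattern; probably they are all waiting in the schedule function, which could mean they are waiting for a memory allocation.

    Now, note the TASK value shown for each PID in the foreach bt output.

    Run this: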

     task_struct.tgid <TASK>

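    If they share the same TGID, the tasks you are seeing are threads of a single process.

    I forgot how to check the overcommit value; I will see if I can find the answer.

    I do not have much experience analyzing crash dumps, so I cannot help you with specific advice, but here are some links that I have collected and used from time to time. Maybe you will find something useful:

    • Linux kernel crash book
    • Getting Linux kdump to work with a debug kernel and the crash utility
    • Collecting and analyzing Linux kernel crashes – crash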

    You may know this and have simply not indicated this is what you did, but, there's lots of options to ps.

     # ps aux 

    or

     # ps -edf 

    as root will spit much more detailed information.

    There's a great deal of helpful suggestions on this page to help track down memory issues:

    http://www.linuxnix.com/2011/05/find-ram-utilization-user-linux.html

    I'd check

     # free 

    periodically to see how much memory is actually being used. (Or better yet, have it graphed; graphite or munin are great for this) so that you can visualize when/how your memory is being used.
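If graphing tools are not set up yet, the same numbers that free reports can be read from /proc/meminfo and logged by a small script. The following is a minimal sketch: parse_meminfo and used_kb are made-up helper names, and the sample text is built from the kmem -i figures in the question (pages times 4 KB) rather than captured from a live /proc/meminfo.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def used_kb(info: dict) -> int:
    """Memory in use, excluding buffers and page cache."""
    return (
        info["MemTotal"]
        - info["MemFree"]
        - info.get("Buffers", 0)
        - info.get("Cached", 0)
    )


# Sample derived from the kmem -i page counts in the question (x 4 KB).
sample_meminfo = """\
MemTotal:       65930348 kB
MemFree:          218440 kB
Buffers:             472 kB
Cached:              328 kB
"""

info = parse_meminfo(sample_meminfo)
print(f"used: {used_kb(info)} kB of {info['MemTotal']} kB")
```

On a live host, the text would come from reading /proc/meminfo, for example once a minute, with the result appended to a log together with a timestamp. A used figure that keeps growing while process RSS stays flat would point at a kernel-side consumer, as suspected here.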

    64 gigs of ram is quite a bit; what sort of work does the host do?