服务器重载,就像内存不足,但那不是事实

我有一台运行QEMU-KVM虚拟化的Centos 6.5服务器:

硬件:

  • 40个CPU
  • 400 GB RAM

软件:

  • 内核:2.6.32-431.17.1.el6.x86_64
  • Qemu:0.12.1.2
  • Libvirt:0.10.2

有3位客人,具有相同的硬件configuration:

  • 16个CPU
  • 120 GB RAM

<memory unit ='KiB'> 125829120 </ memory>
<currentMemory unit ='KiB'> 125829120 </ currentMemory>
<vcpu placement ='static'> 16 </ vcpu>

客户正在运行Apache和MySQL。

在主机上只运行虚拟机旁边的一些备份和维护脚本,没有别的。

总是在几天运行后,问题开始显现。 客人的负担随机高达150左右,盗取CPU时间10-15%。 在主机上的负载大约是38-40,用户CPU时间大约30-40%,系统CPU时间40-50%。

当时主机上CPU消耗最多的进程是虚拟客户端的Qemu进程,之后为kswapd0和kswapd1,CPU使用率为100%。

内存使用情况

  • 内存总计378.48 GB
  • 内存使用330.82 GB
  • RAM免费47.66 GB
  • SWAP共计500.24 MB
  • SWAP使用497.13 MB
  • SWAP free 3192 kB

加上10-20 GB的RAM缓冲区。

所以,从内存使用的angular度来看,不应该有任何问题。 但是kswapd进程的繁重工作表明内存不足,在这个方向上也是完全交换点(当我把交换和交换,它会在几分钟内被填满)。 有一段时间,OOM杀手杀死了一些过程:

Nov 20 12:42:42 wv2-f302 kernel: active_anon:79945387 inactive_anon:3660742 isolated_anon:0 Nov 20 12:42:42 wv2-f302 kernel: active_file:252 inactive_file:0 isolated_file:0 Nov 20 12:42:42 wv2-f302 kernel: unevictable:0 dirty:2 writeback:0 unstable:0 Nov 20 12:42:42 wv2-f302 kernel: free:12513746 slab_reclaimable:5001 slab_unreclaimable:1759785 Nov 20 12:42:42 wv2-f302 kernel: mapped:213 shmem:41 pagetables:188243 bounce:0 Nov 20 12:42:42 wv2-f302 kernel: Node 0 DMA free:15728kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15332kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes Nov 20 12:42:42 wv2-f302 kernel: lowmem_reserve[]: 0 2965 193855 193855 Nov 20 12:42:42 wv2-f302 kernel: Node 0 DMA32 free:431968kB min:688kB low:860kB high:1032kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3037072kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes Nov 20 12:42:42 wv2-f302 kernel: lowmem_reserve[]: 0 0 190890 190890 Nov 20 12:42:42 wv2-f302 kernel: Node 0 Normal free:6593828kB min:44356kB low:55444kB high:66532kB active_anon:178841380kB inactive_anon:7783292kB active_file:540kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:195471360kB mlocked:0kB dirty:8kB writeback:0kB mapped:312kB shmem:48kB slab_reclaimable:11136kB slab_unreclaimable:1959664kB kernel_stack:5104kB pagetables:397332kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Nov 20 12:42:42 wv2-f302 kernel: lowmem_reserve[]: 0 0 0 0 Nov 20 12:42:42 wv2-f302 kernel: Node 1 Normal free:43013460kB min:45060kB low:56324kB high:67588kB active_anon:140940168kB inactive_anon:6859676kB active_file:468kB inactive_file:56kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:198574076kB mlocked:0kB dirty:0kB writeback:0kB mapped:540kB shmem:116kB slab_reclaimable:8868kB slab_unreclaimable:5079476kB kernel_stack:2856kB pagetables:355640kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Nov 20 12:42:42 wv2-f302 kernel: lowmem_reserve[]: 0 0 0 0 Nov 20 12:42:42 wv2-f302 kernel: Node 0 DMA: 2*4kB 1*8kB 2*16kB 2*32kB 2*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15728kB Nov 20 12:42:42 wv2-f302 kernel: Node 0 DMA32: 10*4kB 11*8kB 12*16kB 13*32kB 12*64kB 5*128kB 7*256kB 10*512kB 9*1024kB 6*2048kB 98*4096kB = 431968kB Nov 20 12:42:42 wv2-f302 kernel: Node 0 Normal: 1648026*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6592104kB Nov 20 12:42:42 wv2-f302 kernel: Node 1 Normal: 8390977*4kB 1181188*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 43013412kB Nov 20 12:42:42 wv2-f302 kernel: 49429 total pagecache pages Nov 20 12:42:42 wv2-f302 kernel: 48929 pages in swap cache Nov 20 12:42:42 wv2-f302 kernel: Swap cache stats: add 2688331, delete 2639402, find 16219898/16530111 Nov 20 12:42:42 wv2-f302 kernel: Free swap = 3264kB Nov 20 12:42:42 wv2-f302 kernel: Total swap = 512248kB Nov 20 12:42:44 wv2-f302 kernel: 100663294 pages RAM Nov 20 12:42:44 wv2-f302 kernel: 1446311 pages reserved Nov 20 12:42:44 wv2-f302 kernel: 10374115 pages shared Nov 20 12:42:44 wv2-f302 kernel: 84534113 pages non-shared Oct 27 14:24:43 wv2-f302 kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name Oct 27 14:24:43 wv2-f302 kernel: [ 3878] 0 3878 32042399 31569413 10 0 0 qemu_wl52 Oct 27 14:24:43 wv2-f302 kernel: [ 4321] 0 4321 32092081 31599762 20 0 0 qemu_wl51 Oct 27 14:24:43 wv2-f302 kernel: [ 4394] 0 4394 32106979 31575717 15 0 0 qemu_wl50 ... Oct 27 14:24:43 wv2-f302 kernel: Out of memory: Kill process 3878 (qemu_wl52) score 318 or sacrifice child Oct 27 14:24:43 wv2-f302 kernel: Killed process 3878, UID 0, (qemu_wl52) total-vm:128169596kB, anon-rss:126277476kB, file-rss:176kB 

完整转储: http : //evilcigi.eu/msg/msg.txt

然后,我开始杀人的客人,从那一刻起,一切都好,连续几天..与之前的问题一样的内存使用情况:

  • 内存总计378.48 GB
  • 内存使用336.15 GB
  • 无RAM 42.33 GB
  • SWAP共计500.24 MB
  • SWAP使用了344.55 MB
  • SWAP免费155.69 MB

服务器是否有可能严重计数内存? 还是有什么我失踪?

有一件事让我想起,主机把所有空闲的内存放入缓冲区和caching,然后遭受内存不足(调用OOM杀手)? 但是,我认为,不应该发生,对吧? 而且,这并不能解释杀人之前的行为。

先谢谢你。


所以今天再次出现问题,这里是/ proc / meminfo的内容:

 MemTotal: 396867932 kB MemFree: 9720268 kB Buffers: 53354000 kB Cached: 22196 kB SwapCached: 343964 kB Active: 331872796 kB Inactive: 41283992 kB Active(anon): 305458432 kB Inactive(anon): 14322324 kB Active(file): 26414364 kB Inactive(file): 26961668 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 512248 kB SwapFree: 0 kB Dirty: 48 kB Writeback: 0 kB AnonPages: 319438656 kB Mapped: 8536 kB Shmem: 164 kB Slab: 9052784 kB SReclaimable: 2014752 kB SUnreclaim: 7038032 kB KernelStack: 8064 kB PageTables: 650892 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 198946212 kB Committed_AS: 383832752 kB VmallocTotal: 34359738367 kB VmallocUsed: 1824832 kB VmallocChunk: 34157271228 kB HardwareCorrupted: 0 kB AnonHugePages: 31502336 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 7852 kB DirectMap2M: 3102720 kB DirectMap1G: 399507456 kB 

似乎所有的“空闲”内存都花在缓冲区中。


在@Matthew Ife关于内存碎片的提示之后,我压缩了内存,并在主机上删除了caching(释放缓冲区中的60 GB)的caching,命令如下:

 echo 3 > /proc/sys/vm/drop_caches echo 1 >/proc/sys/vm/compact_memory 

下面是内存碎片现在的样子:

 # cat /proc/buddyinfo Node 0, zone DMA 2 1 2 2 2 1 0 0 1 1 3 Node 0, zone DMA32 12 12 13 16 10 5 7 10 9 6 98 Node 0, zone Normal 2398537 469407 144288 97224 58276 24155 8153 3141 1299 451 75 Node 1, zone Normal 9182926 2727543 648104 81843 7915 1267 244 67 3 1 0 

更新2014/11/25 – 服务器再次超载:

 # cat /proc/buddyinfo Node 0, zone DMA 2 1 2 2 2 1 0 0 1 1 3 Node 0, zone DMA32 12 12 13 16 10 5 7 10 9 6 98 Node 0, zone Normal 4374385 85408 0 0 0 0 0 0 0 0 0 Node 1, zone Normal 1830850 261703 460 14 0 0 0 0 0 0 0 # cat /proc/meminfo MemTotal: 396867932 kB MemFree: 28038892 kB Buffers: 49126656 kB Cached: 19088 kB SwapCached: 303624 kB Active: 305426204 kB Inactive: 49729776 kB Active(anon): 292040988 kB Inactive(anon): 13969376 kB Active(file): 13385216 kB Inactive(file): 35760400 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 512248 kB SwapFree: 20 kB Dirty: 28 kB Writeback: 0 kB AnonPages: 305706632 kB Mapped: 9324 kB Shmem: 124 kB Slab: 8616228 kB SReclaimable: 1580736 kB SUnreclaim: 7035492 kB KernelStack: 8200 kB PageTables: 702268 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 198946212 kB Committed_AS: 384014048 kB VmallocTotal: 34359738367 kB VmallocUsed: 1824832 kB VmallocChunk: 34157271228 kB HardwareCorrupted: 0 kB AnonHugePages: 31670272 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 7852 kB DirectMap2M: 3102720 kB DirectMap1G: 399507456 kB 

并在系统日志中的一些页面分配失败:

 Nov 25 09:14:07 wv2-f302 kernel: qemu_wl50: page allocation failure. order:4, mode:0x20 Nov 25 09:14:07 wv2-f302 kernel: Pid: 4444, comm: qemu_wl50 Not tainted 2.6.32-431.17.1.el6.x86_64 #1 Nov 25 09:14:07 wv2-f302 kernel: Call Trace: Nov 25 09:14:07 wv2-f302 kernel: <IRQ> [<ffffffff8112f64a>] ? __alloc_pages_nodemask+0x74a/0x8d0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8116e082>] ? kmem_getpages+0x62/0x170 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8116ec9a>] ? fallback_alloc+0x1ba/0x270 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8116ea19>] ? ____cache_alloc_node+0x99/0x160 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8116fbe0>] ? kmem_cache_alloc_node_trace+0x90/0x200 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8116fdfd>] ? __kmalloc_node+0x4d/0x60 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8144ff5a>] ? __alloc_skb+0x7a/0x180 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81451070>] ? skb_copy+0x40/0xb0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa013a55c>] ? tg3_start_xmit+0xa8c/0xd80 [tg3] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814603e4>] ? dev_hard_start_xmit+0x224/0x480 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8147be6a>] ? sch_direct_xmit+0x15a/0x1c0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814608e8>] ? dev_queue_xmit+0x228/0x320 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c8898>] ? br_dev_queue_push_xmit+0x88/0xc0 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c8928>] ? br_forward_finish+0x58/0x60 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c8ae8>] ? __br_deliver+0x78/0x110 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c8bb5>] ? br_deliver+0x35/0x40 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c78f4>] ? br_dev_xmit+0x114/0x140 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814603e4>] ? dev_hard_start_xmit+0x224/0x480 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8146087d>] ? dev_queue_xmit+0x1bd/0x320 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81466785>] ? neigh_resolve_output+0x105/0x2d0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8149a2f8>] ? ip_finish_output+0x148/0x310 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8149a578>] ? ip_output+0xb8/0xc0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8149983f>] ? __ip_local_out+0x9f/0xb0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81499875>] ? ip_local_out+0x25/0x30 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81499d50>] ? ip_queue_xmit+0x190/0x420 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814af06e>] ? tcp_transmit_skb+0x40e/0x7b0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814b15b0>] ? tcp_write_xmit+0x230/0xa90 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814b2130>] ? __tcp_push_pending_frames+0x30/0xe0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814a9893>] ? tcp_data_snd_check+0x33/0x100 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814ad491>] ? tcp_rcv_established+0x381/0x7f0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814b5873>] ? tcp_v4_do_rcv+0x2e3/0x490 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02b1557>] ? ipv4_confirm+0x87/0x1d0 [nf_conntrack_ipv4] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa0124441>] ? nf_nat_fn+0x91/0x260 [iptable_nat] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814b717a>] ? tcp_v4_rcv+0x51a/0x900 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81494300>] ? ip_local_deliver_finish+0x0/0x2d0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814943dd>] ? ip_local_deliver_finish+0xdd/0x2d0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81494668>] ? ip_local_deliver+0x98/0xa0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81493b2d>] ? ip_rcv_finish+0x12d/0x440 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff814940b5>] ? ip_rcv+0x275/0x350 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81489509>] ? nf_iterate+0x69/0xb0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8145b5db>] ? __netif_receive_skb+0x4ab/0x750 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8145f1f0>] ? netif_receive_skb+0x0/0x60 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8145f248>] ? netif_receive_skb+0x58/0x60 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c9af8>] ? br_handle_frame_finish+0x1e8/0x2a0 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa02c9d5a>] ? br_handle_frame+0x1aa/0x250 [bridge] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8145b659>] ? __netif_receive_skb+0x529/0x750 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8145f248>] ? netif_receive_skb+0x58/0x60 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8145f350>] ? napi_skb_finish+0x50/0x70 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81460ab9>] ? napi_gro_receive+0x39/0x50 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa0136b54>] ? tg3_poll_work+0xc24/0x1020 [tg3] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa0136f9c>] ? tg3_poll_msix+0x4c/0x150 [tg3] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff81460bd3>] ? net_rx_action+0x103/0x2f0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff810a6da9>] ? ktime_get+0x69/0xf0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8107a551>] ? __do_softirq+0xc1/0x1e0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff810e6b20>] ? handle_IRQ_event+0x60/0x170 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8100fa75>] ? do_softirq+0x65/0xa0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8107a405>] ? irq_exit+0x85/0x90 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff815312c5>] ? do_IRQ+0x75/0xf0 Nov 25 09:14:07 wv2-f302 kernel: <EOI> [<ffffffffa018e271>] ? kvm_arch_vcpu_ioctl_run+0x4c1/0x10b0 [kvm] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa018e25f>] ? kvm_arch_vcpu_ioctl_run+0x4af/0x10b0 [kvm] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff810aee2e>] ? futex_wake+0x10e/0x120 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffffa0175b04>] ? kvm_vcpu_ioctl+0x434/0x580 [kvm] Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8119d802>] ? vfs_ioctl+0x22/0xa0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8119dcca>] ? do_vfs_ioctl+0x3aa/0x580 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff810b186b>] ? sys_futex+0x7b/0x170 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8119df21>] ? sys_ioctl+0x81/0xa0 Nov 25 09:14:07 wv2-f302 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b 

编辑:通过减less分配给客人的内存解决了问题。 现在有3个客户,每个80 GB RAM,为主机系统留下大约150 GB RAM:

 # free -h total used free shared buff/cache available Mem: 377G 243G 29G 1,9G 104G 132G 

这里有很多可用的内存,但是这些区域是完全分散的:

 Node 0 Normal: 1648026*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6592104kB Node 1 Normal: 8390977*4kB 1181188*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 

很less有非零的订单页面,剩下的一个区域都没有。

我不能保证什么,但你可能想尝试closuresksmd和重新紧凑的内存。 压缩只会在高阶页面分配时被自动调用,并且不会调用oom-killer,所以我假设系统试图从命令2或3分配内存并卡住。

为了压缩内存,运行echo 1 >/proc/sys/vm/compact_memory

在这个问题上只有这么多,但我怀疑ksmd是通过扫描两个虚拟机中重复的页面并将其全部交换来造成碎片。

如果使用qemu和-snapshot来避免磁盘映像被修改,那么qemu可以像这样保持磁盘修改到一个临时文件中,这个临时文件可能会填满像/ dev / shm这样的分区或内存盘。

有关于此的更多信息,请访问http://blog.nielshorn.net/2011/03/qemu-snapshots-temporary-files-where-they-go-and-how-to-change-that/

简而言之,上面的页面表示临时文件在打开后是未链接的,因此即使占用空间也不能在文件系统中看到。 为了避免这些问题,你应该确保在启动qemu之前,环境variablesTMPDIR指向一个有足够空间的目录。

通过减less为客人分配的内存解决了问题。 现在有3个客户,每个80 GB RAM,为主机系统留下大约150 GB RAM:

 # free -h total used free shared buff/cache available Mem: 377G 243G 29G 1,9G 104G 132G 

感觉像一个大的记忆浪费,但事情现在是稳定的。

@马修的答案应该被标记为解决这个问题。 /proc/buddyinfo清楚地显示了碎片(由于ksmd或其他行为)。 内存压缩是一个有效的解决scheme。

我们只是在我们的服务器上遇到同样的问题:

 # cat /proc/buddyinfo Node 0, zone DMA 1 0 1 0 0 1 0 0 0 1 3 Node 0, zone DMA32 4941 14025 10661 1462 1715 154 1 0 0 0 0 Node 0, zone Normal 420283 217678 3852 3 1 0 1 1 1 0 0 Node 1, zone Normal 1178429 294431 21420 340 7 2 1 2 0 0 0 

这清楚地显示了碎片,因为大多数内存在许多小块内存中被分割(左边的大数字,右边的零)。

现在压实解决了这个问题:

 # echo 1 >/proc/sys/vm/compact_memory # cat /proc/buddyinfo Node 0, zone DMA 1 0 1 0 0 1 0 0 0 1 3 Node 0, zone DMA32 485 1746 8588 3311 2076 505 98 19 3 0 0 Node 0, zone Normal 83764 22474 8597 3130 1971 1421 1090 808 556 358 95 Node 1, zone Normal 51928 36053 36093 29024 21498 13148 5719 1405 151 8 0