在EC2 ebs支持的xlarge ubuntu实例上,oom-killer正在被调用.
从下面的/ var / log / syslog输出看来,ZONE_NORMAL的内存不足:
从下面的/ var / log / syslog输出看来,ZONE_NORMAL的内存不足:
Node 0 Normal free:11344kB min:11556kB low:14444kB high:17332kB active_anon:10936284kB inactive_anon:144kB active_file:688kB inactive_file:740kB
但是为什么ZONE_NORMAL只分配了11MB的15GB总RAM?还是有其他原因导致内存不足?
机器上的RAM(xlarge实例)为15GB.以下日志中RSS列的总和为3.7GB,total_vm的列为11.4GB.
Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456146] node invoked oom-killer: gfp_mask=0x84d0,order=0,oom_adj=0,oom_score_adj=0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456153] node cpuset=/ mems_allowed=0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456157] Pid: 639,comm: node Not tainted 2.6.38-1-virtual #28-Ubuntu Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456160] Call Trace: Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456170] [<ffffffff810b259d>] ? cpuset_print_task_mems_allowed+0x9d/0xb0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456189] [<ffffffff81106a01>] ? dump_header+0x91/0x1e0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456196] [<ffffffff81007e7f>] ? xen_restore_fl_direct_end+0x0/0x1 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456203] [<ffffffff815c92de>] ? _raw_spin_unlock_irqrestore+0x1e/0x30 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456208] [<ffffffff8110705d>] ? oom_kill_process+0x8d/0x180 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456213] [<ffffffff81107492>] ? out_of_memory+0x102/0x240 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456218] [<ffffffff8110c209>] ? __alloc_pages_nodemask+0x709/0x7f0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456224] [<ffffffff81105c78>] ? filemap_fault+0x398/0x490 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456230] [<ffffffff811414bd>] ? alloc_pages_current+0x9d/0x110 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456235] [<ffffffff8104338b>] ? pte_alloc_one+0x1b/0x50 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456239] [<ffffffff8100530e>] ? xen_pud_val+0xe/0x10 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456245] [<ffffffff811255c5>] ? __pte_alloc+0x35/0x100 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456250] [<ffffffff81129559>] ? handle_mm_fault+0x129/0x250 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456257] [<ffffffff815ccf82>] ? do_page_fault+0x1a2/0x540 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456262] [<ffffffff8109aacb>] ? do_futex+0xbb/0x210 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456266] [<ffffffff8109ac9b>] ? sys_futex+0x7b/0x180 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456274] [<ffffffff814c2975>] ? sys_recvmsg+0x75/0x90 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456278] [<ffffffff815c9a15>] ? page_fault+0x25/0x30 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456282] Mem-Info: Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456284] Node 0 DMA per-cpu: Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456288] cpu 0: hi: 0,btch: 1 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456290] cpu 1: hi: 0,btch: 1 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456293] cpu 2: hi: 0,btch: 1 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456295] cpu 3: hi: 0,btch: 1 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456299] Node 0 DMA32 per-cpu: Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456303] cpu 0: hi: 186,btch: 31 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456307] cpu 1: hi: 186,btch: 31 usd: 81 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456310] cpu 2: hi: 186,btch: 31 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456314] cpu 3: hi: 186,btch: 31 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456317] Node 0 Normal per-cpu: Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456321] cpu 0: hi: 186,btch: 31 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456323] cpu 1: hi: 186,btch: 31 usd: 175 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456326] cpu 2: hi: 186,btch: 31 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456329] cpu 3: hi: 186,btch: 31 usd: 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456334] active_anon:3726183 inactive_anon:44 isolated_anon:0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456335] active_file:214 inactive_file:237 isolated_file:64 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456336] unevictable:0 dirty:1 writeback:0 unstable:0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456337] free:18955 slab_reclaimable:2811 slab_unreclaimable:7277 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456338] mapped:256 shmem:101 pagetables:12634 bounce:0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456341] Node 0 DMA free:15880kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated( anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456353] lowmem_reserve[]: 0 4024 15142 15142 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456360] Node 0 DMA32 free:48596kB min:4180kB low:5224kB high:6268kB active_anon:3968448kB inactive_anon:32kB active_file:168kB inactive_file:208kB unevic table:0kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:0kB writeback:0kB mapped:328kB shmem:80kB slab_reclaimable:2208kB slab_unreclaimable:8604kB kernel_stack:48kB pagetabl es:9724kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:601 all_unreclaimable? yes Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456371] lowmem_reserve[]: 0 0 11117 11117 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456377] Node 0 Normal free:11344kB min:11556kB low:14444kB high:17332kB active_anon:10936284kB inactive_anon:144kB active_file:688kB inactive_file:740kB unevictable:0kB isolated(anon):0kB isolated(file):256kB present:11384720kB mlocked:0kB dirty:4kB writeback:0kB mapped:696kB shmem:324kB slab_reclaimable:9036kB slab_unreclaimable:20504kB kernel_stack:20 16kB pagetables:40812kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2190 all_unreclaimable? yes Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456388] lowmem_reserve[]: 0 0 0 0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456393] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456409] Node 0 DMA32: 715*4kB 525*8kB 343*16kB 245*32kB 147*64kB 50*128kB 16*256kB 5*512kB 2*1024kB 0*2048kB 1*4096kB = 48996kB Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456423] Node 0 Normal: 1919*4kB 5*8kB 2*16kB 2*32kB 1*64kB 1*128kB 2*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 11588kB Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456438] 768 total pagecache pages Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456440] 0 pages in swap cache Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456443] Swap cache stats: add 0,delete 0,find 0/0 Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456445] Free swap = 0kB Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.456447] Total swap = 0kB Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495285] 3934192 pages RAM Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495290] 91275 pages reserved Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495291] 1661 pages shared Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495293] 3822220 pages non-shared Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495296] [ pid ] uid tgid total_vm RSS cpu oom_adj oom_score_adj name Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495310] [ 228] 0 228 4297 71 2 0 0 upstart-udev-br Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495315] [ 242] 0 242 5353 123 2 -17 -1000 udevd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495319] [ 319] 0 319 5352 126 2 -17 -1000 udevd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495323] [ 320] 0 320 5352 126 0 -17 -1000 udevd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495327] [ 392] 0 392 1771 124 2 0 0 dhclient Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495332] [ 467] 0 467 12366 140 3 -17 -1000 sshd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495336] [ 474] 101 474 47476 160 2 0 0 rsyslogd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495340] [ 502] 102 502 5927 108 2 0 0 dbus-daemon Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495344] [ 534] 0 534 1035 27 3 0 0 getty Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495348] [ 539] 0 539 1035 27 0 0 0 getty Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495352] [ 543] 0 543 13589 2510 1 0 0 supervisord Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495356] [ 544] 0 544 1035 25 2 0 0 getty Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495360] [ 545] 0 545 1035 27 2 0 0 getty Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495364] [ 547] 0 547 1035 27 2 0 0 getty Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495368] [ 552] 0 552 5302 65 1 0 0 cron Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495372] [ 553] 0 553 4753 44 1 0 0 atd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495376] [ 565] 0 565 3946 46 2 0 0 irqbalance Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495380] [ 571] 106 571 35760 3859 1 0 0 memcached Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495384] [ 637] 1000 637 222128 36427 1 0 0 node Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495388] [ 639] 1000 639 189634 4396 1 0 0 node Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495392] [ 696] 0 696 1035 27 3 0 0 getty Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495396] [ 755] 0 755 31312 372 2 0 0 console-kit-dae Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495401] [ 2404] 0 2404 35058 3644 1 0 0 carbon-cache.py Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495405] [ 2601] 1000 2601 5982 199 1 0 0 screen Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495409] [ 2602] 1000 2602 5576 766 0 0 0 bash Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495413] [ 2661] 0 2661 6801 67 1 0 0 sudo Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495417] [ 2663] 0 2663 7133 851 1 0 0 run-graphite-de Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495421] [ 2668] 0 2668 1056 25 1 0 0 sh Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495425] [ 2669] 0 2669 12906 3942 1 0 0 django-admin.py Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495429] [ 2670] 0 2670 50313 11546 1 0 0 python Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495433] [ 2785] 0 2785 15953 489 1 0 0 Nginx Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495437] [ 3123] 1000 3123 1184911 266122 1 0 0 java Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495443] [18135] 0 18135 5884532 9128 1 0 0 mongod Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495449] [12574] 1000 12574 966781 932357 2 0 0 python Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495455] [30139] 1000 30139 759010 741642 1 0 0 python Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495461] [14186] 33 14186 19905 4657 1 0 0 Nginx Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495465] [14187] 33 14187 19726 4527 1 0 0 Nginx Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495470] [14190] 1000 14190 23767 8459 1 0 0 python Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495474] [14201] 1000 14201 1706261 1687237 1 0 0 python Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495478] [24553] 0 24553 19831 200 1 0 0 sshd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495482] [24584] 1000 24584 19831 200 1 0 0 sshd Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495486] [24585] 1000 24585 6464 1655 1 0 0 bash Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495490] Out of memory: Kill process 14201 (python) score 439 or sacrifice child Dec 11 08:53:53 ip-10-60-61-71 kernel: [19427969.495509] Killed process 14201 (python) total-vm:6825044kB,anon-RSS:6748848kB,file-RSS:100kB
谢谢你的帮助!
解决方法
对我来说,看起来非常合法的OOM:
>您有一个NUMA节点(即节点0),它有三个区域:
> DMA有15880kB免费
> DMA32,免费48996kB
>正常,有11588kB免费
>几乎所有的内存3968448kB 10936284kB都在active_anon LRU列表中.>表格中的内存大小以页面为单位(在您的架构上最有可能是4kb),因此它们总计高达14.8Gb,而不是3.7.