ANBOB™


Troubleshooting: High Slab SUnreclaim (kmalloc-8192) Memory Usage on RHEL 7

2024/08/06
Category: System

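I recently ran into two incidents on database hosts running Linux 7. In both, high memory usage by the operating system kernel degraded Oracle RAC performance or made it unusable, and the memory was mostly held in the Slab SUnreclaim area. The cases shared one trait: both hosts used a distributed file storage system. In the case recorded here, the production host had 750G of memory, of which SLAB used close to 200G, almost all of it in SUnreclaim.

What is Slab

In Linux, "slab" is a memory allocation mechanism that belongs to the kernel's memory management subsystem. It is dedicated to allocating and freeing small memory objects. The slab allocator divides memory into multiple slab caches, each holding many objects of the same size that can be allocated and freed quickly. This reduces memory fragmentation and makes allocating and freeing small objects more efficient while keeping memory utilization high. In /proc/meminfo, Slab is split into SReclaimable (reclaimable) and SUnreclaim (unreclaimable).

The two main roles of Slab:

Slab allocates small objects without dedicating a whole page to each one, which saves space.
Some small kernel objects are created and destroyed very frequently. Slab caches them so that objects of the same kind can be reused, which reduces the number of memory allocations.

Symptoms

Operating system memory usage exceeded 90%, and most of it was held by SLAB SUnreclaim.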

oracle@anbob:/home/oracle> cat /etc/os-release 
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.6 (Maipo)"

oracle@anbob:/home/oracle> free -g
              total        used        free      shared  buff/cache   available
Mem:            753         493          36          14         223          30
Swap:            19           6          13

oracle@anbob:/home/oracle> cat /proc/meminfo
MemTotal: 790552132 kB
MemFree: 38262416 kB
MemAvailable: 32045452 kB
Buffers: 177444 kB
Cached: 17232144 kB
SwapCached: 234392 kB
Active: 69777460 kB
Inactive: 15421664 kB
Active(anon): 69205652 kB
Inactive(anon): 14676100 kB
Active(file): 571808 kB
Inactive(file): 745564 kB
Unevictable: 4246792 kB
Mlocked: 4246792 kB
SwapTotal: 20971516 kB
SwapFree: 14217060 kB
Dirty: 2092 kB
Writeback: 0 kB
AnonPages: 72415332 kB
Mapped: 4372664 kB
Shmem: 15343984 kB
Slab: 216883280 kB
SReclaimable: 806496 kB
SUnreclaim: 216076784 kB
KernelStack: 133184 kB
PageTables: 2595304 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 218627868 kB
Committed_AS: 119295936 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 2417732 kB
VmallocChunk: 34357109612 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 192988
HugePages_Free: 39806
HugePages_Rsvd: 419
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 49613824 kB
DirectMap2M: 344141824 kB
DirectMap1G: 412090368 kB

Note:
buff/cache accounts for 200G+. Most of that is Slab, and most of Slab in turn is SUnreclaim, at 200G+.

Matching output of free -k to `/proc/meminfo`

Red Hat Enterprise Linux 7.1 or later

| `free` output | Corresponding `/proc/meminfo` fields |
| --- | --- |
| `Mem: total` | `MemTotal` |
| `Mem: used` | `MemTotal - MemFree - Buffers - Cached - Slab` |
| `Mem: free` | `MemFree` |
| `Mem: shared` | `Shmem` |
| `Mem: buff/cache` | `Buffers + Cached + Slab` |
| `Mem: available` | `MemAvailable` |
| `Swap: total` | `SwapTotal` |
| `Swap: used` | `SwapTotal - SwapFree` |
| `Swap: free` | `SwapFree` |
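
As a cross-check of the mapping above, here is a minimal sketch that recomputes the `free -k` columns from a meminfo-format file. The function name `free_from_meminfo` and the output labels are my own and purely illustrative; the arithmetic is just the RHEL 7.1+ table, so it should not be expected to match `free` on releases that use a different formula.

```shell
# free_from_meminfo is a hypothetical helper name, not a standard tool.
# It recomputes the `free -k` columns from a meminfo-format file using the
# RHEL 7.1+ mapping in the table above. All values are in kB.
free_from_meminfo() {
    awk '
        # Lines look like "MemTotal: 790552132 kB": strip the colon, keep the number.
        { sub(/:$/, "", $1); m[$1] = $2 }
        END {
            cache = m["Buffers"] + m["Cached"] + m["Slab"]
            printf "Mem:total       %.0f\n", m["MemTotal"]
            printf "Mem:used        %.0f\n", m["MemTotal"] - m["MemFree"] - cache
            printf "Mem:free        %.0f\n", m["MemFree"]
            printf "Mem:shared      %.0f\n", m["Shmem"]
            printf "Mem:buff/cache  %.0f\n", cache
            printf "Mem:available   %.0f\n", m["MemAvailable"]
            printf "Swap:total      %.0f\n", m["SwapTotal"]
            printf "Swap:used       %.0f\n", m["SwapTotal"] - m["SwapFree"]
            printf "Swap:free       %.0f\n", m["SwapFree"]
        }
    ' "${1:-/proc/meminfo}"
}
```

Fed the `/proc/meminfo` values shown earlier, buff/cache comes out as 177444 + 17232144 + 216883280 = 234292868 kB, roughly 223 GiB, which agrees with the 223 that `free -g` printed. Slab alone is over 90% of that figure.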

RHEL 6, 7, 8 & 9.

Active(anon): Anonymous memory that has been used more recently and usually not swapped out
Inactive(anon): Anonymous memory that has not been used recently and can be swapped out
Active(file): Pagecache memory that has been used more recently and usually not reclaimed until needed
Inactive(file): Pagecache memory that can be reclaimed without huge performance impact
Unevictable: Unevictable pages can’t be swapped out for a variety of reasons
Mlocked: Pages locked to memory using the mlock() system call. Mlocked pages are also Unevictable.
SwapTotal: Total swap space available
SwapFree: The remaining swap space available
Dirty: Memory waiting to be written back to disk
Writeback: Memory which is actively being written back to disk
AnonPages: Non-file backed pages mapped into userspace page tables
Mapped: Files which have been mmapped, such as libraries
Slab: In-kernel data structures cache
PageTables: Amount of memory dedicated to the lowest level of page tables. This can increase to a high value if a lot of processes are attached to the same shared memory segment.
Shmem: Total used shared memory (shared between several processes, thus including RAM disks, SYS-V-IPC and BSD like SHMEM)
SReclaimable: The part of the Slab that might be reclaimed (such as caches)
SUnreclaim: The part of the Slab that can’t be reclaimed under memory pressure
KernelStack: The memory the kernel stack uses. This is not reclaimable.
WritebackTmp: Memory used by FUSE for temporary writeback buffers
HardwareCorrupted: The amount of RAM the kernel identified as corrupted / not working
AnonHugePages: Non-file backed huge pages mapped into userspace page tables
HugePages_Surp: The number of hugepages in the pool above the value in vm.nr_hugepages. The maximum number of surplus hugepages is controlled by vm.nr_overcommit_hugepages.
DirectMap4k: The amount of memory being mapped into the kernel space with 4k size pages.
DirectMap2M: The amount of memory being mapped into the kernel space with 2MB size pages.
DirectMap1G: The amount of memory being mapped into the kernel space with 1GB size pages.

For more, see the Red Hat article "Interpreting /proc/meminfo and free output for Red Hat Enterprise Linux".

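/proc/slabinfo file information

In Slab, an allocatable memory block is called an object. kmalloc-8 is a general-purpose Slab cache whose objects are 8 bytes each, kmalloc-16 objects are 16 bytes each, and so on. The question is which objects take up the most Slab memory.

Total memory used by each object type = num_objs * objsize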

root@anbob:/root> cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
inode_cache       107092 107253    592   55    8 : tunables    0    0    0 : slabdata   1975   1975      0
dentry            424294 432054    192   42    2 : tunables    0    0    0 : slabdata  10287  10287      0
...
kmalloc-8192      29176803 29176803   8192    4    8 : tunables    0    0    0 : slabdata 7320546 7320546      0  --8192*29176803/1024/1024/1024 = 222G
kmalloc-4096        8205   9064   4096    8    8 : tunables    0    0    0 : slabdata   1133   1133      0
kmalloc-2048       35899  36690   2048   16    8 : tunables    0    0    0 : slabdata   2371   2371      0
kmalloc-1024       67641  69952   1024   32    8 : tunables    0    0    0 : slabdata   2186   2186      0
kmalloc-512       689591 709656    512   64    8 : tunables    0    0    0 : slabdata  11140  11140      0
kmalloc-256       1137831 1324864    256   64    4 : tunables    0    0    0 : slabdata  20701  20701      0
kmalloc-192       763850 816186    192   42    2 : tunables    0    0    0 : slabdata  19433  19433      0
kmalloc-128       485959 499008    128   64    2 : tunables    0    0    0 : slabdata   7797   7797      0
...
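
The num_objs * objsize formula can be applied to every cache at once. The sketch below is one way to do it with awk; `slab_by_size` is a hypothetical helper name, and it assumes the slabinfo 2.1 column order shown in the listing above (name, active_objs, num_objs, objsize).

```shell
# slab_by_size is a hypothetical helper name, not a standard tool.
# It ranks slab caches by num_objs * objsize and prints "<MiB> <cache name>".
# $1: slabinfo-format file (default /proc/slabinfo, which needs root to read)
# $2: number of rows to print (default 10)
slab_by_size() {
    awk '
        # Skip the two header lines ("slabinfo - version ..." and "# name ...").
        /^slabinfo/ || /^#/ { next }
        # Column 3 is num_objs and column 4 is objsize in bytes.
        { printf "%.1f %s\n", $3 * $4 / 1024 / 1024, $1 }
    ' "${1:-/proc/slabinfo}" | sort -k1,1nr | head -n "${2:-10}"
}
```

On the listing above, the first row is `227943.8 kmalloc-8192`, which is the same roughly 222G figure as the inline calculation, expressed in MiB. The next largest caches in the visible part of the listing, kmalloc-512 and kmalloc-256, are only a few hundred MiB each.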

You can also use slabtop to list the top consumers:

slabtop --sort c --once | head -n12

/bin/slabtop --once

To find the cause of a slab memory leak, the crash tool can be used for static analysis and the perf tool for dynamic analysis.

 
crash> kmem -S kmalloc-8192|tail -n 10
crash> rd [memory address] 512 -S
-- or --

perf record -a -e kmem:kmalloc --filter 'bytes_alloc == 8192' -e kmem:kfree --filter ' ptr != 0' sleep 200
perf script > testperf.txt
cat testperf.txt
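
Reading testperf.txt by eye is slow when the capture is large. As a rough sketch, the awk below pairs each kmalloc event with a later kfree of the same pointer and counts what is left over per call site. `perf_unfreed_by_site` is a hypothetical helper name, and the sketch assumes the `perf script` lines carry `call_site=` and `ptr=` fields as the kmem tracepoints normally print them; check a few lines of your own output first.

```shell
# perf_unfreed_by_site is a hypothetical helper name, not part of perf.
# It prints "<count> <call_site>" for kmalloc events that had no matching
# kfree inside the capture window, largest count first.
# Assumption: each event line contains call_site=... and ptr=... fields.
perf_unfreed_by_site() {
    awk '
        # Return the value of a key=value field on the current line, or "".
        function field(key,    i) {
            for (i = 1; i <= NF; i++)
                if (index($i, key "=") == 1)
                    return substr($i, length(key) + 2)
            return ""
        }
        /kmem:kmalloc:/ { live[field("ptr")] = field("call_site") }
        /kmem:kfree:/   { delete live[field("ptr")] }
        END {
            for (p in live) count[live[p]]++
            for (s in count) printf "%d %s\n", count[s], s
        }
    ' "${1:-testperf.txt}" | sort -k1,1nr
}
```

An allocation that is still live when the capture ends is only a candidate: it may simply be freed later. Call sites whose count keeps climbing across longer captures are the ones worth taking to crash for a closer look.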

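Solution

When SUnreclaim exceeds 10% of total system memory, a slab memory leak may be present. Slab memory is requested from the buddy system by kernel components (or drivers) through kmalloc-style interfaces, and a leak occurs when the component (or driver) does not release it properly. Once a slab leak has occurred, the memory cannot be reclaimed by killing processes; the only way to get it back is to reboot the instance. A slab leak reduces the memory available to the workload, fragments memory, and can trigger the OOM Killer, causing performance jitter.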
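
The 10% rule of thumb is easy to script. The following is a minimal sketch, not a definitive check: `slab_leak_suspected` is a hypothetical helper name, and the default threshold of 10 simply mirrors the figure above.

```shell
# slab_leak_suspected is a hypothetical helper name, not a standard tool.
# It prints SUnreclaim as a percentage of MemTotal and exits with
#   0 when the percentage is above the threshold (leak suspected),
#   1 when it is at or below the threshold,
#   2 when MemTotal could not be read.
# $1: meminfo-format file (default /proc/meminfo)
# $2: threshold in percent (default 10)
slab_leak_suspected() {
    awk -v limit="${2:-10}" '
        $1 == "MemTotal:"   { total = $2 }
        $1 == "SUnreclaim:" { unreclaim = $2 }
        END {
            if (total <= 0) exit 2
            pct = 100 * unreclaim / total
            printf "SUnreclaim is %.1f%% of MemTotal\n", pct
            exit !(pct > limit)
        }
    ' "${1:-/proc/meminfo}"
}
```

For the host in this case, SUnreclaim is 216076784 kB out of a MemTotal of 790552132 kB, so the function prints `SUnreclaim is 27.3% of MemTotal` and returns 0, well past the 10% mark.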

The Oracle document High Slab SUnreclaim (Doc ID 2913967.1) records a known issue affecting Linux OS, Version Oracle Linux 7.9 and later.

Cause

The issue is reported in the internal Bug 34670124. It is caused by the *ksplice* patches below:
(1) CVE-2021-4197: Privilege escalation in Control Groups.
(2) Allow to preserve anonymous memory through exec syscalls.

Solution
Rebooting the server is a workaround, and the issue is fixed in V4.14.35-2047.516.0 or later.

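For now there is no effective fix (for dentry objects or kmalloc-xxx objects, for example). The recommendation is to monitor Slab memory usage and schedule operating system reboots. A colleague previously used the `sync&slabinfo -s` command at one customer site and was able to release the memory online. Alternatively, once crash, perf, or similar tools have identified the function call path of the leak or the affected kernel data structure, work with kernel developers or experienced operations staff to pin down the exact source and then fix the leak.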
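
For the monitoring part, comparing two saved copies of /proc/slabinfo shows which caches are growing. This is a sketch under the same column-order assumption as the slabinfo 2.1 listing earlier; `slab_growth` and the snapshot file names in the usage note are hypothetical.

```shell
# slab_growth is a hypothetical helper name, not a standard tool.
# It compares two slabinfo snapshots and prints "<MiB grown> <cache name>"
# for every cache whose num_objs * objsize increased, largest first.
# $1: older snapshot, $2: newer snapshot
slab_growth() {
    awk '
        # Skip the header lines in both files.
        /^slabinfo/ || /^#/ { next }
        # First file: remember the size of each cache in bytes.
        NR == FNR { before[$1] = $3 * $4; next }
        # Second file: report caches that grew since the first snapshot.
        {
            delta = ($3 * $4 - before[$1]) / 1048576
            if (delta > 0) printf "%.1f %s\n", delta, $1
        }
    ' "$1" "$2" | sort -k1,1nr
}
```

A possible way to use it, as root: save `cat /proc/slabinfo > /tmp/slab.1`, wait an hour, save `/tmp/slab.2`, then run `slab_growth /tmp/slab.1 /tmp/slab.2`. A cache such as kmalloc-8192 that grows in every interval and never shrinks is the pattern that justifies planning a reboot window.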
