Overview
A large Slab cache can slow a server down, especially under heavy I/O. But "Slab is huge" has two very different root causes, and the correct fix depends on which one you have, so always triage first.
The Slab total is the sum of two parts in /proc/meminfo:
# grep -E 'Slab|SReclaimable|SUnreclaim' /proc/meminfo
Slab:           120366508 kB
SReclaimable:   117102628 kB   # dentry/inode cache, reclaimable
SUnreclaim:       3263880 kB   # kernel objects, NOT reclaimable
- SReclaimable dominates → benign filesystem caching (dentry/inode). The kernel frees it automatically under memory pressure; tune only if it is genuinely hurting I/O. → Path A.
- SUnreclaim dominates / keeps climbing → a kernel-memory (kmem) accounting problem. vfs_cache tuning will not help. → Path B.
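The triage rule above can be sketched as a small shell function. This is only a sketch: the function name slab_triage is hypothetical, and it assumes that "dominates" simply means "is the larger of the two values". It cannot tell whether SUnreclaim keeps climbing, which needs repeated samples over time.

```shell
slab_triage() {
    # Reads /proc/meminfo-style text on stdin and prints the suggested path.
    # Assumption: whichever of the two values is larger "dominates".
    awk '
        /^SReclaimable:/ { rec = $2 }
        /^SUnreclaim:/   { unrec = $2 }
        END {
            if (rec == "" || unrec == "") { print "unknown"; exit 1 }
            if (rec + 0 > unrec + 0) print "Path A: SReclaimable dominates"
            else                     print "Path B: SUnreclaim dominates"
        }'
}

# Classify the running system when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    slab_triage < /proc/meminfo || true
fi
```

With the sample output shown earlier (SReclaimable 117102628 kB, SUnreclaim 3263880 kB) this reports Path A.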
To make any sysctl change persistent across reboots, add it to /etc/sysctl.conf (or a file under /etc/sysctl.d/).
Path A - High SReclaimable (dentry/inode cache)
Typical trigger: a large, nearly-full partition (e.g. /home, 1 TB+) plus software that walks every file - backups, malware scans, find, indexing. Each stat() pulls dentry/inode entries into the cache.
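To confirm that the dentry and inode caches are what fill SReclaimable, one option is to rank the slab caches by approximate size. The sketch below assumes the /proc/slabinfo version 2.x column layout (name, active_objs, num_objs, objsize, ...) and estimates each cache as num_objs × objsize. The function name top_slab_caches is hypothetical. Reading /proc/slabinfo normally requires root.

```shell
top_slab_caches() {
    # Reads slabinfo-format text on stdin; prints "<KiB> <cache name>", largest first.
    # $1 = number of rows to print (default 10).
    awk '$1 != "slabinfo" && $1 !~ /^#/ && NF >= 4 {
             printf "%d %s\n", ($3 * $4) / 1024, $1
         }' | sort -rn | head -n "${1:-10}"
}

# Typical use as root. Entries such as dentry or an inode cache
# (for example ext4_inode_cache or xfs_inode) at the top point to Path A.
if [ -r /proc/slabinfo ]; then
    top_slab_caches 10 < /proc/slabinfo
fi
```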
1. Temporary flush (immediate relief, safe):
# sync; sysctl -w vm.drop_caches=3
This drops pagecache + dentries + inodes once. It does not change behavior going forward - it is a one-shot.
2. Make the kernel reclaim dentry/inode cache more aggressively (persistent, all platforms):
# sysctl -w vm.vfs_cache_pressure=200 # default 100; start at 200
Higher vfs_cache_pressure makes the kernel shrink the dentry/inode cache more eagerly. Start at 200; raise further (up to 500 or 1000) only if 200 is not enough. This is upstream Linux behavior and works on every kernel/distro (CL6-CL10, Ubuntu, stock EL).
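To keep the setting across reboots, as noted in the overview, the value can be written to a drop-in file and loaded from there. A minimal sketch: the file name 90-slab-tuning.conf and the function name write_vfs_pressure_conf are hypothetical; any .conf file under /etc/sysctl.d/ should work the same way.

```shell
write_vfs_pressure_conf() {
    # $1 = value for vm.vfs_cache_pressure
    # $2 = target file (hypothetical default name under /etc/sysctl.d/)
    printf 'vm.vfs_cache_pressure = %s\n' "$1" > "${2:-/etc/sysctl.d/90-slab-tuning.conf}"
}

# As root:
#   sysctl -n vm.vfs_cache_pressure                 # show the current value
#   write_vfs_pressure_conf 200                     # persist the new value
#   sysctl -p /etc/sysctl.d/90-slab-tuning.conf     # load it now, without a reboot
```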
3. Legacy only - vm.vfs_cache_min_ratio (CloudLinux 6 / 6 Hybrid / 7 only):
This CloudLinux-kernel-specific tunable exists only on CL6, CL6 Hybrid, and CL7. It is absent on CL7 Hybrid, CL8, CL9, CL10, Ubuntu, and any non-CloudLinux kernel, where you will see:
sysctl: cannot stat /proc/sys/vm/vfs_cache_min_ratio: No such file or directory
CL6 and CL7 are end-of-life, so for essentially all current servers, skip this and use vfs_cache_pressure instead.
Only on a supported CL6/CL7 kernel, version 3.10.0-614.10.2.lve1.4.46 or higher, may you set:
# sysctl -w vm.vfs_cache_min_ratio=0
It controls the minimum percentage of dentry/inode cache the kernel will not reclaim (default 2, which can be large when many cgroups exist). Do not set this on kernels older than lve1.4.46 - it will crash the server.
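Because setting this on an older kernel is dangerous, a guard before touching the tunable may be worthwhile. The sketch below is an assumption-heavy illustration: the function name lve_at_least is hypothetical, it compares only the lveX.Y.Z component of the kernel release string (based on the "older than lve1.4.46" warning above), and it relies on the -V option of GNU sort. Verify the result against the kernel version by hand before acting on it.

```shell
lve_at_least() {
    # $1 = kernel release string, $2 = minimum lve version (for example 1.4.46)
    # Returns 0 only if the string has an lve component that is >= $2.
    have=$(printf '%s\n' "$1" | sed -n 's/.*lve\([0-9][0-9.]*[0-9]\).*/\1/p')
    [ -n "$have" ] || return 1
    lowest=$(printf '%s\n%s\n' "$have" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Report only; this does not change any setting.
kernel=$(uname -r)
if [ -e /proc/sys/vm/vfs_cache_min_ratio ] && lve_at_least "$kernel" 1.4.46; then
    echo "$kernel: tunable present and lve >= 1.4.46"
else
    echo "$kernel: do not set vm.vfs_cache_min_ratio"
fi
```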
Path B - High / ever-growing SUnreclaim (kernel memory accounting)
If the unreclaimable part keeps growing (often with high sys CPU in kworker threads and periodic slowdowns/OOMs), this is not a cache-tuning problem, and vfs_cache_pressure / drop_caches will not help.
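Whether SUnreclaim "keeps growing" is easier to judge from a series of samples than from a single reading. A minimal sketch, with hypothetical function names and an arbitrary sampling interval, that prints a Unix timestamp and the SUnreclaim value in kB on each line:

```shell
sunreclaim_kb() {
    # Print SUnreclaim in kB from a meminfo-style file (default /proc/meminfo).
    awk '/^SUnreclaim:/ { print $2 }' "${1:-/proc/meminfo}"
}

log_sunreclaim() {
    # $1 = number of samples, $2 = seconds between samples
    i=0
    while [ "$i" -lt "$1" ]; do
        printf '%s %s\n' "$(date +%s)" "$(sunreclaim_kb)"
        i=$((i + 1))
        if [ "$i" -lt "$1" ]; then
            sleep "$2"
        fi
    done
}

# Example: six samples, ten minutes apart, appended to a log file.
#   log_sunreclaim 6 600 >> /root/sunreclaim.log
```

A value that rises steadily across samples and never falls back, even after drop_caches, fits Path B.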
CloudLinux 7, lve1.5.7x kernels: this was a kernel-memory cgroup-accounting issue (ref CLKRN-1077, CLKRN-1150). Fix:
- Update the kernel to a build containing the fix (CL7: 3.10.0-962.3.2.lve1.5.80.el7 or later) and reboot.
- If still needed, disable cgroup kmem accounting via the CloudLinux tuned profile (boot parameter cgroup.memory=nokmem):
# yum update tuned-profiles-cloudlinux
# tuned-adm profile cloudlinux-default
# reboot                 # required for the cgroup.memory=nokmem cmdline to take effect
# tuned-adm active       # confirm: cloudlinux-default
# cat /proc/cmdline      # confirm: ... cgroup.memory=nokmem
CloudLinux 8 / 9 / 10 (and stock EL / Ubuntu): the vfs_cache_min_ratio and cgroup.memory=nokmem tunables above do not apply. A persistently growing SUnreclaim here is a distinct kernel/livepatch investigation (e.g. a leaking slab cache from a specific driver or live patch), not a tuning task. Identify the dominant cache and escalate:
# slabtop -o | head -20 # find the slab cache that is growing (e.g. kmalloc-32, names_cache)
Collect slabtop -o, /proc/meminfo, kernel version, and (if the box has crashed) a vmcore, and escalate to the kernel team rather than tuning vfs_cache.
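The collection step can be sketched as a small helper that gathers the text-based items into one directory. The function name collect_slab_diag and the default output path are hypothetical. A vmcore is not handled here and has to be attached separately.

```shell
collect_slab_diag() {
    # $1 = output directory (hypothetical default under /root)
    dir="${1:-/root/slab-diag-$(date +%Y%m%d-%H%M%S)}"
    mkdir -p "$dir" || return 1
    uname -r > "$dir/kernel.txt"
    # The next two commands are best-effort: they may fail without root
    # or outside Linux, and the helper carries on regardless.
    cat /proc/meminfo > "$dir/meminfo.txt" 2>/dev/null || true
    slabtop -o > "$dir/slabtop.txt" 2>&1 || true
    echo "$dir"
}

# As root:
#   collect_slab_diag            # prints the directory to attach to the escalation
```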
Quick reference
| Symptom | Path | Action |
|---|---|---|
| SReclaimable high, frees under pressure | A | vfs_cache_pressure ↑ (start 200, all OS); drop_caches for instant relief |
| SReclaimable high, CL6/CL7 only | A | optionally vfs_cache_min_ratio=0 (legacy, kernel ≥ lve1.4.46) |
| SUnreclaim high / always growing, CL7 lve1.5.7x | B | kernel update ≥ lve1.5.80 + cgroup.memory=nokmem |
| SUnreclaim high / always growing, CL8/9/10 | B | slabtop -o to identify the cache; escalate to kernel team |