If the load average is high but no process is consuming significant CPU, and kondemand threads show up in the top output, CPU frequency scaling is most probably the cause of the issue.
Example 'top' output:
top - 22:22:07 up 15 min,  3 users,  load average: 5.30, 4.03, 2.60
Tasks: 1022 total,   1 running, 1021 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  32789164k total,  1858284k used, 30930880k free,    66368k buffers
Swap:  4194296k total,        0k used,  4194296k free,   798340k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4979 root      20   0  153m 7364 1780 S  2.6  0.0   0:15.20 lvestats-server
 4024 root      20   0     0    0    0 S  2.3  0.0   0:18.44 kondemand/23
 4025 root      20   0     0    0    0 S  2.3  0.0   0:19.45 kondemand/24
 4023 root      20   0     0    0    0 S  2.0  0.0   0:14.85 kondemand/22
 4026 root      20   0     0    0    0 S  2.0  0.0   0:17.60 kondemand/25
 4027 root      20   0     0    0    0 S  2.0  0.0   0:16.24 kondemand/26
 4022 root      20   0     0    0    0 S  1.6  0.0   0:12.68 kondemand/21
 4028 root      20   0     0    0    0 S  1.6  0.0   0:13.79 kondemand/27
Check which scaling governor is in use:
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
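The command above inspects only cpu0. As a rough sketch, the same check can be run across all cores and summarized per governor. The function name count_governors and its optional directory argument are illustrative and not part of any standard tool; the default path is the usual sysfs location.

```shell
# Print how many cores use each scaling governor.
# $1 (optional, hypothetical): sysfs CPU directory, handy for testing.
count_governors() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip cores without cpufreq support (the glob may not match at all).
        [ -f "$f" ] || continue
        cat "$f"
    done | sort | uniq -c
}
```

Calling count_governors with no arguments prints one line per governor name, preceded by the number of cores using it.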
If you see ondemand there, it means that scaling is enabled (kondemand is the kernel thread that belongs to the ondemand governor). This lets the operating system scale the CPU frequency up or down in order to save power. CPU frequencies are scaled automatically depending on the system load, in response to ACPI events, or manually by userspace programs. To confirm that the CPU has actually been scaled down, compare the current CPU frequency with the rated hardware frequency:
grep -E '^model name|^cpu MHz' /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU X7550 @ 2.00GHz
cpu MHz    : 1067.000
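In the output above the core runs at 1067 MHz although the CPU is rated at 2.00 GHz. The same comparison can be sketched per core from sysfs, assuming the kernel exposes scaling_cur_freq and cpuinfo_max_freq (both in kHz) under each core's cpufreq directory. The function name report_scaled_cores and its optional directory argument are illustrative only.

```shell
# List cores whose current frequency is below the hardware maximum.
# $1 (optional, hypothetical): sysfs CPU directory, handy for testing.
report_scaled_cores() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip cores that do not expose both frequency files.
        [ -r "$dir/scaling_cur_freq" ] && [ -r "$dir/cpuinfo_max_freq" ] || continue
        cur=$(cat "$dir/scaling_cur_freq")   # current frequency, kHz
        max=$(cat "$dir/cpuinfo_max_freq")   # hardware maximum, kHz
        if [ "$cur" -lt "$max" ]; then
            core=${dir%/cpufreq}
            echo "${core##*/}: $((cur / 1000)) MHz of $((max / 1000)) MHz"
        fi
    done
}
```

On CPUs with turbo modes the reported maximum may be the turbo frequency, so a core listed here is a hint that scaling is active, not proof of a problem.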
The resolution is simple: switch off scaling by selecting the performance governor. This has to be done for every CPU core; here is a one-liner that covers all of them:
for CPUFREQ in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do [ -f "$CPUFREQ" ] || continue; echo -n performance > "$CPUFREQ"; done
This takes effect immediately on an already running server. However, to resolve the issue permanently you have to remove the cpuspeed RPM package.
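To confirm that the change took effect, and that it is still in place after a reboot, a small check along these lines may help. It is only a sketch: the function name check_governor and its arguments are illustrative, and the default path is the usual sysfs location.

```shell
# Return 0 if every core with cpufreq support uses the wanted governor;
# otherwise print the offending files with their governor and return 1.
# $1 (optional): wanted governor, default "performance".
# $2 (optional, hypothetical): sysfs CPU directory, handy for testing.
check_governor() {
    want="${1:-performance}"
    root="${2:-/sys/devices/system/cpu}"
    status=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -f "$f" ] || continue
        got=$(cat "$f")
        if [ "$got" != "$want" ]; then
            echo "$f: $got"
            status=1
        fi
    done
    return "$status"
}
```

Running check_governor with no arguments prints nothing and returns 0 when all cores are set to performance. Whether the cpuspeed package is still installed can be checked with rpm -q cpuspeed.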