(Case) Tuning OS performance: kernel.sem causing high %sys CPU
Last year's blog post "How to diagnose high sys CPU on Linux" recorded two common Oracle-related causes of high %sys CPU. This time I ran into a case where the kernel.sem semaphore settings in the OS kernel parameters caused high CPU, so I am recording the symptoms here. The environment is Oracle Exadata X8.
Top
top - 11:21:54 up 110 days, 17:07,  4 users,  load average: 73.86, 82.52, 95.98
Tasks: 16414 total,  74 running, 8744 sleeping,   0 stopped,   1 zombie
%Cpu(s): 44.9 us, 21.1 sy, 0.0 ni, 27.2 id, 0.0 wa, 1.6 hi, 5.2 si, 0.0 st
KiB Mem : 15834657+total, 29335952+free, 10968253+used, 19328089+buff/cache
KiB Swap: 16777212 total, 16777212 free,        0 used. 41978544+avail Mem

   PID USER   PR NI   VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
128240 oracle 20  0 518.0g   2.7g 425360 S 215.9  0.2  23062:03 ora_scmn_anbob
128242 oracle 20  0 533.9g   2.6g 400984 S 206.9  0.2  21456:04 ora_scmn_anbob
225463 oracle 20  0 265.4g   5.4g   1.3g S 125.6  0.4  33581:02 ora_scmn_weejar
225461 oracle 20  0 261.7g   4.8g 880292 S 122.5  0.3  32100:05 ora_scmn_weejar
284542 oracle 20  0 514.8g 219388 144488 R  85.9  0.0 299:55.71 ora_j002_anbob
128349 oracle 20  0 523.9g 192292  88264 S  56.8  0.0   6614:04 ora_lgwr_anbob
128288 oracle 20  0 511.8g 171812  84296 S  53.6  0.0   2366:53 ora_dbw1_anbob
128306 oracle 20  0 511.8g 171248  84056 R  51.9  0.0   2347:51 ora_dbw5_anbob
pidstat -p 128240 1 10
11:24:09 AM UID PID %usr %system %guest %CPU CPU Command
11:24:10 AM 1001 128240 78.00 100.00 0.00 100.00 4 ora_scmn_anbob
11:24:11 AM 1001 128240 73.00 100.00 0.00 100.00 4 ora_scmn_anbob
11:24:12 AM 1001 128240 84.00 100.00 0.00 100.00 67 ora_scmn_anbob
11:24:13 AM 1001 128240 80.00 100.00 0.00 100.00 67 ora_scmn_anbob
11:24:14 AM 1001 128240 78.00 100.00 0.00 100.00 67 ora_scmn_anbob
11:24:15 AM 1001 128240 77.00 100.00 0.00 100.00 4 ora_scmn_anbob
11:24:16 AM 1001 128240 100.00 0.00 0.00 100.00 4 ora_scmn_anbob
11:24:17 AM 1001 128240 100.00 0.00 0.00 100.00 4 ora_scmn_anbob
11:24:18 AM 1001 128240 100.00 0.00 0.00 100.00 4 ora_scmn_anbob
11:24:19 AM 1001 128240 100.00 0.00 0.00 100.00 4 ora_scmn_anbob
top -Hp
top - 16:31:58 up 110 days, 22:17,  3 users,  load average: 61.83, 85.55, 84.47
Threads:  10 total,   2 running,   8 sleeping,   0 stopped,   0 zombie
%Cpu(s): 42.3 us, 14.6 sy, 0.0 ni, 37.4 id, 0.0 wa, 1.4 hi, 4.4 si, 0.0 st
KiB Mem : 15834657+total, 31711164+free, 10711178+used, 19523624+buff/cache
KiB Swap: 16777212 total, 16777212 free,        0 used. 44549539+avail Mem

   PID USER   PR NI   VIRT  RES    SHR S %CPU %MEM     TIME+ COMMAND
128244 oracle -2  0 518.0g 2.8g 425664 R 63.8  0.2   7535:39 ora_lms0_anbob
128252 oracle -2  0 518.0g 2.8g 425664 R 63.1  0.2   7490:48 ora_lms4_anbob
128251 oracle -2  0 518.0g 2.8g 425664 S 58.8  0.2   7564:12 ora_lms2_anbob
128457 oracle 20  0 518.0g 2.8g 425664 S  1.0  0.2 214:00.35 ora_rs00_anbob
128460 oracle 20  0 518.0g 2.8g 425664 S  1.0  0.2 214:51.72 ora_rs02_anbob
128458 oracle 20  0 518.0g 2.8g 425664 S  0.7  0.2 113:58.94 ora_cr00_anbob
128459 oracle 20  0 518.0g 2.8g 425664 S  0.7  0.2 215:21.29 ora_rs04_anbob
128456 oracle 20  0 518.0g 2.8g 425664 S  0.3  0.2 109:47.08 ora_cr00_anbob
128240 oracle 20  0 518.0g 2.8g 425664 S  0.0  0.2   0:18.65 ora_scmn_anbob
128452 oracle 20  0 518.0g 2.8g 425664 S  0.0  0.2 108:23.32 ora_cr00_anbob
strace
# strace -f -c -p 128240
strace: Process 128240 attached with 10 threads
strace: Process 128240 detached
strace: Process 128244 detached
strace: Process 128251 detached
strace: Process 128252 detached
strace: Process 128452 detached
strace: Process 128456 detached
strace: Process 128457 detached
strace: Process 128458 detached
strace: Process 128459 detached
strace: Process 128460 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
82.04 10.681294 1976 5405 3196 semtimedop
8.90 1.158505 14 78221 epoll_wait
6.06 0.789445 3 224378 73934 recvmsg
2.46 0.319900 4 66242 sendmsg
0.35 0.045048 6 6922 semop
0.19 0.025336 2 10534 1 read
0.00 0.000250 27 9 semctl
0.00 0.000150 18 8 rt_sigprocmask
0.00 0.000095 5 18 poll
0.00 0.000066 1 47 getrusage
0.00 0.000000 0 5 sched_yield
------ ----------- ----------- --------- --------- ----------------
100.00 13.020089 391789 77131 total
ipcs
# ipcs -l

------ Messages Limits --------
max queues system wide = 2878
max size of message (bytes) = 8192
default max size of queue (bytes) = 65536

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 1345946919
max total shared memory (kbytes) = 1345946916
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 256
max semaphores per array = 1024
max semaphores system wide = 70000
max ops per semop call = 1024
semaphore max value = 32767

# ipcs -u

------ Messages Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

------ Shared Memory Status --------
segments allocated 16
pages allocated 220840971
pages resident  220840962
pages swapped   0
Swap performance: 0 attempts     0 successes

------ Semaphore Status --------
used arrays = 58
allocated semaphores = 58492
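One detail worth pulling out of the `ipcs -u` output: 58492 allocated semaphores spread over only 58 arrays is about 1008 semaphores per set, right at the "max semaphores per array" limit of 1024 — a few very large sets rather than many small ones. A one-liner sketch to compute that on a live system:

```shell
# Average semaphores per allocated set, parsed from `ipcs -u`.
# Here: 58492 semaphores / 58 arrays ~= 1008 per set, close to SEMMSL.
ipcs -u | awk '/used arrays/ {u=$NF} /allocated semaphores/ {a=$NF}
               END {if (u > 0) print "avg semaphores per set:", int(a/u)}'
```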
Cause
From the ipcs output above, kernel.sem on this system is 1024 70000 1024 256 (max semaphores per array = 1024, system wide = 70000, max ops per semop call = 1024, max arrays = 256), which is said to be the standard Exadata configuration — you may want to check whether your own X8 environment uses the same settings. This host runs two database instances.
kernel.sem holds four values, in order: SEMMSL SEMMNS SEMOPM SEMMNI.
SEMMSL: maximum number of semaphores per id
SEMMNS: maximum number of semaphores, system wide.
SEMOPM: maximum number of operations in one semop call.
SEMMNI: number of semaphore identifiers (or arrays), system wide.
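The four values can be read together from /proc (or `sysctl kernel.sem`) and labeled in that order — a small sketch:

```shell
# Read kernel.sem and label its four fields (order: SEMMSL SEMMNS SEMOPM SEMMNI).
read semmsl semmns semopm semmni < /proc/sys/kernel/sem
echo "SEMMSL=$semmsl SEMMNS=$semmns SEMOPM=$semopm SEMMNI=$semmni"
```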
MOS Doc ID 1670658.1 ("How high can we set our semaphore limits on the Linux platform?") documents the general semaphore limits in the following table:
NAME    | DESCRIPTION                                     | POSSIBLE RANGE OF VALUES
--------+-------------------------------------------------+--------------------------
SEMMSL* | maximum number of semaphores in a semaphore set | 1 – 65536
SEMMNS* | maximum number of semaphores in the system      | 1 – 2147483647
SEMOPM  | maximum number of operations per semop(P) call  | 100
SEMMNI* | maximum number of semaphore sets in system      | 1 – 32768
Recommended values for anbob's Oracle environments
OS parameter                      | Recommended value                                                  | Related DB parameter
----------------------------------+--------------------------------------------------------------------+---------------------
HugePages                         | ENABLE, sized to fit the SGA                                       |
Transparent HugePages             | DISABLE                                                            |
Disk I/O Scheduler                | deadline (# cat /sys/block/${ASM_DISK}/queue/scheduler -> noop [deadline] cfq) |
NUMA                              | DISABLE                                                            |
Shell limits for grid/oracle users| set in /etc/security/limits.d/*.conf                               | PROCESSES
semmsl                            | 250                                                                | PROCESSES
semmns                            | >= 32000 (sum of the "processes" parameter of each instance * 2)   | PROCESSES
semopm                            | 100                                                                |
semmni                            | 128-256 (>= number of instances running simultaneously)            |
Doc ID 2707048.1 records sys CPU growth after increasing SEMOPM; semmsl is also a key value. Its role in Oracle's semaphore allocation is roughly:

1. An instance needs processes + 4 semaphores (the "processes" parameter plus 4 extra per instance).
2. Oracle first tries to place them all in a single semaphore set; if the count is below semmsl, one set suffices.
3. Otherwise, the per-set semaphore count is computed by repeatedly dividing (processes + 4) by 2 until the result is smaller than semmsl.
4. The number of semaphore sets is then (processes + 4) divided by the per-set count from step 3.
5. But each semaphore set is protected by only one lock, so semmsl is best kept at 250: allocating many small sets instead of a few huge ones avoids lock-contention performance problems.
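The steps above can be sketched in shell arithmetic (my reading of the note — the exact Oracle algorithm may differ, and processes=4096 is just an example setting):

```shell
# Estimate how many semaphore sets Oracle would allocate.
processes=4096
semmsl=250
needed=$((processes + 4))        # step 1: semaphores required
per_set=$needed                  # step 2: try one big set first
while [ "$per_set" -ge "$semmsl" ]; do
  per_set=$((per_set / 2))       # step 3: halve until it fits under semmsl
done
sets=$(( (needed + per_set - 1) / per_set ))   # step 4: sets needed (ceiling)
echo "per-set: $per_set, sets: $sets"          # prints: per-set: 128, sets: 33
```

With semmsl = 1024, the same arithmetic yields a handful of ~1000-semaphore sets — matching the `ipcs -u` figures above and concentrating all semop traffic on very few per-set locks.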
Solution:
Adjust kernel.sem to the values recommended in the table above.
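For example (a sketch — the drop-in file name is hypothetical, and SEMMNS should be sized from your instances' processes sum per the table above):

```shell
# Hypothetical sysctl drop-in; field order is SEMMSL SEMMNS SEMOPM SEMMNI.
echo "kernel.sem = 250 32000 100 256" > /etc/sysctl.d/98-oracle-sem.conf
sysctl -p /etc/sysctl.d/98-oracle-sem.conf   # apply now; persists across reboots
```

Note that the new limits only affect semaphore sets created afterwards, so restart the database instances to let Oracle re-allocate its semaphore sets.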