Troubleshooting high “gc current grant 2-way” and “gc cr grant 2-way” (Case 2)
A long while back I ran into a similar case: a RAC environment with very high gc cr/current grant 2-way wait events, written up in 《Performance tuning ‘gc cr&current grant 2-way’ event (after the hosts were given more CPUs)》. That case was caused by the two instances of the same cluster running on hosts with different CPU counts, which produced different numbers of LMS processes and an imbalance in serving capacity between the nodes. The case shared here is a different, rather special situation.
The GC-class waits stayed persistently high on both nodes:
NODE1
Top 5 Timed Foreground Events
Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
---|---|---|---|---|---|
enq: TX – index contention | 27,057 | 32,531 | 1202 | 25.09 | Concurrency |
DB CPU | | 26,320 | | 20.30 | |
enq: TX – row lock contention | 165,277 | 9,237 | 56 | 7.12 | Application |
gc current grant 2-way | 61,769 | 9,208 | 149 | 7.10 | Cluster |
log file sync | 1,937,689 | 6,747 | 3 | 5.20 | Commit |
NODE2
Top 5 Timed Foreground Events
Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
---|---|---|---|---|---|
gc buffer busy acquire | 84,112 | 31,365 | 373 | 35.95 | Cluster |
gc current grant 2-way | 24,906 | 11,217 | 450 | 12.86 | Cluster |
gc current block 2-way | 71,984 | 8,190 | 114 | 9.39 | Cluster |
gc cr grant 2-way | 34,748 | 6,061 | 174 | 6.95 | Cluster |
enq: TX – index contention | 3,067 | 4,735 | 1544 | 5.43 | Concurrency |
Looking only at node 1, you would conclude this is concurrency contention (enq: TX - index contention / row lock contention), with some gc cr/current grant 2-way on top. That grant event means the block is not in any instance's cache and the master grants permission to read it from disk: the grant is a message, the block is the data.
There are a few golden rules when diagnosing RAC problems:
1. Don't be fooled by the top events
In a RAC environment the root cause may be hiding on another node and never show up in that node's top 5 events; the waits it induces elsewhere can be far larger than the root-cause wait itself. Instance performance is intertwined, and a misbehaving background process on one node (LMS, for example) can surface as Application-class wait events on every other node.
gc buffer busy xx waits are usually only the leading symptom of the real RAC problem. Think of them simply as waiting for the BL lock on a block: all of the time that every other process spends waiting for that block's BL lock to become available is charged to gc buffer busy xx.
ASH and block dump trace files can pinpoint gc buffer busy waits; the goal is to identify the object, the object type and the block type (a sketch ASH query follows the table below).
Object type | Block type | Possible issue(s) |
---|---|---|
Table | Segment header | Freelists, freelist groups, deletes from one node,ASSM bugs etc |
Table | Segment blocks | Heavy insert workload, heavy update to few blocks, SQL performance issues scanning few objects aggressively. |
Index | Leaf block | Numerous inserts on indexed columns, sequence or sysdate generated keys |
Index | Root / branch block | Insert + numerous index scans, concurrent inserts and deletes etc |
Undo | Undo header block | Numerous short transactions |
Undo | Undo block | Long pending transaction and CR block generation |
Dictionary | SEQ$ (object) | Sequences with nocache or order set and excessive access to sequence. |
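To make the table actionable, here is a minimal ASH sketch (my addition, not taken from the original case) that maps gc buffer busy waits back to the object and block; current_obj#, current_file# and current_block# are standard ASH columns, and dba_objects supplies the object name and type:

```sql
-- Hedged sketch: identify the object, object type and block behind gc buffer busy waits.
-- Some scripts join on data_object_id instead of object_id; for most segments they match.
select o.owner,
       o.object_name,
       o.object_type,
       a.current_file#  file#,
       a.current_block# block#,
       count(*)         samples
  from gv$active_session_history a,
       dba_objects o
 where a.event like 'gc buffer busy%'
   and o.object_id = a.current_obj#
 group by o.owner, o.object_name, o.object_type, a.current_file#, a.current_block#
 order by samples desc;
```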
2. Rule out infrastructure problems
gc cr grant 2-way: if the block is not in any instance's buffer cache, the wait time is charged to this event; the foreground process sends a request to the remote LMS and the remote LMS replies 'read from disk'. High gc 2-way/3-way waits usually mean congestion caused by heavy concurrency. The minimum time to ship one block plus its message is normally under 2 ms (under 0.5 ms on Exadata). Check the Wait Event Histogram section of AWR to see how the 2-way/3-way events are distributed across the time buckets, and rule out network problems.
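As part of the infrastructure sanity check it also helps to confirm which interconnect each instance is actually using (a sketch, assuming 11gR2 where gv$cluster_interconnects is available):

```sql
-- Verify the instances are on the intended private interconnect, not the public network.
select inst_id, name, ip_address, is_public, source
  from gv$cluster_interconnects
 order by inst_id;
```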
3. Identify the problem instance
SELECT instance || '->' || inst_id transfer,
       class,
       cr_block cr_blk,
       trunc(cr_block_time / cr_block / 1000, 2) avg_cr,
       current_block cur_blk,
       trunc(current_block_time / current_block / 1000, 2) avg_cur
  FROM gv$instance_cache_transfer
 WHERE cr_block > 0
   AND current_block > 0
 ORDER BY instance, inst_id, class
/
4. Check the sending side's metrics, not just the receiver's
From the sender's perspective:
gc cr block receive time=
Time to send message to a remote LMS process by FG
+ Time taken by LMS to build block (statistics: gc cr block build time)
+ LMS wait for LGWR latency ( statistics: gc cr block flush time)
+ LMS send time (Statistics: gc cr block send time)
+ Wire latency.
- CR block request time = build time + flush time + send time
CR block request time is the time it takes the instance that owns the appropriate image to build the CR block, plus the time to flush it (the related redo has to be written to disk by LGWR first), plus the time to send it across the interconnect.
- Current block request time = pin time + flush time + send time
Current block request time is how long it takes to pin the image on the instance that holds it, flush it and send it across, because you cannot ship a block while someone is still changing it. That is why the block must be pinned in exclusive mode, flushed, and only then sent over the interconnect.
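A rough gv$sysstat sketch (my addition; these counters are cumulative since instance startup and the gc time statistics are typically reported in centiseconds, hence the *10 to get milliseconds; cross-check the result against your AWR numbers) to break down the sender-side CR service time per instance:

```sql
-- Average build / flush / send time per CR block served, in ms, for each instance.
select inst_id,
       round(10 * sum(decode(name, 'gc cr block build time', value)) /
             nullif(sum(decode(name, 'gc cr blocks served', value)), 0), 2) avg_build_ms,
       round(10 * sum(decode(name, 'gc cr block flush time', value)) /
             nullif(sum(decode(name, 'gc cr blocks served', value)), 0), 2) avg_flush_ms,
       round(10 * sum(decode(name, 'gc cr block send time', value)) /
             nullif(sum(decode(name, 'gc cr blocks served', value)), 0), 2) avg_send_ms
  from gv$sysstat
 where name in ('gc cr blocks served', 'gc cr block build time',
                'gc cr block flush time', 'gc cr block send time')
 group by inst_id
 order by inst_id;
```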
5. Use histograms to spot anomalies, not just averages
Look at the Wait Event Histogram section of the AWR report.
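The same distribution can be pulled online from gv$event_histogram (a sketch; a healthy 2-way event should sit almost entirely in the 1 ms and 2 ms buckets):

```sql
-- Wait-time distribution of the gc 2-way events on every instance.
select inst_id, event, wait_time_milli, wait_count
  from gv$event_histogram
 where event in ('gc cr grant 2-way', 'gc current grant 2-way',
                 'gc cr block 2-way', 'gc current block 2-way')
 order by inst_id, event, wait_time_milli;
```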
Case analysis
Now back to the case introduced at the beginning, where AAS showed the GC wait problem. This is an 11.2.0.3 two-node RAC database on HP-UX 11.31.
# instance 1
Global Cache Load Profile
Statistic | Per Second | Per Transaction |
---|---|---|
Global Cache blocks received: | 1,018.74 | 1.08 |
Global Cache blocks served: | 78.53 | 0.08 |
GCS/GES messages received: | 904.33 | 0.96 |
GCS/GES messages sent: | 2,532.56 | 2.68 |
DBWR Fusion writes: | 3.07 | 0.00 |
Estd Interconnect traffic (KB) | 18,227.67 |
Global Cache Efficiency Percentages (Target local+remote 100%)
Buffer access – local cache %: | 99.48 |
Buffer access – remote cache %: | 0.39 |
Buffer access – disk %: | 0.14 |
Global Cache and Enqueue Services – Workload Characteristics
Avg global enqueue get time (ms): | 2.9 |
Avg global cache cr block receive time (ms): | 6.8 |
Avg global cache current block receive time (ms): | 3.8 |
Avg global cache cr block build time (ms): | 0.0 |
Avg global cache cr block send time (ms): | 0.0 |
Global cache log flushes for cr blocks served %: | 0.5 |
Avg global cache cr block flush time (ms): | 29.3 |
Avg global cache current block pin time (ms): | 1.4 |
Avg global cache current block send time (ms): | 0.0 |
Global cache log flushes for current blocks served %: | 2.1 |
Avg global cache current block flush time (ms): | 30.8 |
Global Cache and Enqueue Services – Messaging Statistics
Avg message sent queue time (ms): | 250.4 (abnormally high: the problem instance) |
Avg message sent queue time on ksxp (ms): | 6.8 |
Avg message received queue time (ms): | 3.0 |
Avg GCS message process time (ms): | 1.6 |
Avg GES message process time (ms): | 0.0 |
% of direct sent messages: | 43.52 |
% of indirect sent messages: | 45.21 |
% of flow controlled messages: | 11.27 |
TIPS:
The Average message sent queue time in AWR comes from (G)V$GES_STATISTICS. It measures the time taken to put an indirect message on the message queue and send it (the GCS/GES communication layer). We also need to look at the KSXP message timing to determine whether there is a problem in the IPC layer; IPC time is usually under 1 ms, and you can query X$KSXPCLIENT for more detail. A very high value can also simply mean the CPUs were completely busy (100%) during the interval or the interconnect hit its peak, which needs to be confirmed against OSWatcher data.
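A minimal sketch of those checks (assuming SYSDBA access for the X$ view; exact GES statistic names vary slightly between versions, so the LIKE filter is deliberately loose):

```sql
-- GES message-queue related statistics per instance.
select inst_id, name, value
  from gv$ges_statistics
 where name like '%queue%'
 order by inst_id, name;

-- Per-client IPC (KSXP) traffic; a stalled cache client here usually points
-- at the interconnect or an overloaded LMS.
select * from x$ksxpclient;
```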
# instance 2
Global Cache Load Profile
Statistic | Per Second | Per Transaction |
---|---|---|
Global Cache blocks received: | 96.24 | 0.75 |
Global Cache blocks served: | 1,043.19 | 8.17 |
GCS/GES messages received: | 3,607.20 | 28.26 |
GCS/GES messages sent: | 1,342.88 | 10.52 |
DBWR Fusion writes: | 3.09 | 0.02 |
Estd Interconnect traffic (KB) | 19,197.66 |
Global Cache Efficiency Percentages (Target local+remote 100%)
Buffer access – local cache %: | 99.43 |
Buffer access – remote cache %: | 0.20 |
Buffer access – disk %: | 0.37 |
Global Cache and Enqueue Services – Workload Characteristics
Avg global enqueue get time (ms): | 3.5 |
Avg global cache cr block receive time (ms): | 509.7 |
Avg global cache current block receive time (ms): | 96.6 |
Avg global cache cr block build time (ms): | 0.0 |
Avg global cache cr block send time (ms): | 0.0 |
Global cache log flushes for cr blocks served %: | 1.0 |
Avg global cache cr block flush time (ms): | 1.3 |
Avg global cache current block pin time (ms): | 0.2 |
Avg global cache current block send time (ms): | 0.0 |
Global cache log flushes for current blocks served %: | 0.1 |
Avg global cache current block flush time (ms): | 1.7 |
Global Cache and Enqueue Services – Messaging Statistics
Avg message sent queue time (ms): | 0.0 |
Avg message sent queue time on ksxp (ms): | 142.6 |
Avg message received queue time (ms): | 0.0 |
Avg GCS message process time (ms): | 0.0 |
Avg GES message process time (ms): | 0.0 |
% of direct sent messages: | 73.94 |
% of indirect sent messages: | 18.43 |
% of flow controlled messages: | 7.63 |
Tips:
Avg global cache cr block receive time (ms): the time from when the requesting instance sends its message to the mastering instance until the block arrives, including the time spent on the holding instance building the consistent-read image. CR block receive time should normally not exceed 15 ms.
Avg global cache current block receive time (ms): the time from when the requesting instance sends its message to the mastering instance (2-way get), and in some cases on to the holding instance (3-way get), until the block arrives; it includes the holding instance's log flush time. Current block receive time should normally not exceed 30 ms.
Avg message sent queue time on ksxp (ms): the time for the remote side's IPC layer to acknowledge the message and return an ACK. It is normally a small millisecond value that directly reflects network latency, usually under 1 ms. The send time is measured when a block is handed to the IPC layer to be sent: requests are queued in a completion queue, and the send time runs from the hand-off to the IPC layer until the system call returns a completion, i.e. a status indicating that the send request has been successfully passed on to the network and control returns to the cache layer.
If this value is large, the cluster interconnect or the LMS processes are likely in trouble.
To run IPC diagnostics:
oradebug setmypid
oradebug ipc
1) If the network configuration itself checks out, verify that the TCP and UDP buffers are not undersized:
root# netstat -s | grep overflow    (normally these overflow counters stay at 0 and do not increase)
root# netstat -i
2) Check the load of the LMS processes: ps aux | grep lms
Also check whether certain segments are generating excessive global CR requests:
select STATISTIC_NAME stat, OWNER, OBJECT_NAME obj, sum(value) val
  from v$segment_statistics
 where STATISTIC_NAME like 'global%'
   and value > 10000
 group by STATISTIC_NAME, OWNER, OBJECT_NAME
 order by val desc;
3) And check whether the CR request counts in v$cr_block_server stay high over the interval:
select CR_REQUESTS cr, CURRENT_REQUESTS cur, DATA_REQUESTS data, UNDO_REQUESTS undo, TX_REQUESTS tx from v$cr_block_server;
Confirming the problem node
WAIT time | NODE1 | NODE2 |
---|---|---|
Avg global cache cr block receive time (ms) | 6.8 | 509.7 |
Avg global cache current block receive time (ms) | 3.8 | 96.6 |
Note:
In a RAC environment a slow node makes the other nodes report GC performance problems. By that logic node 2, which shows the huge receive times, is only the victim; node 1 is where the problem actually lies.
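The same comparison can be reproduced online from gv$sysstat (a sketch; same centisecond-to-millisecond conversion caveat as earlier):

```sql
-- Average CR / current block receive time per instance, in ms.
select inst_id,
       round(10 * sum(decode(name, 'gc cr block receive time', value)) /
             nullif(sum(decode(name, 'gc cr blocks received', value)), 0), 2) avg_cr_receive_ms,
       round(10 * sum(decode(name, 'gc current block receive time', value)) /
             nullif(sum(decode(name, 'gc current blocks received', value)), 0), 2) avg_cur_receive_ms
  from gv$sysstat
 where name in ('gc cr blocks received', 'gc cr block receive time',
                'gc current blocks received', 'gc current block receive time')
 group by inst_id
 order by inst_id;
```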
# Check the LMS processes' CPU usage at the time
# node1
oracle@qdty1:/home/oracle:11G> ps -efl|head -n 1;ps -efl|grep lms
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME COMD
1401 S oracle 5117 1 237 178 20 e000001c934ba340 94816 e000000792a7f050 Jul 13 ? 15776:29 ora_lms1_tyjc1
1401 S grid 23810 1 0 178 20 e000001c9de0e040 90895 e000000792a7f1d0 Jan 25 ? 1550:30 asm_lms0_+ASM1
1401 R oracle 5119 1 254 178 20 e000001ce7c1b9c0 94816 - Jul 13 ? 16214:17 ora_lms2_tyjc1
401 S oracle 9366 7730 0 154 20 e000001cdb304c40 30 e000000da0da6472 10:50:38 pts/4 0:00 grep lms
1401 R oracle 5115 1 252 178 20 e000001cf7042a00 94752 - Jul 13 ? 16329:36 ora_lms0_tyjc1

# node 2
oracle@qdty2:/home/oracle:11G> ps -efl|head -n 1;ps -efl|grep lms|grep -v grep
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME COMD
1401 S grid 12516 1 0 178 20 e000001b43146400 90854 e000000792d71348 Jan 25 ? 1461:04 asm_lms0_+ASM2
1401 S oracle 6446 1 24 178 20 e000001ef09b0d00 95666 e000000792d710d0 Jul 13 ? 16447:18 ora_lms2_tyjc2
1401 S oracle 6444 1 30 178 20 e000001ebe6c7d00 95666 e000000792d71248 Jul 13 ? 17583:22 ora_lms1_tyjc2
1401 S oracle 6442 1 43 178 20 e000001e7bc97c80 95666 e000000792d71240 Jul 13 ? 17764:57 ora_lms0_tyjc2
Note:
This is HP-UX 11.31, so the C (CPU utilization) column that ps reports for node 1 can show values above 100, but it still tells us the LMS processes there are extremely busy on CPU. Node 2 looks normal.
# node1
Instance Activity Stats
Statistic | Total | per Second | per Trans |
---|---|---|---|
data blocks consistent reads – undo records applied | 3.7 TB | 1.1 GB | 1.1 MB |
rollback changes – undo records applied | 4,074,774 | 1,124.49 | 1.19 |
transaction tables consistent reads – undo records applied | 100.1 MB | 28.3 KB | 0 B |
# node 2
Instance Activity Stats
Statistic | Total | per Second | per Trans |
---|---|---|---|
data blocks consistent reads – undo records applied | 4.6 GB | 2.7 MB | 21.6 KB |
rollback changes – undo records applied | 1,043 | 0.59 | 0.00 |
transaction tables consistent reads – undo records applied | 112.0 KB | 0 B | 0 B |
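The skew is easy to confirm directly from gv$sysstat as well (a sketch; the values are cumulative counters since instance startup, whereas the AWR figures above are per snapshot interval):

```sql
-- Compare undo-application work across the two instances.
select inst_id, name, value
  from gv$sysstat
 where name in ('data blocks consistent reads - undo records applied',
                'rollback changes - undo records applied',
                'transaction tables consistent reads - undo records applied')
 order by name, inst_id;
```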
If it is not an application SQL problem and not a network problem, why are the LMS processes burning so much CPU? A quick search on MOS makes it clear that we have hit yet another bug: Bug 14672057.
The cause: starting with 11.2.0.2, the parameter _max_cr_rollbacks was introduced to address the earlier bug 9569316. It defaults to 0 and controls the maximum number of CR rollback blocks; there is not much published documentation on it. When a lot of time is being spent generating undo-based CR blocks, setting it to a non-zero value enables the light work rule, which helps take pressure off the LMS processes. The bug affects only 11.2.0.2 and 11.2.0.3; because the workaround is simply a dynamic change of "_max_cr_rollbacks", no one-off patch was issued, and the fix is included in 11.2.0.4. After the change below, the GC waits and the enq waits disappeared almost instantly and the system returned to normal, so the real problem was never the index-split contention suggested by node 1's top events.
SQL> alter system set "_max_cr_rollbacks"=200;
System altered.
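To verify that the light work rule actually kicks in after the change, the LIGHT_WORKS counter in v$cr_block_server can be watched (a sketch; run it on each instance and compare snapshots over time):

```sql
-- light_works counts how many times the light-work rule was invoked, i.e. LMS
-- declined to do the full CR fabrication and handed the work back to the requester.
select cr_requests, light_works, flushes, data_requests, undo_requests, tx_requests
  from v$cr_block_server;
```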
# LMS processes on node1 after the change
oracle@qdty1:/home/oracle:11G> ps -efl|head -n 1;ps -efl|grep lms|grep -v grep
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME COMD
1401 S oracle 5117 1 55 178 20 e000001c934ba340 94816 e000000792a7f0d0 Jul 13 ? 15778:04 ora_lms1_tyjc1
1401 S grid 23810 1 0 178 20 e000001c9de0e040 90895 e000000792a7f350 Jan 25 ? 1550:30 asm_lms0_+ASM1
1401 S oracle 5119 1 76 178 20 e000001ce7c1b9c0 94816 e000000792a7f1c8 Jul 13 ? 16216:40 ora_lms2_tyjc1
1401 S oracle 5115 1 80 178 20 e000001cf7042a00 94752 e000000792a7f3c0 Jul 13 ? 16331:29 ora_lms0_tyjc1