Troubleshooting High “enq: TX – index contention” events during index split?
enq: TX – index contention
经常发生在应用程序的并发性很高的表相关联的索引上,索引叶块、枝块争用较高,尤其当应用程序执行许多INSERT和DELETE时,通常会发生这种情况。原因是在向索引插入新行时索引块分裂,事务将必须在模式4下等待TX锁定,直到执行块拆分的会话完成操作为止, 常伴随[gc] buffer busy waits
常见原因
1. 从应用程序大量访问的表上的索引。
2. 表列上的索引的值通过单调递增插入。
3. 大量删除
分析方法
1, 从v$session 或 ASH, DASH中找SQL及对象
2, V$SEGMENT_STATISTICS 或 ‘Segments by Row Lock Waits’ ‘Segments by Buffer Busy Waits’ AWR 报告查找对象
select * from v$segment_statistics where statistic_name ='row lock waits'
3, v$sysstat 或 AWR中的index split
解决方法
1. Reverse Key Indexes
可参考 反转索引主键 语法
2. Hash partitioned global indexes
3. Increase CACHE size of the sequences, and NOORDER.
4. Recreate index or drop index if not need.
5. Sequence SCALE EXTEND 18c new
可参考Oracle 12.2.0.2/Oracle 18.1 新特性: Sequence SCALE EXTEND ?
下面是一个案例, 使用了Poder的sql tools
先看TOP EVENT
SQL> @snapper ash 5 1 all Sampling SID all with interval 5 seconds, taking 1 snapshots... -- Session Snapper v4.14 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :) ---------------------------------------------------------------------------------------------------- Active% | INST | SQL_ID | SQL_CHILD | EVENT | WAIT_CLASS ---------------------------------------------------------------------------------------------------- 5091% | 1 | gakpd2xn7k0zr | 2 | enq: TX - index contention | Concurrency 5030% | 1 | dr78bvcrncm6j | 2 | enq: TX - index contention | Concurrency 3613% | 1 | 1rrm8am4g11yy | 0 | enq: TX - index contention | Concurrency 2543% | 1 | dr78bvcrncm6j | 2 | buffer busy waits | Concurrency 2270% | 1 | gakpd2xn7k0zr | 2 | buffer busy waits | Concurrency 2191% | 1 | 3048yhcsmycu9 | 2 | enq: TX - index contention | Concurrency 1978% | 1 | dr78bvcrncm6j | 2 | latch: ges resource hash list | Other 1904% | 1 | gakpd2xn7k0zr | 2 | latch: ges resource hash list | Other 1865% | 1 | 1rrm8am4g11yy | 0 | buffer busy waits | Concurrency 1230% | 1 | 1rrm8am4g11yy | 0 | latch: ges resource hash list | Other SQL> @sw "select sid from v$session where event like 'enq: TX%'" SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------- ------------------- ------------------- ------------------------------------------ 3036 WAITING enq: TX - allocate ITL entry 17573 0 name|mode= usn<<16 | slot= sequence= 33872 0x54580004: TX mode 4 0x0000000054580004 27590686 28 WAITING enq: TX - index contention 4079 0 name|mode= usn<<16 | slot= sequence= 33872 0x54580004: TX mode 4 0x0000000054580004 27590686 526 WAITING enq: TX - index contention 40938 0 name|mode= usn<<16 | slot= sequence= 33872 0x54580004: TX mode 4 0x0000000054580004 27590686 534 WAITING enq: TX - index contention 21024 0 name|mode= usn<<16 | slot= sequence= 33872 0x54580004: TX mode 4 0x0000000054580004 27590686 572 WAITING enq: TX - index contention 12879 0 name|mode= usn<<16 | slot= sequence= 33872 0x54580004: TX mode 4 0x0000000054580004 27590686 575 WAITING enq: TX - index contention 8875 0 name|mode= usn<<16 | slot= sequence= 33872 0x54580004: TX mode 4 ... SQL> @ashtop event,CURRENT_OBJ#,p1 1=1 sysdate-10/24/60 sysdate Total Seconds AAS %This EVENT CURRENT_OBJ# P1 FIRST_SEEN LAST_SEEN COUNT(DISTINCTSQL_EXEC_START||':'||SQL_EXEC_ID) --------- ------- ------- ---------------------------------------- ------------ ---------- ------------------- ------------------- ----------------------------------------------- 79655 132.8 46% | enq: TX - index contention 664395 1415053316 2020-07-01 09:10:52 2020-07-01 09:20:51 79484 15130 25.2 9% | buffer busy waits 664395 365 2020-07-01 09:10:52 2020-07-01 09:20:43 15128 9536 15.9 5% | buffer busy waits 664395 351 2020-07-01 09:13:35 2020-07-01 09:20:42 9536 4917 8.2 3% | buffer busy waits 664395 161 2020-07-01 09:10:52 2020-07-01 09:13:51 4915 3451 5.8 2% | buffer busy waits 664395 367 2020-07-01 09:10:59 2020-07-01 09:15:49 3449 3121 5.2 2% | buffer busy waits 664395 164 2020-07-01 09:17:43 2020-07-01 09:20:49 3121 2330 3.9 1% | buffer busy waits 664395 372 2020-07-01 09:10:55 2020-07-01 09:13:56 2330 2084 3.5 1% | 664395 1413697536 2020-07-01 09:10:52 2020-07-01 09:20:51 1434 1977 3.3 1% | buffer busy waits 664395 345 2020-07-01 09:12:28 2020-07-01 09:19:31 1977 1703 2.8 1% | buffer busy waits 664395 371 2020-07-01 09:10:53 2020-07-01 09:14:47 1703 906 1.5 1% | enq: TX - index contention 0 1415053316 2020-07-01 09:10:52 2020-07-01 09:20:51 906 631 1.1 0% | buffer busy waits 664395 373 2020-07-01 09:10:59 2020-07-01 09:13:43 631 553 .9 0% | buffer busy waits 664395 163 2020-07-01 09:19:42 2020-07-01 09:20:51 553 393 .7 0% | latch: ges resource hash list 664395 3.1417E+11 2020-07-01 09:11:20 2020-07-01 09:11:20 393 SQL> @sed "enq: TX" Show wait event descriptions matching %enq: TX%.. EVENT# EVENT_NAME WAIT_CLASS PARAMETER1 PARAMETER2 PARAMETER3 ENQUEUE_NAME REQ_REASON REQ_DESCRIPTION ------ ------------------------------------------------------- -------------------- ------------------------- ------------------------- ------------------------- ------------------------------ -------------------------------- ---------------------------------------------------------------------------------------------------- 241 enq: TX - row lock contention Application name|mode usn<<16 | slot sequence Transaction row lock contention Lock held on a particular row by a transaction to prevent other transactions from modifying it SQL> SQL> select ROW_WAIT_OBJ#,sql_id,count(*) from v$session where event like 'enq: TX - index conte%' group by ROW_WAIT_OBJ#,sql_id; ROW_WAIT_OBJ# SQL_ID COUNT(*) ------------- --------------- ---------- 0 1rrm8am4g11yy 1 664395 dr78bvcrncm6j 69 664395 gakpd2xn7k0zr 73 664395 3048yhcsmycu9 44 664395 c83c2zc0mr5kk 28 330491 dr78bvcrncm6j 1 664395 1rrm8am4g11yy 75 0 gakpd2xn7k0zr 4 当然这里SQL都是insert ,省略. 查看每分钟ASH 负载趋势及TOP EVENT SQL> select inst_id,etime,events,db_time,avg_waited,pct,aas,lpad('*',sum(aas) over(partition by inst_id,etime)/CPUS,'*') load 2 from ( 3 select inst_id,etime,nvl(event,'on cpu') events, lpad(dbtime,10,' ')||' s' db_time,round(dbtime/24) AAS, avg_waited,(select param.value from gv$parameter param where param.name like '%cpu_count%' and param.inst_id=v.inst_id ) cpus 4 ,lpad(round(100*ratio_to_report(dbtime) OVER (partition by etime ),2)||'%',15,' ') pct,row_number() over(partition by etime order by dbtime desc) rn 5 from ( 6 select inst_id,to_char(SAMPLE_TIME,'yyyymmdd hh24:mi') etime,event, count(*) dbtime, lpad(round(avg(time_waited)/1000,2)||' ms',20,' ') avg_waited 7 from gv$active_session_history gash 8 where SESSION_TYPE='FOREGROUND' AND inst_id=1 9 and sample_time between to_date('2020-7-1 08:30','yyyy-mm-dd hh24:mi') and to_date('2020-7-1 09:20','yyyy-mm-dd hh24:mi') 10 group by inst_id,to_char(SAMPLE_TIME,'yyyymmdd hh24:mi'),event 11 ) v 12 ) where rn<=5 13 order by 1,2; INST_ID ETIME EVENTS DB_TIME AVG_WAITED PCT AAS LOAD ---------- -------------- ----------------------------------- ------------ -------------------- --------------- ------- -------------------------------------------------------------------------------- 1 20200701 09:09 enq: TX - index contention 3018 s 80.34 ms 52.04% 126.0 ********** latch: ges resource hash list 1694 s 27.6 ms 29.21% 71.0 ********** buffer busy waits 820 s 33.1 ms 14.14% 34.0 ********** on cpu 123 s 0 ms 2.12% 5.0 ********** db file sequential read 102 s 2.2 ms 1.76% 4.0 ********** 20200701 09:10 enq: TX - index contention 15363 s 80.64 ms 56.1% 640.0 *********************************************** buffer busy waits 6216 s 43.13 ms 22.7% 259.0 *********************************************** latch: ges resource hash list 4444 s 27.41 ms 16.23% 185.0 *********************************************** db file sequential read 637 s 22.81 ms 2.33% 27.0 *********************************************** on cpu 511 s 0 ms 1.87% 21.0 *********************************************** 20200701 09:11 enq: TX - index contention 13142 s 88.18 ms 52.61% 548.0 ****************************************** buffer busy waits 5457 s 36.4 ms 21.85% 227.0 ****************************************** latch: ges resource hash list 4725 s 23.83 ms 18.92% 197.0 ****************************************** db file sequential read 577 s 9.64 ms 2.31% 24.0 ****************************************** on cpu 575 s 0 ms 2.3% 24.0 ****************************************** 20200701 09:12 enq: TX - index contention 13089 s 85.93 ms 52.71% 545.0 ****************************************** buffer busy waits 5600 s 34.17 ms 22.55% 233.0 ****************************************** latch: ges resource hash list 4877 s 20.19 ms 19.64% 203.0 ****************************************** on cpu 535 s 0 ms 2.15% 22.0 ****************************************** db file sequential read 507 s 10.79 ms 2.04% 21.0 ******************************************
查找对象
SQL> @oid 664395 owner object_name object_type SUBOBJECT_NAME CREATED LAST_DDL_TIME status DATA_OBJECT_ID ------------------------- ------------------------------ ------------------ ------------------------------ ----------------- ----------------- --------- -------------- SYS IND_LOC_CH_ATTR_OID INDEX 20191028 09:48:16 20200401 08:06:19 VALID 827128 SQL> @ind IND_LOC_CH_ATTR_OID Display indexes where table or index name matches %IND_LOC_CH_ATTR_OID%... TABLE_OWNER TABLE_NAME INDEX_NAME POS# COLUMN_NAME DSC -------------------- ------------------------------ ------------------------------ ---- ------------------------------ ---- ANBOB_COM TEST_TABLE_XXXXXXXXXXXXXXX IND_LOC_CH_ATTR_OID 1 OID SQL> @ind ANBOB_COM.TEST_TABLE_XXXXXXXXXXXXXXX Display indexes where table or index name matches %ANBOB_COM.TEST_TABLE_XXXXXXXXXXXXXXX%... TABLE_OWNER TABLE_NAME INDEX_NAME POS# COLUMN_NAME DSC -------------------- ------------------------------ ------------------------------ ---- ------------------------------ ---- ANBOB_COM TEST_TABLE_XXXXXXXXXXXXXXX IDX_DEFER_ATTR 1 OID 2 REGION 3 MONTHCYCLE IND_LOC_CH_ATTR_OID 1 OID INDEX_OWNER TABLE_NAME INDEX_NAME IDXTYPE UNIQ STATUS PART TEMP H LFBLKS NDK NUM_ROWS CLUF LAST_ANALYZED DEGREE VISIBILIT -------------------- ------------------------------ ------------------------------ ---------- ---- -------- ---- ---- -- ---------- ------------- ---------- ---------- ----------------- ------ --------- ANBOB_COM TEST_TABLE_XXXXXXXXXXXXXXX IDX_DEFER_ATTR NORMAL NO N/A YES N 4 3056508 274999605 274999605 99148522 20191106 17:05:56 1 VISIBLE SYS TEST_TABLE_XXXXXXXXXXXXXXX IND_LOC_CH_ATTR_OID NORMAL NO VALID NO N 4 1235701 144476500 369544890 147821098 20200401 08:06:19 1 INVISIBLE
SYS用户创建的索引,貌似多余。
SQL> select ROW_WAIT_OBJ#,ROW_WAIT_FILE#,ROW_WAIT_BLOCK#
from v$session where event like 'enq: TX - index conten%' ;
ROW_WAIT_OBJ# ROW_WAIT_FILE# ROW_WAIT_BLOCK#
------------- -------------- ---------------
664395 161 805296
664395 161 805296
664395 365 2622470
664395 161 805296
664395 161 805296
664395 161 805296
664395 161 805296
664395 161 805296
664395 161 805296
...
dump index block
alter system dump datafile 161 block 805296;
show trace
Block header dump: 0x284c49b0 Object id on Block? Y seg/obj: 0xc9ef8 csc: 0xf3b.17ee833f itc: 168 flg: E typ: 2 - INDEX brn: 0 bdba: 0x284c2d87 ver: 0x01 opc: 0 inc: 0 exflg: 0 Itl Xid Uba Flag Lck Scn/Fsc 0x01 0x0143.014.0000903c 0x540444a2.172f.06 --U- 1 fsc 0x0000.17ee8340 0x02 0x0ccf.004.0000257b 0x5400e74c.0606.0c C-U- 0 scn 0x0f3b.160605c3 0x03 0x075b.000.00008393 0x544051ca.19d3.12 C-U- 0 scn 0x0f3b.160605c3 0x04 0x0864.00e.00001b02 0x5400a089.03ac.16 C--- 0 scn 0x0f3b.1606bf97 0x05 0x03ae.011.0000188c 0x54006346.0806.54 C-U- 0 scn 0x0f3b.160605c3 0x06 0x03a4.00b.00001179 0x540018ea.0608.14 C-U- 0 scn 0x0f3b.160605c3 0x07 0x0d0d.012.0000247f 0x54004cc2.0615.54 C-U- 0 scn 0x0f3b.160605c3 0x08 0x0790.006.00003ddf 0x53c010b2.0e22.1c C-U- 0 scn 0x0f3b.160605c3 0x09 0x07a8.01f.0000b9ed 0x54405bb3.169d.43 C-U- 0 scn 0x0f3b.160605c3 0x0a 0x0d18.014.00002404 0x53c03cfa.07c9.51 C-U- 0 scn 0x0f3b.160605c3 0x0b 0x0792.00f.0000a34a 0x01006300.1602.3e C-U- 0 scn 0x0f3b.160605c3 0x0c 0x03c9.01c.00001277 0x53c07097.06da.44 C-U- 0 scn 0x0f3b.160605c3 0x0d 0x077a.020.000259bf 0x01003845.38b6.43 C-U- 0 scn 0x0f3b.160605c3 0x0e 0x0697.001.0004fad6 0x01146d1b.59e8.02 C-U- 0 scn 0x0f3b.160605c3 0x0f 0x070b.006.0001a796 0x5400ec8a.1e18.09 C-U- 0 scn 0x0f3b.160605c3 0x10 0x0692.007.00041fa3 0x54004d13.5363.4c C-U- 0 scn 0x0f3b.160605c3 0x11 0x06d4.003.0005548d 0x54405aa6.670d.11 C-U- 0 scn 0x0f3b.160605c3 0x1d 0x03c7.017.00004a81 0x542716a2.1327.05 C--- 0 scn 0x0f3b.16071c78 0x1e 0x0826.015.00000f4e 0x010fdf6b.0238.06 C--- 0 scn 0x0f3b.16071f24 0x1f 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x20 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x21 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x22 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 0x23 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000 Leaf block dump =============== header address 140294501911028=0x7f98dbf349f4 kdxcolev 0 KDXCOLEV Flags = - - - kdxcolok 1 kdxcoopc 0x89: opcode=9: iot flags=--- is converted=Y kdxconco 2 kdxcosdc 3 kdxconro 0 kdxcofbo 36=0x24 kdxcofeo 4048=0xfd0 kdxcoavs 4012 <<<<<<<<<<<<<<<<<<<<< kdxlespl 0 kdxlende 0 kdxlenxt 676083785=0x284c3849 kdxleprv 676085935=0x284c40af kdxledsz 0 kdxlebksz 4048 <<<<<<<<<<<<<<<<<<<<< *** dummy key *** row#0[2245] flag: ---D--, lock: 0, len=22 col 0; len 8; (8): c7 59 24 26 39 33 29 21 col 1; len 10; (10): 00 05 0b 04 65 03 a0 e0 00 5e
TIP:
从物理块dump出来索引块的信息显示,kdxcolev =0表示这是一个leaf block, itc: 168 表示这一个block中分配了168个itl事务槽. itc事务槽系统默认叶块为2个,根支块为1个, 也可以在创建时指定initial参数预分配事务槽个数,因为事务槽的扩展也要消耗一定的性能代价,所以事务槽在初始化后,如果后期高并发把事务槽增长到了某个值, oracle认为这个索引上就有这么高的并发请求,后期分裂出来的新block也会继承原block的itc事务槽数。Itl 在当前版本没有办法手动指定上限值,实际block上的ITL最大上限为: min(255,half the block), 也就是说取255和一半block空间除以每个ITL大小后的个数对比后较小的数值, 这就依赖于block大小,对于我们8K的block, 最大也就是169.
根据index segment head block#, 加1就是index block root block, 也可以dump index root block , root block split 耗时比leaf更长。没有太好的方法统计什么时间root split或 itl 增长上去的,从ASH 可以找当时的block dump 尝试。
每个ITL通常占用24 bytes, 可以看到上面的BLOCK一半的空间都被itl所占用, 这也是上面我们看到的索引段空间浪费的原因。 kdxlebksz 4048 表示该block的可用空间4048 bytes, 但是我们的块大小是8192 bytes, 因为ITL占用了一半空间,后期分裂出来的块可用数据空间更少也,加速了index block split的发生, kdxcoavs 1920当前block 还有4012的空闲空间,是一个新split出来的block 只有一个key。
这个案例简单,因为是ID 列,且前个前置复合索引列,这个索引可以删除,问题消失。如果不可以删对于ID 范围查询较少,可以改为hash。 当然要重建index,回收ITL浪费 的空间。 itl不会自动回收。
对不起,这篇文章暂时关闭评论。