一日数据库多个会话出现较长时间的等待, 等待event是”library cache: mutex X”, 关于这个等待事件以前在post记录过,发生在SQL的解析阶段,在申请libary cache的锁时产生的等待,当时是部分应用session 被一些监控会话堵塞,这些会话正在执行v$sql ,v$sqlarea的收集统计,同时又被一个应用UPDATE会话堵塞,该会话当前是ON CPU,且已经执行了有近2万秒,正在执行动态采样的SQL。这里简单的记录一下该问题。
USERNAME SID EVENT MACHINE MODULE STATUS LAST_CALL_ET SQL_ID WAI_SECINW ROW_WAIT_OBJ# SQLTEXT BS
------ ----- -------------------- ---------- -------------------- -------- ------------ --------------- ---------- ------------- ------------------------------ ----------
ANBOB 7940 library cache: mutex kdhdw** JDBC Thin Client ACTIVE 2 7hjb2wh3nprf7 0:0 -1 SELECT sql_id, p.id, p.p 1:8558
ANBOB 8802 library cache: mutex kdhdw** JDBC Thin Client ACTIVE 3 7hjb2wh3nprf7 0:1 -1 SELECT sql_id, p.id, p.p 1:8558
ANBOB 13958 library cache: mutex kdhdw** JDBC Thin Client ACTIVE 3 7hjb2wh3nprf7 0:1 -1 SELECT sql_id, p.id, p.p 1:8558
ANBOB 9059 library cache: mutex kdhdw** JDBC Thin Client ACTIVE 3 7hjb2wh3nprf7 0:1 0 SELECT sql_id, p.id, p.p 1:8558
ANBOB 15308 library cache: mutex kdhdw** JDBC Thin Client ACTIVE 3 7hjb2wh3nprf7 0:2 -1 SELECT sql_id, p.id, p.p 1:8558
ANBOB 18159 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 117 0:117 2962699 1:8558
ANBOB 19816 library cache: mutex kmibo** IntSrvA1 ACTIVE 150 0:151 2713842 1:8558
ANBOB 15579 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 151 0:151 2962699 1:8558
ANBOB 14781 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 160 0:161 2962699 1:8558
ANBOB 22403 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 172 0:173 2540570 1:8558
ANBOB 22037 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 195 0:196 2540570 1:8558
ANBOB 3995 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 214 0:214 2500069 1:8558
ANBOB 1144 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 223 0:223 2534163 1:8558
ANBOB 17837 library cache: mutex kmibo** IntSrvA1 ACTIVE 267 0:267 3615443 1:8558
ANBOB 17502 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 272 0:272 2962699 1:8558
ANBOB 4069 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 276 0:276 2962725 1:8558
ANBOB 19334 library cache: mutex kmibo** IntSrvA1 ACTIVE 279 0:280 7381736 1:8558
ANBOB 686 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 280 0:280 2962699 1:8558
ANBOB 17847 library cache: mutex kmibo** IntSrvA1 ACTIVE 290 0:291 2714179 1:8558
ANBOB 950 library cache: mutex qmyy3** DebtSrvA2 ACTIVE 293 0:293 2997885 1:8558
ANBOB 6348 library cache: mutex kmibo** IntSrvA1 ACTIVE 326 0:326 7340283 1:8558
ANBOB 17605 library cache: mutex kmibo** IntSrvA1 ACTIVE 332 0:333 7340283 1:8558
ANBOB 12843 library cache: mutex kmibo** IntSrvA1 ACTIVE 361 0:57 2714179 1:8558
ANBOB 16463 library cache: mutex kmibo** IntSrvA1 ACTIVE 387 0:387 3070296 1:8558
ANBOB 9189 SQL*Net message from kinjk** SQL*Plus ACTIVE 395 gspzwbd8f3w4n 0:0 -1 INSERT INTO TMP_DAILY_MONITOR_ :
ANBOB 8299 library cache: mutex kmibo** IntSrvA1 ACTIVE 420 0:420 7340283 1:8558
ANBOB 5995 library cache: mutex kmibo** IntSrvA1 ACTIVE 430 0:430 2714179 1:8558
ANBOB 15684 library cache: mutex kmibo** IntSrvA1 ACTIVE 460 0:460 7340283 1:8558
ANBOB 3989 db file scattered re kinjk** LocBatMobileCardComb ACTIVE 563 78kf9ka6zuq1k 0:0 2534445 select sptr.subsid subsid,sptr :
ANBOB 12155 SQL*Net message from kinjk** SQL*Plus ACTIVE 585 880wsrnp3rfs9 0:0 -1 INSERT INTO TMP_DAILY_MONITOR :
ANBOB 10747 db file sequential r qdtza** oracle ACTIVE 1563 djyd82ug5fst6 0:0 2718525 SELECT COUNT(*) FROM "ANBOB"."T :
MONN 8558 library cache: mutex qdexa** JDBC Thin Client ACTIVE 9184 d7jswwg3nx45w 0:57 -1 select d.name dbame,dl.db_link 1:2087
APPS 2087 On CPU / runqueue kinjk** bdbudataproc ACTIVE 19900 4ncfrqavn7143 -1:19900 7309607 UPDATE BD_RECV_SPBIZIINFO_MID :
Tips:
当前的最终堵塞会话是sid=2098, 在执行update,无等待, last_call_et已经很长时间。恢复可以先采取KILL该会话。
SQL> @s 11618
SID SQLID_AND_CHILD STATUS STATE EVENT SEQ# SEC_IN_WAIT BLOCKING_SID P1 P2 P3 P1TRANSL
------- -------------------- -------- ------- ---------------------------------------- ---------- ----------- ------------ ------------------ ------------------ ------------------ ------------------------------------------
11618 6b5yfy1g9k65u 0 ACTIVE WAITING library cache: mutex X 501 407 2087 idn= value= where= 124
0x00000000AE08CC2D 0x0000082700000000
SQL> @hex AE08CC2D
DEC HEX
----------------------------------- --------------------
2919812141.000000 AE08CC2D
SQL> select sql_text from v$sqlarea where hash_value='2919812141';
SQL_TEXT
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),:"SYS_B_0"), NVL(SUM(C2),:"SYS_B_1") FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("BD_RECV_SPBIZIINFO_MID") FULL("BD_RECV_SPBIZIINFO_MID") NO_PARALLEL_INDEX("BD_RECV_SPBIZIINFO_MID") */ :"SYS_B_2" AS C1, CASE
WHEN "BD_RECV_SPBIZIINFO_MID"."OID"=:"SYS_B_3" THEN :"SYS_B_4" ELSE :"SYS_B_5" END AS C2 FROM "ANBOB"."BD_RECV_SPBIZIINFO_MID" SAMPLE BLOCK (:"SYS_B_6" , :"SYS_B_7") SEED (:"SYS_B_8") "BD_RECV_SPBIZIINFO_MID") SAMPLESUB
SQL> @kill sid=2087
COMMANDS_TO_VERIFY_AND_RUN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
alter system kill session '2087,50601' immediate -- ANBOB@kinjk1 (bdbudataproc@kinjk1 (TNS V1-V3));
SQL> alter system kill session '2087,50601' immediate ;
alter system kill session '2087,50601' immediate
*
ERROR at line 1:
ORA-00031: session marked for kill
SQL> @s 2087
SID SQLID_AND_CHILD STATUS STATE EVENT SEQ# SEC_IN_WAIT BLOCKING_SID P1 P2 P3 P1TRANSL
------- -------------------- -------- ------- ---------------------------------------- ---------- ----------- ------------ ------------------ ------------------ ------------------ ------------------------------------------
2087 4ncfrqavn7143 0 KILLED WORKING On CPU / runqueue 36248 20431 NOT IN WAIT
SQL> @sqlid 4ncfrqavn7143 %
Show SQL text, child cursors and execution stats for SQLID 4ncfrqavn7143 child %
HASH_VALUE PLAN_HASH_VALUE CH# SQL_TEXT
---------- --------------- ---- ------------------------------------------------------------------------------------------------------------------------------------------------------
3074655363 1897512577 0 UPDATE BD_RECV_SPBIZIINFO_MID SET STATUS= 'bd_haddeply' WHERE 1 = 1 AND OID = '20180926022453000000003760758109'
SQL> @usid 2087
USERNAME SID AUDSID OSUSER MACHINE PROGRAM SPID OPID CPID SQL_ID HASH_VALUE LASTCALL STATUS SADDR PADDR TADDR LOGON_TIME
----------------------- -------------- ----------- ---------------- ------------------ -------------------- -------------- ------ ------------------------ --------------- ----------- ---------- -------- ---------------- ---------------- ---------------- -----------------
ANBOB '2087,50601' 251550704 taskmon kinjk1 (TNS V1-V3) 29148 9927 10748596 4ncfrqavn7143 3074655363 20489 KILLED C00000163428A960 C000001653BF6028 C000001550CE7100 20180921 08:14:02
SQL> host
oracle@anbob1:/home/oracle> kill -9 29148
可以利用ASH信息分析
SQL> select event, p1, count(1)
from v$active_session_history
where sample_time > (sysdate - 2/24) and event = 'library cache: mutex X' group by event, p1 order by 3;
EVENT P1 COUNT(1)
-------------------- ---------- ----------
...
...
library cache: mutex 3297566623 20
library cache: mutex 3638957046 21
library cache: mutex 2971900843 22
library cache: mutex 383630782 28
library cache: mutex 2399381619 31
library cache: mutex 2947512482 42
library cache: mutex 2919812141 4877
library cache: mutex 52269 240700
Tips:
For library cache: mutex X
1. P1 represents library cache hash bucket number (if idn <= 131072)
2. P1 represents hash value of the library cache object under protection (if idn > 131072)
SQL> select count(*) from x$kglob where KGLNAHSH=2919812141;
COUNT(*)
----------
129761
-- select * from x$kglob where KGLNAHSH=2919812141;
SQL>select owner,name,namespace,status,timestamp,type,loads from V$DB_OBJECT_CACHE where rownum<=10 and HASH_VALUE=2919812141;
OWNER
----------------------------------------------------------------
NAME
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
NAMESPACE STATUS TIMESTAMP TYPE LOADS
---------------------------------------------------------------- ------------------- ------------------- ---------------------------------------------------------------- ----------
SELECT /* OPT_DYN_SAMP */ /*+ ALL_ROWS IGNORE_WHERE_CLAUSE NO_PARALLEL(SAMPLESUB) opt_param('parallel_execution_enabled', 'false') NO_PARALLEL_INDEX(SAMPLESUB) NO_SQL_TUNE */ NVL(SUM(C1),:"SYS_B_0"), NVL(SUM(C2),:"SYS_B_1") FROM (SELECT /*+ IGNORE_WHERE_CLAUSE NO_PARALLEL("BD_RECV_SPBIZIINFO_MID") FULL("BD_RECV_SPBIZIINFO_MID") NO_PARALLEL_INDEX("BD_RECV_SPBIZIINFO_MID") */ :"SYS_B_2" AS C1, CASE
WHEN "BD_RECV_SPBIZIINFO_MID"."OID"=:"SYS_B_3" THEN :"SYS_B_4" ELSE :"SYS_B_5" END AS C2 FROM "ANBOB"."BD_RECV_SPBIZIINFO_MID" SAMPLE BLOCK (:"SYS_B_6" , :"SYS_B_7") SEED (:"SYS_B_8") "BD_RECV_SPBIZIINFO_MID") SAMPLESUB
SQL AREA UNKOWN 2018-09-12/06:50:35 CURSOR
SQL> @tab %.BD_RECV_SPBIZIINFO_MID
Show tables matching condition "%%.BD_RECV_SPBIZIINFO_MID%" (if schema is not specified then current user s tables only are shown)...
OWNER TABLE_NAME TYPE NUM_ROWS BLOCKS EMPTY AVGSPC ROWLEN TAB_LAST_ANALYZED DEGREE COMPRESS
-------------------- ------------------------------ ---- ------------ ------------- --------- ------ ------ ----------------- ---------- --------
ANBOB BD_RECV_SPBIZIINFO_MID PTAB
Tips:
当前library cache中有约13万的obj handle都是关于BD_RECV_SPBIZIINFO_MID表的动态采样,这点有点异常,当前表无统计信息, libarary cache #bucket 52269 产生大量争用,猜测与动态采样有关, 在MOS中没有找到明确BUG。
有几个动态采样相关的BUG
Bug 17279311 - "library cache: mutex X" contention on bucket mutex (文档 ID 17279311.8)
Severe library cache mutex contention may be seen on a library cache bucket
mutex which contains a "hot" handle.
Rediscovery Notes
If you see bucket mutex contention on a bucket which contains a hot handle
when all sessions are trying to get a copy then it could be due to this bug.
The call stack of waiting processes is likely to show both kglGetHot() and
kglGetBucketMutex() functions.
Workaround
None
Bug 19130972 Long running Dynamic Sampling Query / Slow Parsing
A query may take long to parse because the optimizer runs a recursive Dynamic Sampling query that runs for more than 3 seconds.
This Parse doesn't respond to Ctrl C
Rediscovery Information:
The recursive dynamic sampling does not have a SAMPLE keyword.
Workaround
optimizer_dynamic_sampling=0
Bug 10145558 Selects on library cache V$/X$ views cause "library cache: mutex X" waits
"library cache: mutex X" waits may be seen in sessions
when a concurrently executing session is querying V$SQL
/ V$SQLAREA (or various other library cache based views).
This fix changes the behaviour of queries against the library
cache to try to stop them from holding mutexes for long
periods of time as they scan entries in the cache thereby
avoiding waits in concurrent sessions.
Bug 7241819 Slow Parsing / Dynamic Sampling not doing sample
A query could take long to parse because a recursive dynamic sampling is
doing an index full scan on a global index.
The query has the SAMPLE keyword but the execution plan does not show SAMPLE
is being used.
避免方法收集表的统计信息
SQL> exec DBMS_STATS.GATHER_TABLE_STATS (ownname => '&TOWN' , tabname => '&TNAME' , cascade => true, estimate_percent => dbms_stats.auto_sample_size,method_opt=>'FOR TABLE FOR ALL COLUMNS SIZE REPEAT', degree => 8,no_invalidate=>false);
Enter value for town: ANBOB
Enter value for tname: BD_RECV_SPBIZIINFO_MID
PL/SQL procedure successfully completed.
扩展
上次遇到一次sequence调用时遇到library cache: mutex x, 有时是可以使用dbms_shared_pool.markhot缓解这个问题
SQL> Select * from (
Select case when (kglhdadr = kglhdpar) then 'Parent' else 'Child '||kglobt09 end cursor,
kglhdadr ADDRESS,substr(kglnaobj,1,20) name, kglnahsh hash_value,kglobtyd type,kglobt23 LOCKED_TOTAL,kglobt24 PINNED_TOTAL,kglhdexc EXECUTIONS,kglhdnsp NAMESPACE
from x$kglob
order by kglobt24 desc)
where rownum <= 10
/
CURSOR ADDRESS NAME HASH_VALUE TYPE LOCKED_TOTAL PINNED_TOTAL EXECUTIONS NAMESPACE
------------- ---------------- -------------------- ---------- -------------------- ------------ ------------ ---------- ----------
Parent 0000002095792408 SEQ_OID 770228762 SEQUENCE 1823 1801069383 0 1
Parent 000000011F3E9A20 PKG_REPLX 740144107 PACKAGE 540198 1385226874 0 1
Parent 000000205E665608 PKG_REPLX 3844095428 PACKAGE BODY 234253 1384919727 1384919727 2
Child 1 000000203EBD8798 insert into replica 1124387444 CURSOR 378 1363290062 22477809 0
Child 2 00000020B82CBB00 DELETE FROM REPLICAT 3281854519 CURSOR 1813 509619459 22472971 0
Child 0 0000002079731A38 SELECT C.* FROM TBCS 3103436948 CURSOR 90 474128646 440741523 0
Child 0 00000020BEBDC5C8 DELETE FROM REPLICAT 3281854519 CURSOR 2454 449207192 9125910 0
Child 1 00000020BEA33C90 DELETE FROM REPLICAT 3281854519 CURSOR 2289 401501522 23528 0
Child 0 00000020BAF85C60 update LOC_NGSETTLE 31375876 CURSOR 1831 340093349 24385129 0
Child 0 000000011E67A5C0 select A.OID RULEOI 2322592106 CURSOR 1733 314678213 290555709 0
SQL> @o ANBOB.seq_oid
owner object_name object_type status OID D_OID CREATED LAST_DDL_TIME
------------------ ------------------------------ ------------------ --------- ---------- ---------- ----------------- -----------------
ANBOB SEQ_OID SEQUENCE VALID 604160 20180628 11:04:51 20180827 15:47:21
SQL> exec dbms_shared_pool.markhot(schema=>'NGSETTLE',OBJNAME=>'SQL_OID',NAMESPACE=>1);
PL/SQL procedure successfully completed.
打赏
微信扫一扫,打赏作者吧~
对不起,这篇文章暂时关闭评论。