Troubleshooting “latch: row cache objects” case and Event 10089 to do.
“row cache objects”是序列化latch,用于保护对SGA中dictionary cache的访问。 只要是参考dictionary cache中的元数据对象,就会获取此latch。 Row Cache Objects Latch 是 shared pool latch 相关,目的是为访问Oracle中缓存的 data dictionary area的用户进程提供数据保护。
这里记录一个11.2.0.3 RAC环境中的案例, 几年前写过一篇分析脚本《Tuning ‘latch: row cache objects’ Event》
SQL> @sw "select sid from v$session where event='latch: row cache objects'" SID STATE EVENT SEQ# SEC_IN_WAIT P1 P2 P3 P1TRANSL ------- ------- ---------------------------------------- ---------- ----------- ------------------- ------------------- ------------------- ------------------------------------------ 557 WAITING latch: row cache objects 47722 0 address= number= 280 tries= 0 0xC0000015B3ABB538: row cache objects[c8] 0xC0000015B3ABB538 16325 WAITING latch: row cache objects 3555 0 address= number= 280 tries= 0 0xC0000015B3ABB538: row cache objects[c8] 0xC0000015B3ABB538 948 WAITING latch: row cache objects 5190 0 address= number= 280 tries= 0 0xC0000015B3ABB538: row cache objects[c8] 0xC0000015B3ABB538 1363 WAITING latch: row cache objects 12934 0 address= number= 280 tries= 0 0xC0000015B3ABB538: row cache objects[c8] 0xC0000015B3ABB538 1622 WAITING latch: row cache objects 27639 0 address= number= 280 tries= 0 0xC0000015B3ABB538: row cache objects[c8] 0xC0000015B3ABB538 717 WAITING latch: row cache objects 64778 0 address= number= 280 tries= 0 0xC0000015B3ABB538: row cache objects[c8] 0xC0000015B3ABB538 SQL> select * from v$latch_children where addr='C0000015B3ABB538'; ADDR LATCH# CHILD# LEVEL# NAME HASH GETS MISSES SLEEPS IMMEDIATE_GETS IMMEDIATE_MISSES WAITERS_WOKEN WAITS_HOLDING_LATCH SPIN_GETS SLEEP1 SLEEP2 SLEEP3 SLEEP4 SLEEP5 SLEEP6 SLEEP7 SLEEP8 SLEEP9 SLEEP10 SLEEP11 WAIT_TIME ---------------- ---------- ---------- ---------- ---------------------------------------------------------------- ---------- ---------- ---------- ---------- -------------- ---------------- ------------- ------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- C0000015B3ABB538 280 8 4 row cache objects 2412510220 2086493595 593790459 155653120 47042 3177 0 0 451293450 0 0 0 0 0 0 0 0 0 0 0 6.1216E+12 SQL> select * from x$kqrst where kqrstcln=8; ADDR INDX INST_ID KQRSTCID KQRSTSNO KQRSTTYP KQRSTTXT KQRSTCSZ KQRSTOSZ KQRSTFCS KQRSTUSG KQRSTGRQ KQRSTGMI KQRSTSRQ KQRSTSMI KQRSTSCO KQRSTMRQ KQRSTMFL Child# KQRSTILR KQRSTIFR KQRSTISR ---------------- ---------- ---------- ---------- ---------- ---------- -------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ------ ---------- ---------- ---------- C0000015B3ABA1C0 7 1 10 0 1 dc_users 101 0 0 101 4153788437 49022 0 0 0 413 413 8 48800 7 47468 C0000015B3ABAD00 52 1 7 0 2 dc_users 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 C0000015B3ABAE40 57 1 7 1 2 dc_users 50 0 0 50 3329238869 14418 0 0 0 0 0 8 0 0 0 C0000015B3ABAE80 58 1 7 2 2 dc_users 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 SQL> oradebug setorapid 1601 Oracle pid: 1601, Unix process pid: 29755, image: oracle@qdyyc1 SQL> oradebug short_stack; SQL> oradebug setorapid 1601 Oracle pid: 1601, Unix process pid: 29755, image: oracle@qdyyc1 SQL> oradebug short_stack; ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<kernel><-_pw_wait()+48<-pw_wait()+112<-sskgpwwait()+432<-skgpwwait()+320<-kslges()+2208<-kslgetl()+784<-kqreqd()+448<-kqrprl()+384<-ktatminextsz()+1024<-qerhjComputeFanoutAndBPS()+480<-kkejnc()+4176<-kkojnp()+19488<-kkocnp()+96<-kkooqb()+3120<-kkoqbc()+4304<-apakkoqb()+384<-apaqbdDescendents()+896<-apaqbdList()+160<-apaqbdDescendents()+1088<-apaqbdList()+160<-apaqbdDescendents()+496<-apaqbdList()+160<-apaqbdDescendents()+496<-apaqbdList()+160<-apaqbdDescendents()+496<-apaqbd()+128<-apadrv()+1664<-opitca()+4832<-kksSetBindType()+13504<-kksfbc()+11872<-opiexe()+5104<-kpoal8()+4624<-opiodr()+2416<-ttcpip()+2416<-opitsk()+2960<-opiino()+1696<-opiodr()+2416<-opidrv()+1616<-sou2o()+256<-opimai_real()+656<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80 SQL> oradebug short_stack; ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<kernel><-kcbgtcr()+4145<-ktrget3()+1632<-ktrget2()+160<-kdst_fetch()+1344<-kdstf11001010000km()+1600<-kdsttgr()+3344<-qertbFetch()+1520<-rwsfcd()+320<-qerhjFetch()+1040<-rwsfcd()+320<-qerhjFetch()+1616<-rwsfcd()+320<-qeruaFetch()+848<-qervwFetch()+352<-qersoProcessULS()+496<-qersoFetch()+896<-qervwFetch()+352<-qergsFetch()+4112<-opifch2()+9600<-kpoal8()+7360<-opiodr()+2416<-ttcpip()+2416<-opitsk()+2960<-opiino()+1696<-opiodr()+2416<-opidrv()+1616<-sou2o()+256<-opimai_real()+656<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80 SQL> oradebug short_stack; ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<kernel><-_read_sys()+48<-_read()+224<-$cold_skgfqio()+880<-ksfd_skgfqio()+208<-ksfd_io()+1168<-ksfdread()+304<-kcfrbd1()+1248<-kcbzib()+2464<-kcbgtcr()+14336<-ktrget3()+1632<-ktrget2()+160<-kdsgrp()+336<-qetlbr()+432<-qertbFetchByRowID()+1728<-qerjotFetch()+1504<-qerjotFetch()+1504<-qervwFetch()+352<-rwsfcd()+320<-qerhjFetch()+2080<-rwsfcd()+320<-qeruaFetch()+848<-$cold_qervwFetch()+2384<-qersoProcessULS()+496<-qersoFetch()+896<-qervwFetch()+352<-qergsFetch()+4112<-opifch2()+9600<-kpoal8()+7360<-opiodr()+2416<-ttcpip()+2416<-opitsk()+2960<-opiino()+1696<-opiodr()+2416<-opidrv()+1616<-sou2o()+256<-opimai_real()+656<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80 SQL> oradebug short_stack; ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<kernel><-kkoiop()+96<-kkoiop()+304<-kkobrfio()+368<-kkoij2()+1360<-kkoixj()+704<-kkotap()+11120<-kkoiqb()+16704<-kkooqb()+1104<-kkoqbc()+4304<-apakkoqb()+384<-apaqbdDescendents()+896<-apaqbd()+128<-kkqctCostTransfQB()+608<-kkqctdrvJPPD()+2288<-kkqjpdctr()+2736<-kkqctInterleaveJPPD()+368<-kkqctdrvSU()+3920<-kkqutruns()+736<-kkqutruns()+208<-kkqutruns()+208<-kkqutruns()+208<-kkqutruns()+432<-kkqutruns()+240<-kkqutruns()+240<-kkqudrv()+784<-kkqctdrvTD()+2240<-kkqdrv()+5360<-kkqctdrvIT()+1424<-apadrv()+3376<-opitca()+4832<-kksFullTypeCheck()+144<-rpiswu2()+1120<-kksSetBindType()+4640<-kksfbc()+11872<-opiexe()+5104<-kpoal8()+4624<-opiodr()+2416<-ttcpip()+2416<-opitsk()+2960<-opiino()+1696<-opiodr()+2416<-opidrv()+1616<-sou2o()+256<-opimai_real()+656<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80 SQL> oradebug short_stack; ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<kernel><-_write_sys()+48<-write()+176<-$cold_skgfqio()+784<-ksfd_skgfqio()+208<-ksfd_io()+1168<-ksfdread()+304<-kcfrbd1()+1248<-kcbzib()+2464<-kcbgtcr()+14336<-ktrget3()+1632<-ktrget2()+160<-kdsgrp()+336<-qetlbr()+432<-qertbFetchByRowID()+1728<-qerjotFetch()+3664<-qerjotFetch()+3664<-qervwFetch()+352<-rwsfcd()+320<-qerhjFetch()+2080<-rwsfcd()+320<-qeruaFetch()+848<-qervwFetch()+352<-qersoProcessULS()+496<-qersoFetch()+896<-qervwFetch()+352<-qergsFetch()+4112<-opifch2()+9600<-kpoal8()+7360<-opiodr()+2416<-ttcpip()+2416<-opitsk()+2960<-opiino()+1696<-opiodr()+2416<-opidrv()+1616<-sou2o()+256<-opimai_real()+656<-ssthrdmain()+576<-main()+336<-main_opd_entry()+80 SQL> col cache# head "Cache|no" form 999 SQL> col parameter head "Parameter" form a25 SQL> col type head "Type" form a12 SQL> col subordinate# head "Sub|ordi|nate" form 9999 SQL> col rcgets head "Cache|Gets" form 999999999999 SQL> col rcmisses head "Cache|Misses" form 999999999999 SQL> col rcmodifications head "Cache|Modifica|tions" form 999999999999 SQL> col rcflushes head "Cache|Flushes" form 999999999999 SQL> col kqrstcln head "Child#" form 999 SQL> col lagets head "Latch|Gets" form 999999999999 SQL> col lamisses head "Latch|Misses" form 999999999999 SQL> col laimge head "Latch|Immediate|gets" form 999999999999 SQL> SQL> select dc.kqrstcid CACHE#, dc.kqrsttxt PARAMETER, decode(dc.kqrsttyp, 1,'PARENT','SUBORDINATE') type, 2 decode(dc.kqrsttyp, 2, kqrstsno, null) subordinate#, 3 dc.kqrstgrq rcgets, dc.kqrstgmi rcmisses, dc.kqrstmrq rcmodifications, dc.kqrstmfl rcflushes, dc.kqrstcln, 4 la.gets lagets, la.misses lamisses, la.immediate_gets laimge 5 from x$kqrst dc, 6 v$latch_children la 7 where dc.inst_id = userenv('instance') 8 and la.child# = dc.kqrstcln 9 and la.name = 'row cache objects' 10 order by rcgets desc; Sub Cache Latch Cache ordi Cache Cache Modifica Cache Latch Latch Immediate no Parameter Type nate Gets Misses tions Flushes Child# Gets Misses gets ----- ------------------------- ------------ ----- ------------- ------------- ------------- ------------- ------ ------------- ------------- ------------- 10 dc_users PARENT 3734912072 48997 413 413 8 819920633 250124586 47007 8 dc_objects PARENT 3620833881 130864169 2454725 2441251 9 2558015670 2412030590 3164403 7 dc_users SUBORDINATE 1 3314778444 14412 0 0 8 819920633 250124586 47007 2 dc_segments PARENT 2419992098 208986999 5565215 5565144 4 409661609 261272169 14687358 16 dc_histogram_defs PARENT 1339720402 51147414 1146382 1123333 13 956389160 3386779019 8520219 0 dc_tablespaces PARENT 1152376656 14361 0 0 5 3457243305 3385062240 16133 3 dc_rollback_segments PARENT 566714092 2910 1509 1508 1 1492039470 1286974917 85 16 dc_histogram_data SUBORDINATE 0 503513763 946170 3141 2354 13 956389160 3386779019 8520219 14 dc_profiles PARENT 358817542 795 1 1 18 1076461125 4088737 1358 16 dc_histogram_data SUBORDINATE 1 289546279 296203 3031 2044 13 956389160 3386779019 8520219 6 dc_files PARENT 214370418 148014 15 15 7 644323538 320328 12272 SQL> select * from v$latch_misses where parent_name='row cache objects' and WTR_SLP_COUNT>1 order by 4; PARENT_NAME WHERE NWFAIL_COUNT SLEEP_COUNT WTR_SLP_COUNT LONGHOLD_COUNT LOCATION -------------------- -------------------------------------------------------------------------------- ------------ ----------- ------------- -------------- -------------------------------------------------------------------------------- .... row cache objects kqrigt 0 33041 13374 0 kqrigt row cache objects kqrbip 0 60479 26413 0 kqrbip row cache objects kqrso 0 1143654 1491449 0 kqrso row cache objects kqreqd 0 35450260 71020564 0 kqreqd row cache objects kqrpre: find obj 0 76579666 129201124 0 kqrpre: find obj row cache objects kqreqd: reget 0 120729099 32281153 0 kqreqd: reget
# AWR
Load Profile
Per Second | Per Transaction | Per Exec | Per Call | |
---|---|---|---|---|
DB Time(s): | 177.0 | 0.3 | 0.01 | 0.00 |
DB CPU(s): | 59.9 | 0.1 | 0.00 | 0.00 |
Redo size: | 2,367,829.5 | 3,705.1 | ||
Logical reads: | 2,942,702.4 | 4,604.6 | ||
Block changes: | 13,108.5 | 20.5 | ||
Physical reads: | 34,802.0 | 54.5 | ||
Physical writes: | 709.6 | 1.1 | ||
User calls: | 54,072.2 | 84.6 | ||
Parses: | 1,338.6 | 2.1 | ||
Hard parses: | 217.0 | 0.3 | ||
W/A MB processed: | 135.0 | 0.2 | ||
Logons: | 9.4 | 0.0 | ||
Executes: | 26,684.2 | 41.8 | ||
Rollbacks: | 260.7 | 0.4 | ||
Transactions: | 639.1 |
Top 5 Timed Foreground Events
Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
---|---|---|---|---|---|
latch: row cache objects | 892,736 | 112,098 | 126 | 34.94 | Concurrency |
DB CPU | 108,645 | 33.86 | |||
db file sequential read | 24,567,158 | 57,141 | 2 | 17.81 | User I/O |
db file scattered read | 1,797,235 | 8,206 | 5 | 2.56 | User I/O |
gc cr grant 2-way | 11,853,598 | 6,796 | 1 | 2.12 | Cluster |
Dictionary Cache Stats
- “Pct Misses” should be very low (< 2% in most cases)
- “Final Usage” is the number of cache entries being used
Cache | Get Requests | Pct Miss | Scan Reqs | Pct Miss | Mod Reqs | Final Usage |
---|---|---|---|---|---|---|
dc_awr_control | 38 | 0.00 | 0 | 2 | 1 | |
dc_files | 27,488 | 0.00 | 0 | 0 | 859 | |
dc_global_oids | 8,523 | 0.18 | 0 | 0 | 30 | |
dc_histogram_data | 67,542 | 0.98 | 0 | 0 | 820 | |
dc_histogram_defs | 45,560,366 | 0.05 | 0 | 0 | 13,260 | |
dc_object_grants | 1,628 | 3.44 | 0 | 0 | 313 | |
dc_objects | 90,488,082 | 0.02 | 0 | 3 | 5,801 | |
dc_profiles | 31,701 | 0.00 | 0 | 0 | 4 | |
dc_rollback_segments | 591,873 | 0.00 | 0 | 0 | 2,556 | |
dc_segments | 5,818,659 | 1.39 | 0 | 128 | 82,721 | |
dc_sequences | 537 | 56.42 | 0 | 537 | 5 | |
dc_table_scns | 25 | 36.00 | 0 | 5 | 5 | |
dc_tablespaces | 60,356,706 | 0.00 | 0 | 0 | 36 | |
global database name | 17,606 | 0.01 | 0 | 0 | 1 | |
outstanding_alerts | 99 | 100.00 | 0 | 0 | 0 | |
sch_lj_oids | 6 | 0.00 | 0 | 0 | 1 |
Wait Event Histogram
% of Waits | |||||||||
Event | Total Waits | <1ms | <2ms | <4ms | <8ms | <16ms | <32ms | <=1s | >1s |
---|---|---|---|---|---|---|---|---|---|
latch: row cache objects | 894.8K | 23.6 | 2.6 | 1.8 | 1.5 | 1.6 | 2.5 | 66.3 | .0 |
Latch Miss Sources
Latch Name | Where | NoWait Misses | Sleeps | Waiter Sleeps |
---|---|---|---|---|
row cache objects | kqreqd: reget | 0 | 563,517 | 128,828 |
row cache objects | kqrpre: find obj | 0 | 255,237 | 554,836 |
row cache objects | kqreqd | 0 | 102,913 | 236,755 |
row cache objects | kqrso | 0 | 1,092 | 2,518 |
row cache objects | kqrbip | 0 | 88 | 9 |
row cache objects | kqrbgl | 0 | 66 | 46 |
## hanganalyze trace show blocker
and is blocked by
=> Oracle session identified by:
{
instance: 1 (weejar.weejar1)
os id: 25694
process id: 8158, oracle@qdyyc1
session id: 17647
session serial #: 51605
}
which is not in a wait:
{
last wait: 0.000000 sec ago
blocking: 5 sessions
current sql: select * from (select row_.*, rownum rownum_ from ( select distinct m.prodname, m.privname, m.privna, decode((select count(1)
from anbob.****vilege t, anbob.***scriber s where s.****umber = :1
short stack: ksedsts()+544<-ksdxfstk()+48<-ksdxcb()+3216<-sspuser()+688<-<kernel><-_pw_wait()+48<-pw_wait()+112<-sskgpwwait()+432<-skgpwwait()+320<-kslges()+2208<-kslgetl()+784<-kqreqd()+
448<-kqrprl()+384<-ktatminextsz()+1024<-qerhjComputeFanoutAndBPS()+480<-kkejnc()+4176<-kkojnp()+19488<-kkocnp()+96<-kkooqb()+3120<-kkoqbc()+4304<-apakkoqb()+384<-apaqbdDescendents()+896<-apaqbdList()+
160<-apaqbdDescendents()+496<-apaqbd()+128<-kkqctCostTransfQB()+608<-kkqctdrvJPPD()+2288<-kkqjpdctr()+2736<-qksqbApplyToQbcLoc()+864<-qksqbApplyT
wait history:
1. event: ‘latch: row cache objects’
time waited: 0.360363 sec
wait id: 34094 p1: ‘address’=0xc0000015b3abb538
p2: ‘number’=0x118
p3: ‘tries’=0x0
* time between wait #1 and #2: 0.034827 sec
这个案例主要的问题
1. hard parse & soft parse 太高
2. 相关的BUG 有
Bug 13902396 Hash joins cause “row cache objects” latch gets and “shared pool” latch gets
Bug 17799716 High row cache requests for a query with many indexes
不过从上面看当时一个bloker会话正在执行的SQL表上索引是不少(10+), 而且与bug 17799716 call stack有些相似。是在当前版本中当执行计划用不同索引cost相同时,排序索引名称取第一个, 在11.2.0.3[|4]中CBO引入的更多的查询转换所以会持有更长时间的row cache latch比10g时。 解决方法如改SQL降硬解析使用绑定变量;修改优化器为10.2.0.4;使用event 10089 level 1;
What ‘s Event 10089 level 1
Event 10089 level 1: CBO disable index sorting.
默认情况 下CBO 使用 排除决策的方法,当两个执行计划中index 的cost相同时,使用索引名称字母排序选择第一个索引. 使用10053 event 可以看到是当两个执行计划中两个索引COST相同时如何选择索引。默认10089 event没有配置,而是使用索引名称排除的, 但是当启用了event 10089 level 1, CBO就不会在当cost相同时使用index name排序,而是改为按index created 创建时间排序。 这个event 有助于降低”get” row cache dc_objects的竞争。
对不起,这篇文章暂时关闭评论。