首页 » Cloud, ORACLE 9i-23ai » Troubleshooting select ‘hang’/spin on 12C Active DataGuard

Troubleshooting select ‘hang’/spin on 12C Active DataGuard

今天遇到一套oracle 12c RAC数据库,查询一个分区表的某一个分区,其实就1条记录,段大小也就4M,在primary查询秒出,但是在standby 查询十几分钟不出结果,注意性能问题hang和spin是两个类型,卡住不动是hang,死循环是spin, 通过v$session.seq也可以判断,当然也可以从call stack判断,下面记录一下这个问题。

SQL> select ACCOUNTID FROM "USR1"."TABLE_XX";
-- no feedback

SQL> @ase

USERNAME        SID EVENT                MACHINE    MODULE               STATUS          et SQL_ID          WAI_SECINW ROW_WAIT_OBJ# SQLTEXT                        BS          CH# OSUSER        SEQ# HEX
----------- ------- -------------------- ---------- -------------------- -------- --------- --------------- ---------- ------------- ------------------------------ ---------- ---- ---------- ------- ---------
PUBLIC        10645 Data Guard server op qdyye1     oracle               ACTIVE           0                 0:0                   -1                                :               oracle        3947
PUBLIC        25042 Data Guard server op qdyye2     oracle               ACTIVE           0                 0:0                   -1                                :               oracle       49019
SYS           13777 On CPU / runqueue    anbob1     sqlplus              ACTIVE          63 7jh8bdq9an4fg   -1:63                 -1 select ACCOUNTID FROM "USR1"." :             0 oracle         312   1000001


SQL> @snapper all 10 1 13777
Sampling SID 13777 with interval 10 seconds, taking 1 snapshots...

-- Session Snapper v4.14 BETA - by Tanel Poder ( http://blog.tanelpoder.com ) - Enjoy the Most Advanced Oracle Troubleshooting Script on the Planet! :)

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 SID @INST, USERNAME  , TYPE, STATISTIC                                                 ,         DELTA, HDELTA/SEC,    %TIME, GRAPH       , NUM_WAITS,  WAITS/SEC,   AVERAGES
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 13777  @1, SYS       , STAT, session logical reads                                     ,       4877155,    366.35k,         ,             ,          ,           ,       6.5M total buffer visits
 13777  @1, SYS       , STAT, consistent gets                                           ,       4877128,    366.35k,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, consistent gets from cache                                ,       4877128,    366.35k,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, consistent gets pin                                       ,       3251418,    244.23k,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, consistent gets examination                               ,       1625708,    122.12k,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, logical read bytes from cache                             ,   39953399808,         3G,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, shared hash latch upgrades - no wait                      ,       1625721,    122.12k,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, calls to kcmgcs                                           ,       3251438,    244.24k,         ,             ,          ,           ,          ~ per execution
 13777  @1, SYS       , STAT, buffer is pinned count                                    ,       1624527,    122.03k,         ,             ,          ,           ,      24.99 % buffer gets avoided thanks to buffer pin caching

--  End of Stats snap 1, end=2021-08-10 16:35:32, seconds=13.3


----------------------------------------------------------------------------------------------------
Active% | INST | SQL_ID          | SQL_CHILD | EVENT                               | WAIT_CLASS
----------------------------------------------------------------------------------------------------
   100% |    1 | 7jh8bdq9an4fg   | 0         | ON CPU                              | ON CPU

--  End of ASH snap 1, end=2021-08-10 16:35:32, seconds=10, samples_taken=87

PL/SQL procedure successfully completed.
 
SQL> @seg USR1.TABLE_XX

    SEG_MB OWNER                SEGMENT_NAME                   SEG_PART_NAME                  SEGMENT_TYPE         SEG_TABLESPACE_NAME                BLOCKS     HDRFIL     HDRBLK
---------- -------------------- ------------------------------ ------------------------------ -------------------- ------------------------------ ---------- ---------- ----------
...
         4 USR1                 TABLE_XX                      PART_319                       TABLE PARTITION      DATA319                               512         15     308873
         4 USR1                 TABLE_XX                      PART_335                       TABLE PARTITION      DATA335                               512         16     308873
         4 USR1                 TABLE_XX                      PART_336                       TABLE PARTITION      DATA336                               512        987     317065

执行计划是Full Table SCAN.

SQL> oradebug setorapid 262
Oracle pid: 262, Unix process pid: 244, image: oracle@anbob1 (TNS V1-V3)
SQL> oradebug short_stack;
ksedsts()+592<-ksdxfstk()+64<-ksdxcb()+2000<-sspuser()+640<-<-kcbgtcr()+37169<-kcbgcurstdby()+160<-$cold_kcbgcur()+624<-ktuGetUsegHdr()+2160<-ktuGetExtTxnInfo()+2192<-ktugti()+96<-$cold_ktuxcm()+1072<-ktspgfb_adg()+3936<-$cold_ktspGetNextL1ForScan()+1312<-ktspGenExtentMap1()+672<-kteinmap0()+4880<-kteinmap1()+48<-kteinmap()+48<-kdselini()+128<-kdsirsf()+7200<-kdsirs_fco()+352<-qertbFetch()+7744<-qergiFetch()+1168<-opifch2()+7648<-opifch()+112<-opiodr()+2176<-ttcpip()+1760<-opitsk()+3936<-opiino()+1712<-opiodr()+2176<-opidrv()+1824<-sou2o()+288<-opimai_real()+608<-ssthrdmain()+864<-main()+336<-main_opd_entry()+80 SQL> oradebug short_stack;
ksedsts()+592<-ksdxfstk()+64<-ksdxcb()+2000<-sspuser()+640<-<-kteinscb()<-kcbgtcr()+30624<-ktspgfb_adg()+3392<-$cold_ktspGetNextL1ForScan()+1312<-ktspGenExtentMap1()+672<-kteinmap0()+4880<-kteinmap1()+48<-kteinmap()+48<-kdselini()+128<-kdsirsf()+7200<-kdsirs_fco()+352<-qertbFetch()+7744<-qergiFetch()+1168<-opifch2()+7648<-opifch()+112<-opiodr()+2176<-ttcpip()+1760<-opitsk()+3936<-opiino()+1712<-opiodr()+2176<-opidrv()+1824<-sou2o()+288<-opimai_real()+608<-ssthrdmain()+864<-main()+336<-main_opd_entry()+80 SQL>  oradebug short_stack;
ksedsts()+592<-ksdxfstk()+64<-ksdxcb()+2000<-sspuser()+640<-<-kcbgtcr()+33169<-ktspgfb_adg()+3392<-$cold_ktspGetNextL1ForScan()+1312<-ktspGenExtentMap1()+672<-kteinmap0()+4880<-kteinmap1()+48<-kteinmap()+48<-kdselini()+128<-kdsirsf()+7200<-kdsirs_fco()+352<-qertbFetch()+7744<-qergiFetch()+1168<-opifch2()+7648<-opifch()+112<-opiodr()+2176<-ttcpip()+1760<-opitsk()+3936<-opiino()+1712<-opiodr()+2176<-opidrv()+1824<-sou2o()+288<-opimai_real()+608<-ssthrdmain()+864<-main()+336<-main_opd_entry()+80 SQL>  oradebug short_stack;
ksedsts()+592<-ksdxfstk()+64<-ksdxcb()+2000<-sspuser()+640<-<-kcscu8()<-$cold_kcbgtcr()+60160<-ktspgfb_adg()+2272<-$cold_ktspGetNextL1ForScan()+1312<-ktspGenExtentMap1()+672<-kteinmap0()+4880<-kteinmap1()+48<-kteinmap()+48<-kdselini()+128<-kdsirsf()+7200<-kdsirs_fco()+352<-qertbFetch()+7744<-qergiFetch()+1168<-opifch2()+7648<-opifch()+112<-opiodr()+2176<-ttcpip()+1760<-opitsk()+3936<-opiino()+1712<-opiodr()+2176<-opidrv()+1824<-sou2o()+288<-opimai_real()+608<-ssthrdmain()+864<-main()+336<-main_opd_entry()+80 SQL>  oradebug short_stack;
ksedsts()+592<-ksdxfstk()+64<-ksdxcb()+2000<-sspuser()+640<-<-$cold_kcbgcur()+624<-ktuGetUsegHdr()+2160<-ktuGetExtTxnInfo()+2192<-ktugti()+96<-$cold_ktuxcm()+1072<-ktspgfb_adg()+3936<-$cold_ktspGetNextL1ForScan()+1312<-ktspGenExtentMap1()+672<-kteinmap0()+4880<-kteinmap1()+48<-kteinmap()+48<-kdselini()+128<-kdsirsf()+7200<-kdsirs_fco()+352<-qertbFetch()+7744<-qergiFetch()+1168<-opifch2()+7648<-opifch()+112<-opiodr()+2176<-ttcpip()+1760<-opitsk()+3936<-opiino()+1712<-opiodr()+2176<-opidrv()+1824<-sou2o()+288<-opimai_real()+608<-ssthrdmain()+864<-main()+336<-main_opd_entry()+80 SQL> 

Parsed short stack:
                                                    (?) 
kteinscb()                                         kernel transaction extent management information operations  [partial hit for: ktein ] 
kcbgtcr()+30624                                    kernel cache buffers get consistent read 
ktspgfb_adg()+3392<ktspGetNextL1ForScan()+1312     kernel transaction segment management segment pagetable  [partial hit for: ktsp ] 
ktspGenExtentMap1()+672                            kernel transaction segment management segment pagetable  [partial hit for: ktsp ] 
kteinmap0()+4880                                   kernel transaction extent management information operations return extent map  [partial hit for: kteinmap ] 
kteinmap1()+48                                     kernel transaction extent management information operations return extent map  [partial hit for: kteinmap ] 
kteinmap()+48                                      kernel transaction extent management information operations return extent map 
kdselini()+128                                     kernel data seek/scan extent list initialize 
kdsirsf()+7200                                     kernel data seek/scan  [partial hit for: kds ] 
kdsirs_fco()+352                                   kernel data seek/scan  [partial hit for: kds ] 
qertbFetch()+7744                                  query execute rowsource table access fetch 


 select region FROM "USR1"."TABLE_XX" partition(part_336);

    REGION
----------
       336
SQL> @x
Display execution plan for last statement for this session from library cache...

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  5yuc1gy6u09vf, child number 0
-------------------------------------
 select region FROM "USR1"."TABLE_XX" partition(part_336)

Plan hash value: 798915741

------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                  | E-Rows |E-Bytes| Cost (%CPU)| Pstart| Pstop |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                       |        |       |     1 (100)|       |       |
|   1 |  PARTITION RANGE SINGLE|                       |      1 |     4 |     1   (0)|    12 |    12 |
|   2 |   INDEX FULL SCAN      | IDX_UNI_TABLE_XX_OID |      1 |     4 |     1   (0)|    12 |    12 |
------------------------------------------------------------------------------------------------------

Note:
使用索引扫描可以很快出结果,但是使用表扫描一直处于spin,  表段很小4m, 但是查询每秒有366k的consistent gets ,从call stack看似乎一直在读取segmeng L1获取extent Map,在MOS中去查询比较匹配Bug 30325407

原因:

spin是由 L1BMB 块的意外状态引起的。要清除此状态,使用dbms_space.space_usage指定段分区即可,通常这个package也用于查看段碎片。

打赏

对不起,这篇文章暂时关闭评论。