首页 » ORACLE 9i-23ai » Troubleshooting 19c ORA-31626: job does not exist, ORA-955 QT_*BUFFER

Troubleshooting 19c ORA-31626: job does not exist, ORA-955 QT_*BUFFER

一客户oracle 19c(19.9) RAC环境在尝试expdp导出时提示ora-31626然后中断,然后找到我,之前已做过一些尝试包括重建了数据库datapump组件(dpload.sql),  确实问题比较隐蔽, 下面简单的记录该问题。

$ expdp '/ as sysdba' directory=xxx schemas=xxx dumpfile =xxx  logfile=xx job_name=exp1

ORA-31626: job does not exist

Note:
log 中并没有其他报错信息。

alert log

2022-07-14T15:30:59.439134+08:00
Thread 1 advanced to log sequence 34554 (LGWR switch),  current SCN: 843425459
  Current log# 3 seq# 34554 mem# 0: /orcldata/anbob/onlinelog/o1_mf_3_j7ldctg2_.log
2022-07-14T15:31:00.954791+08:00
ARC1 (PID:2311): Archived Log entry 62975 added for T-1.S-34553 ID 0x5c4a9838 LAD:1
2022-07-14T15:31:14.647495+08:00
TT00 (PID:2305): Error 1034 received logging on to the standby
2022-07-14T15:31:42.598301+08:00
Errors in file /u01/app/oracle/diag/rdbms/anbob/orcl/trace/orcl_ora_27067.trc:
ORA-31626: job does not exist
2022-07-14T15:31:45.202992+08:00
Errors in file /u01/app/oracle/diag/rdbms/anbob/orcl/trace/orcl_ora_27067.trc:
ORA-31626: job does not exist
2022-07-14T15:32:15.626814+08:00
Errors in file /u01/app/oracle/diag/rdbms/anbob/orcl/trace/orcl_ora_27067.trc:
ORA-31626: job does not exist
ORA-39086: cannot retrieve job information
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
2022-07-14T15:34:30.167869+08:00
Errors in file /u01/app/oracle/diag/rdbms/anbob/orcl/trace/orcl_ora_31459.trc:
ORA-00955: name is already used by an existing object
2022-07-14T15:34:32.991988+08:00
DM00 started with pid=133, OS id=31464, job SYS.EXP1

当时的ASH信息

SQL> @ase
                                                                                  last_call
USERNAME          SID EVENT                MACHINE    MODULE               STATUS         et       SEQ# SQL_ID          WAI_SECINW ROW_WAIT_OBJ# SQLTEXT                        BS          CH# OSUSER      HEX
---------- ---------- -------------------- ---------- -------------------- ------- --------- ---------- --------------- ---------- ------------- ------------------------------ ---------- ---- ---------- ----------

XXX              3410 On CPU / runqueue    XXX.XXaa JDBC Thin Client     ACTIVE          0      34301 b1anbkr4qwfjy   -1:0                  -1 SELECT * FROM tabx WHERE  :             0 root          10252dd
                                          

SYS              1426 wait for unread mess XXX-db     ude                  ACTIVE         75      30124 afcz9s4uazbpk   0:5                   -1 BEGIN :1 := sys.kupc$que_int.t :             0 oracle        10000e3
SYS              1995 Streams AQ: enqueue  XXX-db     Data Pump Master     ACTIVE        915       1003 8rgw5q94paqsg   0:15              569023 BEGIN sys.kupc$que_int.send(:1 :             0 oracle        1000011

SQL> select event event1 from v$session where sid=1426;

EVENT1
----------------------------------------------------------------
Streams AQ: enqueue blocked on low memory

此问题可能发生在AMM, ASMM 环境中(设置了 sga_target 或 memory_target),当buffer cache中有负载并且stream pool 内存正在移动到buffer cache

SQL> show parameter strea

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
streams_pool_size                    big integer 1G
SQL> show parameter target

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
archive_lag_target                   integer     0
db_big_table_cache_percent_target    string      0
db_flashback_retention_target        integer     1440
fast_start_io_target                 integer     0
fast_start_mttr_target               integer     0
memory_max_target                    big integer 0
memory_target                        big integer 0
parallel_servers_target              integer     128
pga_aggregate_target                 big integer 300M
sga_target                           big integer 28G
target_pdbs                          integer     0
SQL> alter system set streams_pool_size=2048m;

System altered.

SQL> select shrink_phase_knlasg from X$KNLASG;
SHRINK_PHASE_KNLASG
-------------------
1

如果遇到类似的减速,请检查以下查询是否不断返回“1”。该值表示流池处于收缩阶段。当流池完成收缩时,该值应该返回“0”,但如果它一直返回“1”,那么您可能遇到了这个问题, 这个问题不是报错的原因,继续分析报错信息。

启用datapump trace

alter system set events 'sql_trace {process: pname = dw | process: pname = dm} level=12';

配置expdp ... TRACE=480300 

启用 errorstack

SQL> oradebug eventdump system
955 trace name errorstack level 4
31626 trace name errorstack forever,level 3

Trace file

WARNING:1 Oracle process running out of OS kernelI/O resources request_aiolimit=128 set_aiolimit=0    <<<<<<<<<<<<<<<
Trace file /u01/app/oracle/diag/rdbms/anbob/orcl/trace/orcl_ora_28437.trc
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.9.0.0.0
Build label:    RDBMS_19.9.0.0.0DBRU_LINUX.X64_200930
ORACLE_HOME:    /u01/app/oracle/product/19.0.0/db

..

----- Error Stack Dump -----
 at 0x7ffe6f931580 placed dbkda.c@296
ORA-00955: name is already used by an existing object
 at 0x7ffe6f935018 placed cvw.c@1608
----- Current SQL Statement for this session (sql_id=6wksvt0zdd12f) -----
CREATE VIEW SYS.qt566658_BUFFER  AS SELECT  ADDR, INDX, INST_ID, OBJNO, QUEUE_ID, MSGID, CORRID,  SEQUENCE_NUM, MSG_NUM, STATE, PRIORITY, EXPIRATION,  ENQ_TIME, ENQ_UID, ENQ_USER_NAME, RETRY_COUNT, SENDER_NAME,  SENDER_ADDRESS, SENDER_PROTOCOL, DEQUEUE_MSGID,  SRCSEQUENCE_NUM, SUBSCRIBER_ID, EXCEPTIONQ_SCHEMA,  EXCEPTIONQ_NAME  FROM X$BUFFER2 x WHERE objno = 566658
----- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0xceec76d8       302  package body SYS.DBMS_AQADM_SYSCALLS.KWQA_3GL_EXECUTESTMT
0x126586b40       495  package body SYS.DBMS_AQADM_INV.EXECUTE_STMT2
0x15233d0d8      3292  package body SYS.DBMS_AQADM_SYS.CREATE_BUFFER_VIEW_COMMON
0x15233d0d8      3348  package body SYS.DBMS_AQADM_SYS.CREATE_BUFFER_VIEW
0x15233d0d8      8301  package body SYS.DBMS_AQADM_SYS.ADD_BUFFER_TABLES_INT
0x15233d0d8      8222  package body SYS.DBMS_AQADM_SYS.ADD_BUFFER_INT
0x15233d0d8      8071  package body SYS.DBMS_AQADM_SYS.ADD_BUFFER
0x11a2721e8      1546  package body SYS.KUPC$QUE_INT.CREATE_QUEUES
0x105ae3458         1  anonymous block
0x135fcd338       158  package body SYS.KUPC$QUEUE_INT.CREATE_QUEUES
0x17e845018      1002  package body SYS.KUPV$FT_INT.CREATE_NEW_JOB
0x145507b80      1646  package body SYS.KUPV$FT.CREATE_ATTACH_JOB
0x145507b80      1103  package body SYS.KUPV$FT.OPEN_JOB
0x15bc169e0      6809  package body SYS.DBMS_DATAPUMP.OPEN
..

NOTE:
从trace中又可以发现一个小问题WARNING:1 Oracle process running out of OS kernelI/O resources request_aiolimit=128 set_aiolimit=0

对于生产数据库系统,Oracle 建议您调整这些值以优化系统性能。

sysctl -w fs.aio_max_nr=4194304

And the maximum of aio-max-nr parameter can be calculated by aio-max-nr =no of process per DB * no of databases * 4096
you should still monitor the /proc/sys/fs/aio-nr value and also monitor I/O loading.

Points to know –
1) Number of processes per database is not background server process number, it’s the number of background server process + user process.
2) Parameter aio-max-nr is a limitation of aio-nr, no impact was reported if you set it too high. But we should still monitor /proc/sys/fs/aio-nr value and tune this value.
当然这也不是问题的关键, 另一个问题是创建CREATE VIEW SYS.qtxxxxx_BUFFER提示已存在。

QT_*BUFFER Views
它是高级排队功能的架构设计的一部分,在导出时会创建queue table,然后根据table的object_id 创建 QTNNNN_buffer.
For example:
1) Schema user SCOTT creates a queue_table called QT_1
2) QT_1 object_id=12345
3) An object of type VIEW gets created under schema SYS with name QT12345_BUFFER.
If the queue_table is dropped, so are underlying objects. And SYS.QTxxxxx_BUFFER is one of the underlying objects.

 1* select owner,object_type,object_name,created from dba_objects where object_name like 'QT%BUFF%'

OWNER                OBJECT_TYPE                    OBJECT_NAME                    CREATED
-------------------- ------------------------------ ------------------------------ -----------------
...
SYS                  VIEW                           QT21023_BUFFER                 20190417 01:14:59
SYS                  VIEW                           QT22754_BUFFER                 20190417 01:40:16
SYS                  VIEW                           QT566658_BUFFER                20220712 21:52:55
12 rows selected.
SQL> select owner,object_name,object_type from dba_objects where object_id=566658;
OWNER                OBJECT_NAME                    OBJECT_TYPE
-------------------- ------------------------------ ------------------------------
SYS                  KUPC$DATAPUMP_QUETAB_1         TABLE


  1* select * from dba_queues where queue_table='KUPC$DATAPUMP_QUETAB_1'
OWNER                NAME                           QUEUE_TABLE                           QID QUEUE_TYPE           MAX_RETRIES RETRY_DELAY ENQUEUE DEQUEUE RETENTION                        USER_COMMENT
-------------------- ------------------------------ ------------------------------ ---------- -------------------- ----------- ----------- ------- ------- ---------------------------------------- --------------------------------------------------
NETWORK_NAME
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SHARD
-----
SYS                  AQ$_KUPC$DATAPUMP_QUETAB_1_E   KUPC$DATAPUMP_QUETAB_1             566702 EXCEPTION_QUEUE                0           0   NO      NO    0                                exception queue

FALSE

SQL> exec dbms_aqadm.stop_queue('AQ$_KUPC$DATAPUMP_QUETAB_1_E');
PL/SQL procedure successfully completed.

SQL>  exec dbms_aqadm.drop_queue('AQ$_KUPC$DATAPUMP_QUETAB_1_E');
PL/SQL procedure successfully completed.

SQL> select owner,object_name,object_type from dba_objects where object_id=566658;
OWNER                OBJECT_NAME                    OBJECT_TYPE
-------------------- ------------------------------ ------------------------------
SYS                  KUPC$DATAPUMP_QUETAB_1         TABLE

SQL> exec dbms_aqadm.drop_queue_table('KUPC$DATAPUMP_QUETAB_1');
PL/SQL procedure successfully completed.

--确认qt*buufer 已删除。

NOte:
提示报错的对象存在已删除,重新expdp发现还是报相同的错,更换table mode或导出用户,问题依旧。

根据报错的包调用,查看逻辑, 需要unwrap package body

 PACKAGE BODY dbms_aqadm_inv  AS
    18   OBJECT_ALREADY_EXISTS  EXCEPTION;
   19   PRAGMA                 EXCEPTION_INIT(OBJECT_ALREADY_EXISTS, -955);
   20 
   21   INDEX_DOES_NOT_EXIST EXCEPTION;
   22   PRAGMA EXCEPTION_INIT(INDEX_DOES_NOT_EXIST, -1418);
   23 
   24   NAME_IN_USE EXCEPTION;
   25   PRAGMA EXCEPTION_INIT(NAME_IN_USE, -955);
   
PROCEDURE EXECUTE_STMT2(STMT                   IN  VARCHAR2,  
  418                           IGNORE_DDL_ERROR       IN  BOOLEAN DEFAULT FALSE,
  419                           SYSTEM_CURSOR          IN  BOOLEAN DEFAULT FALSE,
  420                           IGNORE_COMPERR         IN  BOOLEAN DEFAULT FALSE) IS
  421 
  422   STREAMS_CAPTURE_OFF BOOLEAN;
  424   BEGIN
  425     DBMS_AQADM_SYS.WRITE_TRACE(MESSAGE => 'execute_stmt2: '|| STMT);
  426     STREAMS_CAPTURE_OFF := DBMS_AQADM_SYSCALLS.KWQA_3GL_ISSTREAMSCAPTUREOFF;
  427     IF (STREAMS_CAPTURE_OFF = FALSE) THEN
  428       DBMS_AQADM_SYSCALLS.KWQA_3GL_SETSTREAMSCAPTUREOFF(TRUE);
  429     END IF;
  430     
  432     IF SYSTEM_CURSOR THEN
  433       DBMS_AQADM_SYSCALLS.KWQA_3GL_EXECUTESTMT(STMT);  <<<<< 1
  434     ELSE
  435       EXECUTE IMMEDIATE STMT;
  436     END IF;
  437     IF (STREAMS_CAPTURE_OFF = FALSE) THEN
  438       DBMS_AQADM_SYSCALLS.KWQA_3GL_SETSTREAMSCAPTUREOFF(FALSE);
  439     END IF;
  440   EXCEPTION
  441     WHEN TABLE_DOES_NOT_EXIST THEN
  442       IF (STREAMS_CAPTURE_OFF = FALSE) THEN
  443         DBMS_AQADM_SYSCALLS.KWQA_3GL_SETSTREAMSCAPTUREOFF(FALSE);
  444       END IF;
  445       IF NOT IGNORE_DDL_ERROR THEN
  446         RAISE;
  447       END IF;
  448     WHEN INDEX_DOES_NOT_EXIST THEN
  449       IF (STREAMS_CAPTURE_OFF = FALSE) THEN
  450         DBMS_AQADM_SYSCALLS.KWQA_3GL_SETSTREAMSCAPTUREOFF(FALSE);
  451       END IF;
  
         WHEN NAME_IN_USE THEN
  456       IF (STREAMS_CAPTURE_OFF = FALSE) THEN
  457         DBMS_AQADM_SYSCALLS.KWQA_3GL_SETSTREAMSCAPTUREOFF(FALSE);
  458       END IF;
  459       IF NOT IGNORE_DDL_ERROR THEN
  460         RAISE;
  461       END IF;

NOTE:
只附了部分,但是package中没有找到创建两次qt*buffer view的逻辑.

启用10046 trace, 确认是创建了2次qt*view , 看来是调用了2次。19.9的环境从没来遇到过这类问题。MOS中有个类似的BUG 28485837. 但并没有19.9的RU 的one off path.

RU

[oracle@XXX-db OPatch]$ ./opatch lspatches
31668882;OJVM RELEASE UPDATE: 19.9.0.0.201020 (31668882)
31771877;Database Release Update : 19.9.0.0.201020 (31771877)
29585399;OCW RELEASE UPDATE 19.3.0.0.0 (29585399)

OPatch succeeded.

注意, DB HOME中的ocw未安装19.9 补丁,ocw的问题之前遇到过几次,建议安装ocw 19.9,发现安装ocw后问题解决。

打赏

, ,

对不起,这篇文章暂时关闭评论。