Troubleshooting Oracle 12c 导出失败expdp ORA-31626 ORA-31637 ORA-12805, exp EXP-00064
最近在遇到一套Oracle 12c R2 RAC多租户+ADG的环境中,遇到了一些导出数据的问题。导出使用的是expdp命令,但是遇到了ORA-31626、ORA-31637和ORA-12805的错误。尝试改用exp命令时,也出现了一些奇怪的现象,部分表报EXP-00064错误,尽管这些表并不是嵌套表,并且使用CTAS创建一个复制表时可以正常导出,但使用expdp命令导出时仍然报错。这些表的列类型只是number、char和blob。经过反复诊断测试找到问题并还原,简单记录。
EXPDP错误
DataPump export (expdp) 或如下错误:
ORA-31637: cannot create job SYS_EXPORT_SCHEMA for user SCOTT
ORA-06512: at “SYS.DBMS_SYS_ERROR”, line 95
ORA-06512: at “SYS.KUPV$FT_INT”, line 600
ORA-12805: parallel query server died unexpectedly
MOS中有记录是一个已知原因是 DataPump 安装存在问题。 要么是一开始没安装好,要么是后来出了什么问题。解决方法建议重新reload datapump.方法如下:
On Multitenant Environment, for Oracle versions 12.1, 12.2, 18c, 19c, 21c, and 23c:
-- 1. To rebuild the DataPump packages with the following steps. Under the ORACLE_HOME, execute: cd rdbms/admin-- run the dpload.sql in the CDB with all of the PDBs open, as described in "How to execute sql scripts in Multitenant environment (catcon.pl)" NOTE 1932340.1Syntax: $ORACLE_HOME/perl/bin/perl catcon.pl -d /u01/oracle -l <logging directory> -b <prefix for logfile of dpload for each PDB> dpload.sqlWhere: -l - directory to use for spool log files; for example, /tmp -b - base name for log and spool file names; for example, dp_rebuild_log_base_nameFor example, at OS promptcd $ORACLE_HOME/rdbms/admin $ORACLE_HOME/perl/bin/perl catcon.pl -d /u01/oracle -l /tmp -b dp_rebuild_log_base_name dpload.sql
NOTE:
After dpload.sql is run via caton.pl, to recompile any invalid objects, utlrp.sql can be run in all containers vi catcon.pl as well, e.g.
host perl $ORACLE_HOME/rdbms/admin/catcon.pl -n 1 -l /path_for_logs -b name_for_logs $ORACLE_HOME/rdbms/admin/utlrp.sql;
This command can be run several times to compile objects which did not get processed during the first run.
Also, if there are Data Pump components that are still invalid in any one container, utlrp.sql can be run individually in that container, e.g.
connect / as sysdba alter session set container = <PDB_NAME> @utlrp.sql
在测试环境dpload正常,虽然不需要重启实例,但还是担心对生产影响,因生产无维护窗口,暂时先不尝试reload,改用exp。(虽然exp已不建议使用)
EXP错误
改用exp导出格式如 exp xxx@xx file=xxx tables=xxx.yy buffers=1000000
但是CTAS一个复制表,就可以正常导出,同时库里有其它表试出不可以导出。
EXP sql trace跟踪
因为部分表失败,对导出失败的表做10046 sql trace跟踪,进入exp交互模式启用10046, 或使用login triiger.或使用event++ 语法:
alter system set events 'sql_trace {process:pname=cjq|pname=j0}'; -- 11g syntax alter system set events 'sql_trace {process:pname=cjq|process:pname=j0}'; -- 12c syntax alter system set events '10046{process: pname=exp} /* target exp */
这里我们使用交互的方式过去exp进程启用sql trace,再exp导出。
1. 启动exp交互:
# exp username/password
2. 去SQL*Plus查看exp的进程信息
SELECT p.spid, p.addr, p.pid, s.sid, SUBSTR (s.username, 1, 15) "USERNAME", SUBSTR (s.program, 1, 15) "PROGRAM" FROM v$process p,v$session s WHERE s.paddr=p.addr AND addr = (SELECT paddr FROM v$session WHERE UPPER(program) LIKE 'EXP%');
3. 在SQL*Plus启用10046 events 跟踪
SQL> oradebug setospid xxx (xxx = spid from above query) SQL> oradebug unlimit SQL> oradebug tracefile_name SQL> oradebug Event 10046 trace name context forever, level 12
4. exp交互输入,导致Table 模式,输入表名,交互,直到报错
exp trace file
===================== PARSING IN CURSOR #140036662172504 len=643 dep=0 uid=9 oct=3 lid=9 tim=133633457005340 hv=4183964855 ad='4c2b909488' sqlid='7vh54fmwq4g5r' SELECT OBJID, NAME, OWNER, OWNERID, TABLESPACE, TSNO, FILENO, BLOCKNO, AUDIT$, COMMENT$, CLUSTERFLAG, PCTFREE$, PCTUSED$, INITRANS, MAXTRANS, DEGREE, INSTANCES, CACHE, PROPERTY, DEFLOG, TSDEFLOG, ROID, ROWCNT, BLKCNT, AVGRLEN, TFLAGS, TRIGFLAG, OBJSTATUS, XDBOOL, DOBJID FROM SYS.EXU10TABS WHERE NAME = :1 AND SECONDARYOBJ = :2 AND OWNERID = :3 AND NOT EXISTS ( SELECT NAME FROM SYS.EXU8NXPN$ WHERE N$.OWNERID = :3 AND N$.NAME = :1 AND N$.TYPE = 2) END OF STMT BINDS #140036662172504: Bind#0 oacdty=01 mxl=128(38) mxlc=00 mal=00 scl=00 pre=00 oacflg=00 fl2=1000000 frm=01 csi=852 siz=176 off=0 kxsbbbfp=7f5cd38108d8 bln=128 avl=19 flg=05 value="Nxxxxxx_BLOB" Bind#1 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=00 fl2=1000000 frm=00 csi=00 siz=0 off=128 kxsbbbfp=7f5cd3810958 bln=22 avl=01 flg=01 value=0 Bind#2 oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00 oacflg=00 fl2=1000000 frm=00 csi=00 siz=0 off=152 kxsbbbfp=7f5cd3810970 bln=22 avl=03 flg=01 value=110 Bind#3 No oacdef for this bind. Bind#4 No oacdef for this bind.
— catexp.sql 中两个view, 用于exp
SYS.EXU10TABS — 导出表
SYS.EXU8NXP –排除表
对比了不能导出的表和能正常导出表以上查询记录,仅PROPERTY关键列不同,查询EXU10TABS的定义,来自TAB$。
PROPERTY 值
-- 不正常表 SQL> @dec 1180591620717948438528 DEC HEX ----------------------------------- -------------------- 1180591620717948438528.000000 400000000020040800 -- 正常表 SQL> @dec 537135104 DEC HEX ----------------------------------- -------------------- 537135104.000000 20040800 -- dcore.bsq property number not null, /* table properties (bit flags): */ /* 0x01 = typed table, 0x02 = has ADT columns, */ /* 0x04 = has nested-TABLE columns, 0x08 = has REF columns, */ /* 0x10 = has array columns, 0x20 = partitioned table, */ /* 0x40 = index-only table (IOT), 0x80 = IOT w/ row OVerflow, */ /* 0x100 = IOT w/ row CLustering, 0x200 = IOT OVeRflow segment, */ /* 0x400 = clustered table, 0x800 = has internal LOB columns, */ /* 0x1000 = has primary key-based OID$ column, 0x2000 = nested table */ /* 0x4000 = View is Read Only, 0x8000 = has FILE columns */ /* 0x10000 = obj view's OID is system-gen, 0x20000 = used as AQ table */ /* 0x40000 = has user-defined lob columns */ /* 0x00080000 = table contains unused columns */ /* 0x100000 = has an on-commit materialized view */ /* 0x200000 = has system-generated column names */ /* 0x00400000 = global temporary table */ /* 0x00800000 = session-specific temporary table */ /* 0x08000000 = table is a sub table */ /* 0x20000000 = pdml itl invariant */ /* 0x80000000 = table is external */ /* PFLAGS2: 0x400000000 = delayed segment creation */ /* PFLAGS2: 0x20000000000 = result cache mode FORCE enabled on this table */ /* PFLAGS2: 0x40000000000 = result cache mode MANUAL enabled on this table */ /* PFLAGS2: 0x80000000000 = result cache mode AUTO enabled on this table */ /* PFLAGS2: 0x400000000000000 = has identity column */
0x400000000000000000没有找到说明,不过这个文件应该是很久没有更新了,不确认是更新错误还是某个新特性,不过可以确认dcore.bsq的值列的不全,比如我们创建blockchain table, 在19c(19.19+ )/23c 可以测试。
SQL> CREATE BLOCKCHAIN TABLE bank_ledger (bank VARCHAR2(128), deposit_date DATE, deposit_amount NUMBER) 2 NO DROP UNTIL 31 DAYS IDLE 3 NO DELETE LOCKED 4 HASHING USING "SHA2_512" VERSION "v1"; Table created. SQL> select object_id from dba_objects where object_name = 'BANK_LEDGER'; OBJECT_ID ---------- 231182 SQL> SELECT OBJ#, TO_CHAR(PROPERTY) FROM TAB$ WHERE OBJ# = 231182; OBJ# TO_CHAR(PROPERTY) ---------- -------------------------------------------------------------------------------- 231182 664613997892457936451903530677043200 SQL> select to_char(664613997892457936451903530677043200, 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX') FROM DUAL; TO_CHAR(664613997892457936451903530677043200,'XXXXXXXXXXXXXXXXXXXXXX -------------------------------------------------------------------- 800000000000000000000020000000
0x800000000000000000000000000000这个值也没有在文件中说明,在测试环境创建一个表,PROPERTY 增加0x400000000000000000尝试.
-- 测试 SQL> @o t209 owner object_name subname object_type status OID D_OID CREATED LAST_DDL_TIME ------------------------- ------------------------------ ------------------------------ -------------------- --------- ---------- ---------- ------------------- ------------------- ANBOB T209 TABLE VALID 80562 80562 2024-01-19 13:56:15 2024-01-19 14:01:05 SQL> select property,obj#,dataobj# from sys.tab$ where obj#=80562; PROPERTY OBJ# DATAOBJ# --------------- --------------- --------------- 17717004288 80562 80562 SQL> @dec 17717004288 DEC HEX ----------------------------------- -------------------- 17717004288.000000 420040800 SQL> @hex 400000000420040800 DEC HEX ----------------------------------- -------------------- 1180591620735128307712.000000 400000000420040800 SQL> update sys.tab$ set property=1180591620735128307712 where obj#=80562; 1 row updated. SQL> commit; Commit complete. [oracle@oel7db1 admin]$ exp anbob/oracle@cdb1pdb1 tables=t209 file=test209.dmp Export: Release 19.0.0.0.0 - Production on Fri Jan 19 14:09:11 2024 Version 19.3.0.0.0 Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved. Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.3.0.0.0 Export done in US7ASCII character set and AL16UTF16 NCHAR character set server uses AL32UTF8 character set (possible charset conversion) About to export specified tables via Conventional Path ... EXP-00064: T209 is an inner nested table and cannot be exported. Export terminated successfully with warnings. SQL> update sys.tab$ set property=17717004288 where obj#=80562; 1 row updated. SQL> commit; Commit complete. [oracle@oel7db1 admin]$ exp anbob/oracle@cdb1pdb1 tables=t209 file=test209.dmp Export: Release 19.0.0.0.0 - Production on Fri Jan 19 14:12:09 2024 Version 19.3.0.0.0 Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved. Connected to: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.3.0.0.0 Export done in US7ASCII character set and AL16UTF16 NCHAR character set server uses AL32UTF8 character set (possible charset conversion) About to export specified tables via Conventional Path ... . . exporting table T209 0 rows exported Export terminated successfully without warnings. [oracle@oel7db1 admin]$
Note:
还原了该问题EXP-00064 。手动update tab$恢复原值,exp正常。 注 不要在对未知操作更新dict. 我们前期已跑过hcheck.sql 无dict 逻辑错误,但对于值的错误是hcheck无法检查的。注意上面的操作我并未flush shared pool,对于exp是实时生效,所以也可以尝试在exp后update会原值。
alter table move online会不会修正该值?
SQL> alter table anbob.t209 move online; Table altered. SQL> @o t209 no rows selected SQL> @o anbob.t209 owner object_name subname object_type status OID D_OID CREATED LAST_DDL_TIME ------------------------- ------------------------------ ------------------------------ -------------------- --------- --------------- --------------- ------------------- ------------------- ANBOB T209 TABLE VALID 80562 80565 2024-01-19 13:56:15 2024-01-19 14:19:21 SQL> select property,obj#,dataobj# from sys.tab$ where obj#=80562 2 ; PROPERTY OBJ# DATAOBJ# --------------- --------------- --------------- 17717004288 80562 80565 SQL> update sys.tab$ set property=1180591620735128307712 where obj#=80562; 1 row updated. SQL> commit; Commit complete. SQL> alter table anbob.t209 move online; Table altered. SQL> select property,obj#,dataobj# from sys.tab$ where obj#=80562; PROPERTY OBJ# DATAOBJ# --------------- --------------- --------------- 17717004288 80562 80566 SQL> update sys.tab$ set property=1180591620735128307712 where obj#=80562; 1 row updated. SQL> commit; Commit complete. SQL> alter system flush shared_pool; System altered. SQL> select property,obj#,dataobj# from sys.tab$ where obj#=80562; PROPERTY OBJ# DATAOBJ# --------------- --------------- --------------- 1.180591621E+21 80562 80566 SQL> alter table anbob.t209 move online; Table altered. SQL> select property,obj#,dataobj# from sys.tab$ where obj#=80562; PROPERTY OBJ# DATAOBJ# --------------- --------------- --------------- 1.180591621E+21 80562 80567
Note:
alter table move online并不会修正PROPERTY值,而是复用内存中的值,而不是update后未flush shared pool的值。
如果update后有没有办法不flush shared pool而让内存中的对象生效呢? 恐怕要做一个rename或修改表结构的DDL, 这里我们可以考虑增删一个隐藏列。
SQL> alter table anbob.t209 add id3 int invisible default 0 not null ; SQL> alter table anbob.t209 drop column id3;
Summary:
12c的数据库因为之前安装的datapump问题,导致expdp无法使用,需要datapump reload,但无法确认其影响,改用exp时发现部分表导出错误的提示exp-64诊断是嵌套表,但实际并不是,从standby database导出一样的报错,后分析为tab$.property值错误(不确认原因), 经测试导出报错的表均是该问题,手动修改后可以解决。
— over —
对不起,这篇文章暂时关闭评论。