Oracle Data Pump Best Practices: Speeding Up Exports and Imports
Exporting BasicFile LOBs can be quite a pain. If your maintenance window is too short, here is a way to make the export faster. When Data Pump starts to export a table with a BasicFile LOB, only one worker is assigned to it (due to the limited parallel access methods). If the table is large, that one worker will need a lot of time to process it.
- For full exports:
  – The role EXP_FULL_DATABASE is required
- For export consistency, use:
  – FLASHBACK_TIME=SYSTIMESTAMP
- To speed up Data Pump, always set:
  – PARALLEL=n
  – STREAMS_POOL_SIZE
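As an illustration, a full, consistent, parallel export using these parameters might look like the following (the directory name DMP, the dump-file names, and PARALLEL=4 are placeholder assumptions):
expdp system/password@DB_NAME FULL=Y \
  FLASHBACK_TIME=SYSTIMESTAMP \
  PARALLEL=4 \
  DIRECTORY=DMP DUMPFILE=full_%U.dmp LOGFILE=full_exp.log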
Parameters affecting Data Pump performance: STREAMS_POOL_SIZE
To avoid the “STREAMS AQ: ENQUEUE BLOCKED ON LOW MEMORY” error, set STREAMS_POOL_SIZE to the value returned by the following query.
select 'ALTER SYSTEM SET STREAMS_POOL_SIZE='||(max(to_number(trim(c.ksppstvl)))+67108864)||' SCOPE=SPFILE;'
  from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c
 where a.indx = b.indx
   and a.indx = c.indx
   and lower(a.ksppinm) in ('__streams_pool_size','streams_pool_size');
Data Pump uses Advanced Queuing, which relies on the streams pool in the SGA. Make sure STREAMS_POOL_SIZE is set high enough when you start multiple Data Pump sessions. In most situations, setting it to 2G should be more than enough:
alter system set streams_pool_size=2G scope=memory;
- PARALLEL_FORCE_LOCAL=TRUE
In a RAC environment, it can improve the speed of Data Pump API-based operations and avoid some Parallel DML bugs at the same time.
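A minimal sketch of enabling it for all instances (the scope and SID clause are a suggestion, not a requirement):
alter system set parallel_force_local=TRUE scope=both sid='*';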
- Direct import via database link
- Restrictions of database links apply.
- CLUSTER=N
In a RAC environment, it can improve the speed of Data Pump API-based operations.
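For illustration, a direct import over a database link that also keeps the job on the local instance could look like this (the database link SOURCE_LINK and the schema name are assumptions):
impdp system/password@TARGET_DB NETWORK_LINK=SOURCE_LINK \
  SCHEMAS=APP_SCHEMA CLUSTER=N PARALLEL=4 \
  DIRECTORY=DMP LOGFILE=net_imp.log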
- On partitioned tables
You try to Data Pump import (impdp) a partitioned table into an empty schema using PARALLEL > 1. Data Pump worker processes wait for an exclusive table lock, which is indicated by the wait event ‘enq: TM – contention’.
By default, Data Pump uses the APPEND hint to force direct loads of the data, since direct loads are faster than conventional loads. One of the reasons direct load is faster is that it locks the table against updates from other sessions. In our case, there are multiple import jobs trying to load data into the same table. When one job gets the lock to do a direct load into a table, all other import jobs are locked out from loading that table until the current job has finished its current load.
DataPump import (impdp) internally uses a statement (obtained with event 10046) similar to:
INSERT /*+ APPEND PARALLEL("TAB_LP",1)+*/ INTO RELATIONAL("<SCHEMA_NAME>"."<TABLE_NAME>" NOT XMLTYPE) ("<COL1>", "<COL2>", "<COL3>") SELECT "<COL1>", "<COL2>", "<COL3>" FROM "ET$0095000D0001" KU$
A way to force real partition-level load in parallel (INSERT /*+ APPEND PARALLEL("<TABLE_NAME>",4)+*/ INTO RELATIONAL("<SCHEMA_NAME>"."<TABLE_NAME>") PARTITION(XXX) … SELECT … FROM ET$) was introduced in 12.2.0.1 via enhancement request Bug 9590164: TABLE_EXISTS_ACTION=TRUNCATE|APPEND ON PARTITIONED TABLE => SERIAL IMPORT.
– This enhancement was implemented in 12.2.0.1, with the parameter DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS.
Note:
There is an enhancement in Data Pump in version 12.2.0.1 that creates a way to force real partition-level load in parallel.
(See the following note: DataPump Import (IMPDP) In Parallel Is Slow When Importing Partitioned Tables (Doc ID 1681444.1) )
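As a sketch, an import that uses this enhancement might look like the following; the partitioned table must already exist in the target, and the schema, table, and file names are placeholders:
impdp system/password@DB_NAME TABLES=APP_SCHEMA.PART_TAB \
  TABLE_EXISTS_ACTION=APPEND \
  DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS \
  PARALLEL=4 DIRECTORY=DMP DUMPFILE=part_tab_%U.dmp LOGFILE=part_tab_imp.log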
ACCESS_METHOD=INSERT_AS_SELECT may also be faster.
Export all of the data in one Data Pump job using PARALLEL. The Data Pump import will still load one partition at a time, but if the partitions are large enough, they will be loaded in parallel.
Use the DATA_OPTIONS=DISABLE_APPEND_HINT option on import. This causes the data to be loaded with conventional loads instead of direct loads, but it allows multiple import jobs to load the same table simultaneously.
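For example, a sketch of one of several concurrent imports with the APPEND hint disabled (schema, table, and file names are placeholders):
impdp system/password@DB_NAME TABLES=APP_SCHEMA.PART_TAB \
  DATA_OPTIONS=DISABLE_APPEND_HINT \
  TABLE_EXISTS_ACTION=APPEND \
  DIRECTORY=DMP DUMPFILE=part_tab_job1.dmp LOGFILE=part_tab_job1_imp.log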
- Speed up LOB tables
The old BasicFile LOBs do not offer any parallel access methods. The new, improved SecureFile LOBs offer superior functionality, including full parallel access to the LOB. This is a good reason to migrate any old BasicFile LOBs in your database.
From 12c onward, SecureFile LOBs can be imported in parallel, but BasicFile LOBs cannot.
The following applies to BasicFile LOBs:
select owner, table_name, column_name
from dba_lobs
where securefile='NO';
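If you migrate such a column during a maintenance window, a minimal sketch looks like this (the table and column names are assumptions; the MOVE rebuilds the table segment, so indexes must be rebuilt afterwards):
alter table app_owner.lob_test move lob(payload) store as securefile;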
Parallel export based on ROWID/key buckets
| Job | Predicate |
| --- | --- |
| Data Pump job 1 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0 |
| Data Pump job 2 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1 |
| Data Pump job 3 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2 |
| Data Pump job 4 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3 |
Instead of one Data Pump job with one worker processing the table with the LOB, the idea is to start multiple Data Pump exports. Each export has one worker working on a dedicated part of the table.
#!/bin/bash
# Start 10 concurrent exports; each job exports one ROWID bucket of the table.
chunk=10
for ((i=0;i<=9;i++));
do
  expdp USERNAME/Password@DB_NAME TABLES=LOB_TEST QUERY=LOB_TEST:\"where mod\(dbms_rowid.rowid_block_number\(rowid\)\, ${chunk}\) = ${i}\" directory=DMP dumpfile=lob_test_${i}.dmp logfile=log_test_${i}.log &
  echo $i
done
#!/bin/bash
# Start 10 concurrent imports, one per dump file; DISABLE_APPEND_HINT lets all jobs load the same table simultaneously.
chunk=10
for ((i=0;i<=9;i++));
do
  impdp USERNAME/Password@DB_NAME directory=DMP REMAP_TABLE=LOB_TEST:LOB_TEST remap_schema=source:target dumpfile=lob_test_${i}.dmp logfile=TABLE_imp_log_test_${i}.log DATA_OPTIONS=DISABLE_APPEND_HINT CONTENT=DATA_ONLY &
done
- Change the LOB to a SecureFile LOB
When importing, convert the LOB to a SecureFile LOB, which allows parallel access, and then run one Data Pump job at a time. You should always convert the LOB to a SecureFile LOB during import. The conversion happens on the fly and allows for parallel import of the data, even if the LOB originally was a BasicFile LOB.
The first job does a regular schema import of the first dump file:
impdp ... \
dumpfile=exp_lob_0.dmp \
logfile=imp_lob_0.log \
transform=lob_storage:securefile \
parallel=4
Now, proceed with the other dump files in serial. Since you already converted the LOB to a SecureFile (previous step), you can now perform a Data Pump import using native parallelism:
impdp ... \
dumpfile=exp_lob_1.dmp \
logfile=imp_lob_1.log \
parallel=4 \
table_exists_action=append
impdp ... \
dumpfile=exp_lob_2.dmp \
logfile=imp_lob_2.log \
parallel=4 \
table_exists_action=append
...
Run the jobs in serial. Each job will use Data Pump parallelism.
The use of DEDUPLICATED LOB columns and Data Pump’s PARALLEL parameter with a value greater than 1
Currently, as of 12.2, the deduplication feature of LOB columns has a limitation on parallel access to LOB contents. If the Data Pump PARALLEL parameter is set to a value greater than 1 while the Data Pump API processes tables that use this feature, there is a high risk of a deadlock or a hang.
This restriction is tracked in the unpublished documentation bug 26321833, “DOCUMENT DEDUPLICATED-LOB RESTRICTIONS”.
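To check up front whether any tables might be affected, a sketch of a query against DBA_LOBS (assuming the DEDUPLICATION column reports 'LOB' for deduplicated SecureFile columns):
select owner, table_name, column_name
from dba_lobs
where securefile='YES'
and deduplication='LOB';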
NETWORK_LINK + SECUREFILES
When the NETWORK_LINK parameter is used to move tables with LOB columns stored as SecureFiles, the transfer of the LOB data is extremely slow compared to exporting the LOB-containing tables to a file. It has also been observed that using NETWORK_LINK to move such tables generates a lot of undo data.
Part of the cause of the speed difference, as of 11.2.0.3, is that in distributed transactions allocation requests are limited to one block at a time over a database link, which means more round trips to transfer larger data sets.
SECUREFILES (Without NETWORK_LINK)
The storage of LOB column data using the SecureFiles storage format does not allow tables with LOB columns to be exported and imported using parallelism for RDBMS versions below 12.2.
The storage of LOB column data using the BasicFiles storage format does not allow tables with LOB columns to be exported or imported using parallelism.
References
https://dohdatabase.com/2023/10/10/data-pump-and-faster-export-of-basicfile-lobs/