Oracle Data Pump Best Practices: Speeding Up Exports and Imports
Exporting BasicFile LOBs can be quite a pain. If your maintenance window is too short, here is a way to make the export faster. When Data Pump starts to export a table with a BasicFile LOB, only one worker is assigned to it (due to the limited parallel access methods). If the table is large, that one worker will need a lot of time to process it.
- For full exports:
  – The role EXP_FULL_DATABASE is required
- For export consistency, use:
  – FLASHBACK_TIME=SYSTIMESTAMP
- To speed up Data Pump, always set:
  – PARALLEL=n
  – STREAMS_POOL_SIZE
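As an illustration, a full, consistent, parallel export using these parameters might look like the following (the directory name DMP, the dump-file names, and PARALLEL=4 are placeholder assumptions):
expdp system/password@DB_NAME FULL=Y \
  FLASHBACK_TIME=SYSTIMESTAMP \
  PARALLEL=4 \
  DIRECTORY=DMP DUMPFILE=full_%U.dmp LOGFILE=full_exp.log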
Parameters affecting Data Pump performance: STREAMS_POOL_SIZE
To avoid the “STREAMS AQ: ENQUEUE BLOCKED ON LOW MEMORY” error, set STREAMS_POOL_SIZE to the value returned by the following query.
select 'ALTER SYSTEM SET STREAMS_POOL_SIZE='||(max(to_number(trim(c.ksppstvl)))+67108864)||' SCOPE=SPFILE;'
  from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c
 where a.indx = b.indx
   and a.indx = c.indx
   and lower(a.ksppinm) in ('__streams_pool_size','streams_pool_size');
Data Pump uses Advanced Queuing, which relies on the streams pool in the SGA. Make sure STREAMS_POOL_SIZE is set high enough when you start multiple Data Pump sessions. In most situations, setting it to 2G should be more than enough:
alter system set streams_pool_size=2G scope=memory;
- PARALLEL_FORCE_LOCAL=TRUE
In a RAC environment, it can improve the speed of Data Pump API-based operations and avoid some Parallel DML bugs at the same time.
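A minimal sketch of enabling it for all instances (the scope and SID clause are a suggestion, not a requirement):
alter system set parallel_force_local=TRUE scope=both sid='*';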
- Direct import via database link
- Restrictions of database links apply.
- CLUSTER=N
In a RAC environment, it can improve the speed of Data Pump API-based operations.
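For illustration, a direct import over a database link that also keeps the job on the local instance could look like this (the database link SOURCE_LINK and the schema name are assumptions):
impdp system/password@TARGET_DB NETWORK_LINK=SOURCE_LINK \
  SCHEMAS=APP_SCHEMA CLUSTER=N PARALLEL=4 \
  DIRECTORY=DMP LOGFILE=net_imp.log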
- On partitioned tables
You try to Data Pump import (impdp) a partitioned table into an empty schema using PARALLEL > 1. Data Pump worker processes wait for an exclusive table lock, which is indicated by the wait event ‘enq: TM – contention’.
By default, Data Pump uses the APPEND hint to force direct loads of the data, since direct loads are faster than conventional loads. One of the reasons direct load is faster is that it locks the table against updates from other sessions. In our case, there are multiple import jobs trying to load data into the same table. When one job gets the lock to do a direct load into a table, all other import jobs are locked out from loading that table until the current job has finished its current load.
DataPump import (impdp) internally uses a statement (obtained with event 10046) similar to:
INSERT /*+ APPEND PARALLEL("TAB_LP",1)+*/ INTO RELATIONAL("<SCHEMA_NAME>"."<TABLE_NAME>" NOT XMLTYPE) ("<COL1>", "<COL2>", "<COL3>") SELECT "<COL1>", "<COL2>", "<COL3>" FROM "ET$0095000D0001" KU$
A way to force real partition-level load in parallel (INSERT /*+ APPEND PARALLEL("<TABLE_NAME>",4)+*/ INTO RELATIONAL("<SCHEMA_NAME>"."<TABLE_NAME>") PARTITION(XXX) … SELECT … FROM ET$) was introduced in 12.2.0.1 via enhancement request Bug 9590164: TABLE_EXISTS_ACTION=TRUNCATE|APPEND ON PARTITIONED TABLE => SERIAL IMPORT.
– This enhancement was implemented in 12.2.0.1, with the parameter DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS.
Note:
There is an enhancement in Data Pump in version 12.2.0.1 that creates a way to force real partition-level load in parallel.
(See the following note: DataPump Import (IMPDP) In Parallel Is Slow When Importing Partitioned Tables (Doc ID 1681444.1) )
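As a sketch, an import that uses this enhancement might look like the following; the partitioned table must already exist in the target, and the schema, table, and file names are placeholders:
impdp system/password@DB_NAME TABLES=APP_SCHEMA.PART_TAB \
  TABLE_EXISTS_ACTION=APPEND \
  DATA_OPTIONS=TRUST_EXISTING_TABLE_PARTITIONS \
  PARALLEL=4 DIRECTORY=DMP DUMPFILE=part_tab_%U.dmp LOGFILE=part_tab_imp.log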
ACCESS_METHOD=INSERT_AS_SELECT may also be faster.
Export all of the data in one Data Pump job using PARALLEL. The Data Pump import will still load one partition at a time, but if the partitions are large enough, they will be loaded in parallel.
Use the DATA_OPTIONS=DISABLE_APPEND_HINT option on import. This causes the data to be loaded with conventional loads instead of direct loads, but it allows multiple import jobs to load the same table simultaneously.
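For example, a sketch of one of several concurrent imports with the APPEND hint disabled (schema, table, and file names are placeholders):
impdp system/password@DB_NAME TABLES=APP_SCHEMA.PART_TAB \
  DATA_OPTIONS=DISABLE_APPEND_HINT \
  TABLE_EXISTS_ACTION=APPEND \
  DIRECTORY=DMP DUMPFILE=part_tab_job1.dmp LOGFILE=part_tab_job1_imp.log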
- Speed up LOB tables
The old BasicFile LOBs do not offer any parallel access methods. The new, improved SecureFile LOBs offer superior functionality, including full parallel access to the LOB. This is a good reason to migrate any old BasicFile LOBs in your database.
From 12c onward, SecureFile LOBs can be imported in parallel, but BasicFile LOBs cannot.
The following applies to BasicFile LOBs:
select owner, table_name, column_name
from dba_lobs
where securefile='NO';
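If you migrate such a column during a maintenance window, a minimal sketch looks like this (the table and column names are assumptions; the MOVE rebuilds the table segment, so indexes must be rebuilt afterwards):
alter table app_owner.lob_test move lob(payload) store as securefile;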
Parallel export based on ROWID/key buckets
| Job | Predicate |
| --- | --- |
| Data Pump job 1 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 0 |
| Data Pump job 2 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 1 |
| Data Pump job 3 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 2 |
| Data Pump job 4 | where mod(dbms_rowid.rowid_block_number(rowid), 4) = 3 |
Instead of one Data Pump job with one worker processing the table with the LOB, the idea is to start multiple Data Pump exports. Each export has one worker working on a dedicated part of the table.
#!/bin/bash
# Start 10 concurrent exports; each job exports one ROWID bucket of the table.
chunk=10
for ((i=0;i<=9;i++));
do
  expdp USERNAME/Password@DB_NAME TABLES=LOB_TEST QUERY=LOB_TEST:\"where mod\(dbms_rowid.rowid_block_number\(rowid\)\, ${chunk}\) = ${i}\" directory=DMP dumpfile=lob_test_${i}.dmp logfile=log_test_${i}.log &
  echo $i
done
#!/bin/bash
# Start 10 concurrent imports, one per dump file; DISABLE_APPEND_HINT lets all jobs load the same table simultaneously.
chunk=10
for ((i=0;i<=9;i++));
do
  impdp USERNAME/Password@DB_NAME directory=DMP REMAP_TABLE=LOB_TEST:LOB_TEST remap_schema=source:target dumpfile=lob_test_${i}.dmp logfile=TABLE_imp_log_test_${i}.log DATA_OPTIONS=DISABLE_APPEND_HINT CONTENT=DATA_ONLY &
done
- Change the LOB to a SecureFile LOB
When importing, convert the LOB to a SecureFile LOB, which allows parallel access, and then run one Data Pump job at a time. You should always convert the LOB to a SecureFile LOB during import. The conversion happens on the fly and allows for parallel import of the data, even if the LOB originally was a BasicFile LOB.
The first job does a regular schema import of the first dump file:
impdp ... \
dumpfile=exp_lob_0.dmp \
logfile=imp_lob_0.log \
transform=lob_storage:securefile \
parallel=4
Now, proceed with the other dump files in serial. Since you already converted the LOB to a SecureFile (previous step), you can now perform a Data Pump import using native parallelism:
impdp ... \
dumpfile=exp_lob_1.dmp \
logfile=imp_lob_1.log \
parallel=4 \
table_exists_action=append
impdp ... \
dumpfile=exp_lob_2.dmp \
logfile=imp_lob_2.log \
parallel=4 \
table_exists_action=append
...
Run the jobs in serial. Each job will use Data Pump parallelism.
The use of DEDUPLICATED LOB columns and Data Pump’s PARALLEL parameter with a value greater than 1
Currently, as of 12.2, the deduplication feature of LOB columns has a limitation on parallel access to LOB contents. If the Data Pump PARALLEL parameter is set to a value greater than 1 while the Data Pump API processes tables that use this feature, there is a high risk of a deadlock or a hang.
This restriction is tracked in the unpublished documentation bug 26321833, “DOCUMENT DEDUPLICATED-LOB RESTRICTIONS”.
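To check up front whether any tables might be affected, a sketch of a query against DBA_LOBS (assuming the DEDUPLICATION column reports 'LOB' for deduplicated SecureFile columns):
select owner, table_name, column_name
from dba_lobs
where securefile='YES'
and deduplication='LOB';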
NETWORK_LINK + SECUREFILES
When the NETWORK_LINK parameter is used to move tables with LOB columns stored as SecureFiles, the transfer of the LOB data is extremely slow compared to exporting the LOB-containing tables to a file. It has also been observed that using NETWORK_LINK to move such tables generates a lot of undo data.
Part of the cause of the speed difference, as of 11.2.0.3, is that in distributed transactions allocation requests are limited to one block at a time over a database link, which means more round trips to transfer larger data sets.
SECUREFILES (Without NETWORK_LINK)
The storage of LOB column data using the SecureFiles storage format does not allow tables with LOB columns to be exported and imported using parallelism for RDBMS versions below 12.2.
The storage of LOB column data using the BasicFiles storage format does not allow tables with LOB columns to be exported or imported using parallelism.
References
https://dohdatabase.com/2023/10/10/data-pump-and-faster-export-of-basicfile-lobs/