Oracle introduced Data Pump in version 10g. Data Pump enables very high-speed loading and unloading of data and metadata for Oracle databases.
Oracle has added some new features to the Data Pump utility in 11g. Since I haven't yet shared my experience using Data Pump, let me talk about some Data Pump features first and move on to the 11g enhancements in a later post.
It was a big relief for many DBAs and database developers when the Data Pump feature was made available in Oracle 10g. I used to work in an environment where the database size was around 20TB. One application used a schema in that database which was around 400GB, and the schema's data was scattered across multiple tablespaces shared by other schemas (I'm ashamed to be the DBA who supports this database). I would regularly get requests to refresh this schema from production. Since the data was not logically separated, I tended to use the export/import feature, which was a week-long process with a fair amount of babysitting.
When we upgraded the database to Oracle 10g, I really wanted to try out the Data Pump option. The result was amazing. It was blazing fast compared to the 7 days of traditional export/import: I could complete the Data Pump export/import in 10-12 hours.
I have fallen in love with Data Pump!!!
Oracle 10g provides two command-line tools, "expdp" and "impdp". These tools support the functionality of the original exp and imp tools as well as a number of new features.
Data Pump Export is a utility for unloading data and metadata into a set of operating system files called a dump file set. Data Pump Import is used to load the metadata and data in an export dump file set into a target database.
The heart of a Data Pump operation is the master table, a table created in the schema of the user executing the job. The master table maintains all aspects of the job; it is what allows Data Pump to report the status of a job and restart it in the event of an abnormal stoppage. The master table gets dropped when the job finishes normally.
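The master table is created with the same name as the job, so while a job is running you can see it in the job owner's schema. For example, assuming the job name patent_exp used in the export example below:
sql> SELECT table_name FROM user_tables WHERE table_name = 'PATENT_EXP';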
Data Pump actually runs on the database server. You can initiate the process from a client machine and then exit the client; the job still runs on the database server. You can attach to the job and check its status at any point, as long as the job is running.
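For example, to re-attach to a running export job from any client (assuming the job name patent_exp from the export example below):
$> expdp patent/xxxxxxx@binfo attach=patent_exp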
Let us see some examples of Data Pump export and import.
You need to create a directory object for the Data Pump job to place its dump files and log files in.
$> sqlplus / as sysdba
sql> CREATE DIRECTORY export_dir AS '/u01/oracle/export_dir';
sql> GRANT READ, WRITE ON DIRECTORY export_dir TO patent; (here patent is the database user who is executing the expdp job)
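You can verify the directory object and its path with a quick query against the dba_directories dictionary view:
sql> SELECT directory_name, directory_path FROM dba_directories WHERE directory_name = 'EXPORT_DIR';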
Start the Data Pump export
E.g., to export just one table from the PATENT schema:
nohup expdp patent/xxxxxxx@binfo directory=export_dir tables=SEQUENCE_QUALITY job_name=patent_exp dumpfile=export_dir:sequence_quality_%u.dmp logfile=export_dir:sequence_quality.log filesize=80530636800 status=5 parallel=5 &
Here filesize is the maximum size a dump file can grow to; when a file reaches that size, Data Pump automatically creates a new file and writes to it. The %u in the dumpfile name expands to a two-digit incrementing number, so the files come out as sequence_quality_01.dmp, sequence_quality_02.dmp, and so on.
The job_name is what you use to attach to the job again if you disconnect from the session.
If you need to import the above export dump file into another schema (patent_test):
$> sqlplus / as sysdba
sql> CREATE DIRECTORY imp_dir AS '/u01/oracle/imp_dir';
sql> GRANT READ, WRITE ON DIRECTORY imp_dir TO patent_test; (here patent_test is the database user who is executing the impdp job)
nohup impdp patent_test/xxxxxxx@dev directory=imp_dir job_name=patent_import dumpfile=imp_dir:sequence_quality_%u.dmp logfile=imp_dir:patentimp.log remap_schema=PATENT:PATENT_TEST status=5 remap_tablespace=seqdb_quality_data:SMS_DATA full=y &
Here the keyword remap_schema tells the job that the export was taken from schema PATENT and the data has to be imported into schema PATENT_TEST.
The keyword remap_tablespace tells the job to move all the objects that lived in the seqdb_quality_data tablespace on the export database into the sms_data tablespace.
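If the source schema is spread over several tablespaces, remap_tablespace can simply be repeated, once per mapping (a sketch; seqdb_quality_idx and SMS_INDEX are made-up names):
remap_tablespace=seqdb_quality_data:SMS_DATA remap_tablespace=seqdb_quality_idx:SMS_INDEX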
If you need to skip grants, constraints, and indexes in the same impdp job:
nohup impdp patent_test/********@dev directory=imp_dir job_name=patent_import dumpfile=imp_dir:sequences_%u.dmp logfile=imp_dir:patentimp.log remap_schema=PATENT:PATENT_TEST status=5 remap_tablespace=seqdb_sequences_data:SMS_DATA full=y exclude=GRANT exclude=CONSTRAINT exclude=INDEX parallel=5 &
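exclude can also take an optional name clause when you only want to filter particular objects rather than a whole object type. The lines below are a sketch with made-up names, written as parfile entries since the quotes would need escaping on the command line:
exclude=INDEX:"LIKE 'SEQ%'"
exclude=TABLE:"IN ('TRAN_AUDIT')"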
As with exp and imp jobs, you can still use parfiles here.
Sample parfiles
---------------
Import (imp.par)
----------------
directory=IMP_DIR
dumpfile=IMP_DIR:ecg%U.dmp
job_name=ecg_import
logfile=imp_dir:ecgimp.log
status=5
parallel=4
TABLES=ECG.TRAN_ERR
CONTENT=DATA_ONLY
TABLE_EXISTS_ACTION=APPEND
REMAP_SCHEMA=ECG:ECG_TEST
Export (exp.par)
----------------
directory=EXP_DIR
dumpfile=EXP_DIR:ecg%u.dmp
job_name=ecg_export
logfile=exp_dir:ecgexp.log
filesize=10G
status=5
parallel=4
exclude=index
TABLES=TRAN_ERR,TRAN_DETAIL_STATUS,TRAN_HEADER_STATUS,TRAN_PARTY
Usage
------
$> nohup expdp patent/xxxxxxx@binfo parfile=exp.par &
$> nohup impdp patent/xxxxxxx@binfo parfile=imp.par &
How do you attach to the above impdp job from another client machine and watch its status?
$> impdp patent_test/********@dev attach=patent_import (patent_import is the name of the job)
Connected to: Oracle Database 10g Enterprise Edition Release 10.1.0.0.0 - 64bit Beta
With the Partitioning, Oracle Label Security, OLAP and Oracle Data Mining options
Job: PATENT_IMPORT
Owner: patent_test
Operation: IMPORT
Creator Privs: FALSE
GUID: H245BD7LM2C78A6EE03400306E0F6291
Start Time: Monday, 02 March, 2009 3:11
Mode: SCHEMA
Instance: dev
MaxDegree: 1
Import Job Parameters:
Parameter Name          Parameter Value
CLIENT_COMMAND          patent_test/******** job_name=patent_import directory=imp_dir
DATA_ACCESS_METHOD      AUTOMATIC
ESTIMATE                BLOCKS
INCLUDE_METADATA        1
KEEP_MASTER             0
LOG_FILE_DIRECTORY      IMP_DIR
LOG_FILE_NAME           patentimp.log
TABLE_CONSISTENCY       0
Job: PATENT_IMPORT
Operation: IMPORT
Mode: SCHEMA
State: EXECUTING
Degree: 1
Job Error Count: 0
Dump file: /u01/oracle/imp_dir/sequences_01.dmp
bytes written = 2048
Worker 1 Status:
State: EXECUTING
Import>
How do you stop and start a job?
Attach to the executing job first.
$>impdp patent_test/********@dev attach=patent_import
(the same job status output as shown above is displayed)
Import> stop_job (to stop a running job)
Are you sure you wish to stop this job ([y]/n): y
To restart a stopped job, attach to it again and then:
Import> start_job
Import> status (to view the status of the current job)
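Two other interactive-mode commands are worth knowing: continue_client switches the session back to logging mode so the progress messages scroll by again, and exit_client exits the client while leaving the job running on the server.
Import> continue_client
Import> exit_client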
How do you terminate a running job?
Attach to the running job first, then:
Import> kill_job
Are you sure you wish to stop this job ([y]/n): y
$
Once the job is killed, the master table gets dropped and all client sessions are automatically detached.
You can view the status of a Data Pump job from the dba_datapump_jobs view.
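For example, a quick way to list all Data Pump jobs and their state (dba_datapump_jobs is a standard dictionary view; the columns below exist in 10g):
sql> SELECT owner_name, job_name, operation, job_mode, state FROM dba_datapump_jobs;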
Conclusion
Data Pump is a great addition to the Oracle tool set. The best part is that it's very easy to use, and the commands are very similar to those of the traditional export/import tools. I'll talk about some of the 11g Data Pump enhancements in my next post.