Parallel Index Creation with Data Pump Import

Here is a new capability that might be interesting to anybody who is performing a migration using Data Pump. Previously, Data Pump would create indexes one at a time, specifying the PARALLEL keyword for the CREATE INDEX statement to invoke parallel query for index creation. We used to recommend a workaround to create indexes in parallel, which involved a three-step process of importing without indexes, then creating a SQLFILE of the CREATE INDEX statements, and breaking that file into multiple windows.

Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1) instead of creating a single index using parallel query processes. This is enabled by the patch for bug 18793090, superseded by patch 21539301, which is available as a backport for 11.2.0.4 Exadata BP 9, 12.1.0.1.3, or 12.1.0.2. If you need it for another platform, that can of course be requested. The number of indexes created will be based on the PARALLEL parameter.

Here is an example of the effects of this patch on a toy example that I created using our hands-on lab VM environment. I created a table in the SYSTEM schema with 4 columns and 14 indexes, and then inserted a couple of dozen rows into the table. Then I exported the SYSTEM schema from 11.2.0.4 and imported into a 12.1.0.2 PDB with PARALLEL=4, both with and without the patch.

Normal (unpatched) behavior:

;;; 
Import: Release 12.1.0.2.0 - Production on Thu Apr 9 14:38:50 2015
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
;;; 
Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
09-APR-15 14:39:08.602: Startup took 2 seconds
09-APR-15 14:39:12.841: Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
09-APR-15 14:39:13.417: Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/********@pdb1 directory=temp dumpfile=exp_test.dmp logfile=imp_test1.log logtime=all metrics=yes parallel=4 
09-APR-15 14:39:13.605: Processing object type SCHEMA_EXPORT/USER
09-APR-15 14:39:14.454:      Completed 1 USER objects in 1 seconds
09-APR-15 14:39:14.470: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
09-APR-15 14:39:14.541:      Completed 5 SYSTEM_GRANT objects in 0 seconds
09-APR-15 14:39:14.596: Processing object type SCHEMA_EXPORT/ROLE_GRANT
09-APR-15 14:39:14.655:      Completed 2 ROLE_GRANT objects in 0 seconds
09-APR-15 14:39:14.690: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
09-APR-15 14:39:14.728:      Completed 1 DEFAULT_ROLE objects in 0 seconds
09-APR-15 14:39:14.746: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
09-APR-15 14:39:15.275:      Completed 1 PROCACT_SCHEMA objects in 1 seconds
09-APR-15 14:39:15.377: Processing object type SCHEMA_EXPORT/TABLE/TABLE
09-APR-15 14:39:15.626:      Completed 1 TABLE objects in 0 seconds
09-APR-15 14:39:15.673: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
09-APR-15 14:39:16.031: . . imported "SYSTEM"."TAB1"                             6.375 KB      12 rows in 1 seconds
09-APR-15 14:39:16.096: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
09-APR-15 14:39:20.740:      Completed 14 INDEX objects in 4 seconds

With Patch:

;;; 
Import: Release 12.1.0.2.0 - Production on Thu Apr 9 15:05:19 2015
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
;;; 
Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
09-APR-15 15:05:22.590: Startup took 0 seconds
09-APR-15 15:05:23.175: Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
09-APR-15 15:05:23.613: Starting "SYSTEM"."SYS_IMPORT_FULL_01":  system/********@pdb1 directory=temp dumpfile=exp_test.dmp logfile=imp_test3.log logtime=all metrics=yes parallel=4 
09-APR-15 15:05:23.699: Processing object type SCHEMA_EXPORT/USER
09-APR-15 15:05:23.862:      Completed 1 USER objects in 0 seconds
09-APR-15 15:05:23.882: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT
09-APR-15 15:05:23.937:      Completed 5 SYSTEM_GRANT objects in 0 seconds
09-APR-15 15:05:23.993: Processing object type SCHEMA_EXPORT/ROLE_GRANT
09-APR-15 15:05:24.071:      Completed 2 ROLE_GRANT objects in 1 seconds
09-APR-15 15:05:24.096: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE
09-APR-15 15:05:24.180:      Completed 1 DEFAULT_ROLE objects in 0 seconds
09-APR-15 15:05:24.216: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA
09-APR-15 15:05:24.378:      Completed 1 PROCACT_SCHEMA objects in 0 seconds
09-APR-15 15:05:24.460: Processing object type SCHEMA_EXPORT/TABLE/TABLE
09-APR-15 15:05:24.665:      Completed 1 TABLE objects in 0 seconds
09-APR-15 15:05:24.782: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA
09-APR-15 15:05:25.096: . . imported "SYSTEM"."TAB1"                             6.375 KB      12 rows in 1 seconds
09-APR-15 15:05:26.291: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
09-APR-15 15:05:26.809: Startup took 4 seconds
09-APR-15 15:05:26.896: Startup took 5 seconds
09-APR-15 15:05:27.138: Startup took 4 seconds
09-APR-15 15:05:28.398:      Completed 14 INDEX objects in 3 seconds

A few of things are noteworthy here:

  1. The indexes took 4.644 seconds without the patch and 3.302 seconds with the patch. So, the effect is significant even on a toy example.
  2. Those messages about startup taking X seconds are because we did not need the parallel workers for the table data. The startup time probably isn’t correct; that’s just the time between the start of the job and the first use of the worker
  3. If you apply this patch, we will no longer use PX processes (formerly known as PQ slaves) to create indexes
If you take advantage of this patch, let me know the results in a reply here! We are excited to be able to add a bit more parallelism into Data Pump, and plan on more for the future.

8 thoughts on “Parallel Index Creation with Data Pump Import

  1. We got this patch for Linux and applied the same, unfortunately it didnt help and index creation was still serial and PQ processes were being used.

    Below SR has all the details

    SR 3-10966759491 : 18793090 patch needed on database version 11.2.0.4

    Thanks
    IK

  2. I was very happy to see this improvement and I downloaded Patch 18793090. Unfortunately it had some issues and Oracle has now remove Patch 18793090 and any information about it.

    That patch provided a very important enhancement.

    Have a look at SR 3-11193675871 if you can. I’ll email you the information from the SR if it helps. Maybe you can help me get through to the right person over at Oracle.

    Thanks.

  3. The original patch has been superseded by a broader patch that enables parallel import of other indexes (such as bitmap indexes) as well. Look for the patch for bug 21539301. The page for patch 18793090 was already updated with this information.

  4. Even after the patch 22273229 , I’m seeing a single index getting created using parallel query processes.

    Here is our scenario. We have broken down the schema export/import into separate jobs. So during import time, we are running multiple IMPDP commands at the same time.

    So if we add PARALLEL 8 to each of the IMPDP commands, and then during index creation, each index is getting created using PX sessions, it overwhelms the system.

  5. Naveen, pelase ensure that the patch has been applied to the database (by running datapatch) in addition to installing it in the Oracle Home. The most common reason for the behavior you describe is that the patch was installed with opatch, but datapatch was not run on the database.

  6. Hi Mike, are pleased to greet you commented that you install the patch that substituted for the version 11.2.0.4, 22273229.
    The installation was carried out on an AIX platform, today at dawn execute the process and you will comment the results.
    Thank you for sharing your knowledge.

  7. Hi Mike,

    I was importing a 700 Gb dumpfile ,and after applying this patch I was able to save 2 hours ( from 6 to 4 ) for the full import. Not sure if the parallelism is completely out . For tables I prefer PX processes , since I have larger tables and for indexes/constraints this is good.

    • Hi Karthick,

      there are cases when the fix is a big improvement (many smaller indexes) vs cases where the fix doesn’t help (small number of indexes but all very large).
      Thanks for your feedback and for sharing your experience!

      Mike

Leave a Reply

Your email address will not be published. Required fields are marked *