Here is a new capability that might be interesting to anybody who is performing a migration using Data Pump. Previously, Data Pump would create indexes one at a time, specifying the PARALLEL keyword for the CREATE INDEX statement to invoke parallel query for index creation. We used to recommend a workaround to create indexes in parallel, which involved a three-step process of importing without indexes, then creating a SQLFILE of the CREATE INDEX statements, and breaking that file into multiple windows.
Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1) instead of creating a single index using parallel query processes. This is enabled by the patch for bug 18793090, superseded by patch 21539301, which is available as a backport for 11.2.0.4 Exadata BP 9, 12.1.0.1.3, or 12.1.0.2. If you need it for another platform, that can of course be requested. The number of indexes created will be based on the PARALLEL parameter.
Here is an example of the effects of this patch on a toy example that I created using our hands-on lab VM environment. I created a table in the SYSTEM schema with 4 columns and 14 indexes, and then inserted a couple of dozen rows into the table. Then I exported the SYSTEM schema from 11.2.0.4 and imported into a 12.1.0.2 PDB with PARALLEL=4, both with and without the patch.
Normal (unpatched) behavior:
;;; Import: Release 12.1.0.2.0 - Production on Thu Apr 9 14:38:50 2015 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. ;;; Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options 09-APR-15 14:39:08.602: Startup took 2 seconds 09-APR-15 14:39:12.841: Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded 09-APR-15 14:39:13.417: Starting "SYSTEM"."SYS_IMPORT_FULL_01": system/********@pdb1 directory=temp dumpfile=exp_test.dmp logfile=imp_test1.log logtime=all metrics=yes parallel=4 09-APR-15 14:39:13.605: Processing object type SCHEMA_EXPORT/USER 09-APR-15 14:39:14.454: Completed 1 USER objects in 1 seconds 09-APR-15 14:39:14.470: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT 09-APR-15 14:39:14.541: Completed 5 SYSTEM_GRANT objects in 0 seconds 09-APR-15 14:39:14.596: Processing object type SCHEMA_EXPORT/ROLE_GRANT 09-APR-15 14:39:14.655: Completed 2 ROLE_GRANT objects in 0 seconds 09-APR-15 14:39:14.690: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE 09-APR-15 14:39:14.728: Completed 1 DEFAULT_ROLE objects in 0 seconds 09-APR-15 14:39:14.746: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA 09-APR-15 14:39:15.275: Completed 1 PROCACT_SCHEMA objects in 1 seconds 09-APR-15 14:39:15.377: Processing object type SCHEMA_EXPORT/TABLE/TABLE 09-APR-15 14:39:15.626: Completed 1 TABLE objects in 0 seconds 09-APR-15 14:39:15.673: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA 09-APR-15 14:39:16.031: . . imported "SYSTEM"."TAB1" 6.375 KB 12 rows in 1 seconds 09-APR-15 14:39:16.096: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX 09-APR-15 14:39:20.740: Completed 14 INDEX objects in 4 seconds
With Patch:
;;; Import: Release 12.1.0.2.0 - Production on Thu Apr 9 15:05:19 2015 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. ;;; Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options 09-APR-15 15:05:22.590: Startup took 0 seconds 09-APR-15 15:05:23.175: Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded 09-APR-15 15:05:23.613: Starting "SYSTEM"."SYS_IMPORT_FULL_01": system/********@pdb1 directory=temp dumpfile=exp_test.dmp logfile=imp_test3.log logtime=all metrics=yes parallel=4 09-APR-15 15:05:23.699: Processing object type SCHEMA_EXPORT/USER 09-APR-15 15:05:23.862: Completed 1 USER objects in 0 seconds 09-APR-15 15:05:23.882: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT 09-APR-15 15:05:23.937: Completed 5 SYSTEM_GRANT objects in 0 seconds 09-APR-15 15:05:23.993: Processing object type SCHEMA_EXPORT/ROLE_GRANT 09-APR-15 15:05:24.071: Completed 2 ROLE_GRANT objects in 1 seconds 09-APR-15 15:05:24.096: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE 09-APR-15 15:05:24.180: Completed 1 DEFAULT_ROLE objects in 0 seconds 09-APR-15 15:05:24.216: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA 09-APR-15 15:05:24.378: Completed 1 PROCACT_SCHEMA objects in 0 seconds 09-APR-15 15:05:24.460: Processing object type SCHEMA_EXPORT/TABLE/TABLE 09-APR-15 15:05:24.665: Completed 1 TABLE objects in 0 seconds 09-APR-15 15:05:24.782: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA 09-APR-15 15:05:25.096: . . imported "SYSTEM"."TAB1" 6.375 KB 12 rows in 1 seconds 09-APR-15 15:05:26.291: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX 09-APR-15 15:05:26.809: Startup took 4 seconds 09-APR-15 15:05:26.896: Startup took 5 seconds 09-APR-15 15:05:27.138: Startup took 4 seconds 09-APR-15 15:05:28.398: Completed 14 INDEX objects in 3 seconds
A few of things are noteworthy here:
- The indexes took 4.644 seconds without the patch and 3.302 seconds with the patch. So, the effect is significant even on a toy example.
- Those messages about startup taking X seconds are because we did not need the parallel workers for the table data. The startup time probably isn’t correct; that’s just the time between the start of the job and the first use of the worker
- If you apply this patch, we will no longer use PX processes (formerly known as PQ slaves) to create indexes
- Using NETWORK_LINK the patch has no effect [12-Dec-2018]
We got this patch for Linux and applied the same, unfortunately it didnt help and index creation was still serial and PQ processes were being used.
Below SR has all the details
SR 3-10966759491 : 18793090 patch needed on database version 11.2.0.4
Thanks
IK
I was very happy to see this improvement and I downloaded Patch 18793090. Unfortunately it had some issues and Oracle has now remove Patch 18793090 and any information about it.
That patch provided a very important enhancement.
Have a look at SR 3-11193675871 if you can. I’ll email you the information from the SR if it helps. Maybe you can help me get through to the right person over at Oracle.
Thanks.
The original patch has been superseded by a broader patch that enables parallel import of other indexes (such as bitmap indexes) as well. Look for the patch for bug 21539301. The page for patch 18793090 was already updated with this information.
Even after the patch 22273229 , I’m seeing a single index getting created using parallel query processes.
Here is our scenario. We have broken down the schema export/import into separate jobs. So during import time, we are running multiple IMPDP commands at the same time.
So if we add PARALLEL 8 to each of the IMPDP commands, and then during index creation, each index is getting created using PX sessions, it overwhelms the system.
Naveen, pelase ensure that the patch has been applied to the database (by running datapatch) in addition to installing it in the Oracle Home. The most common reason for the behavior you describe is that the patch was installed with opatch, but datapatch was not run on the database.
Hi Mike, are pleased to greet you commented that you install the patch that substituted for the version 11.2.0.4, 22273229.
The installation was carried out on an AIX platform, today at dawn execute the process and you will comment the results.
Thank you for sharing your knowledge.
Hi Mike,
I was importing a 700 Gb dumpfile ,and after applying this patch I was able to save 2 hours ( from 6 to 4 ) for the full import. Not sure if the parallelism is completely out . For tables I prefer PX processes , since I have larger tables and for indexes/constraints this is good.
Hi Karthick,
there are cases when the fix is a big improvement (many smaller indexes) vs cases where the fix doesn’t help (small number of indexes but all very large).
Thanks for your feedback and for sharing your experience!
Mike
Hi Mike.
Does it mean that if, for instance, I have a big table (1Tib) with multiple indexes over it and a couple of not big tables with indexes, then better to run IMPDP without patch for for huge table and indexes, then patch the system and finally run IMPDP for other tables and indexes with patch?
And you say “Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1)”
Is it true for multiple indexes that are based on one table?
Den,
there’s unfortunately no clear YES/NO answer to your question. It depends a lot.
For instance, at a recent project we went without the patch on purpose as we’ve had only a few massive indexes – and the parallel-in-one-index approach went faster than having several indexes built in parallel. It really really really depends on the situation.
Thanks,
Mike
Thank you for this post, it helps! Just wanted to point out that – I noticed that when there are very large indexes being imported, the parallelism automatically changed to less # as compared to when it was importing a bunch of smaller indexes.
Seems this patch does not work with NETWORK_LINK indexes import. There is still one worker.
Den,
yes, I think this is a general limitation. Workaround would be to exclude indexes and build them via a script.
Mike
With NETWORK_LINK the indexes are been created serially, and with PARALLEL 1. A big import I have done recently took 25 hours. 7 hours to process data (all tables started at the same time, and with parallel impdp workers, as searched into ASH) and 18 hours to process indexes (one before the other, with PARALLEL 1).
This is the worst case. Any workaround to this without creating and editing a SQLFILE ?
Luis,
what are your source and destination versions?
Cheers,
Mike
Hi Mike/Roy,
I still prefer the old approach when it comes to large index where indexes are getting created sequentially using parallel clause. It’s much faster to do parallel scans.
Apart from performance issue, one big issue with multiple large indexes getting created simultaneously is excessive use of temporary tablespace. You will need to allocate temp space equivalent to size of top n index (n => value set for parallel) to safely ensure the index creation will succeed.
Looks like in 19c (actually 12.2 onward) this simultaneous index creation is the default behaviour. Is there are parameter where I can get the old behaviour back on 19c?
Thanks,
Ajay
Hi Ajaykumar,
as far as I am aware, there is no way to turn this off or revert. The only thing I haven’t tried is to disable the fix 22273229 with an _fix_control=”22273229:0″ setting in the SPFILE.
Cheers,
Mike
Thanks Mike.
Unfortunately, this bug no. is not visible in v$system_fix_control in 19c. Looks like they have just baked this feature from 12.2 onward and was installable as a patch only on 11.2.0.4 and 12.1.
I don’t think this is going to work as visibility of the patch in v$system_fix_control is one of the pre-requisite to using “_fix_control”.
Since it’s not visible in 11.2.0.4 as well even though the patch is applied I don’t think this method will work in 11.2.0.4 either but at least the patch could be rolled back.
./opatch lsinventory|grep -i applied|grep -i 22273229
Patch 22273229 : applied on Sat Jun 15 11:49:52 GMT 2019
SQL> select * from v$system_fix_control where bugno=22273229;
no rows selected
I am trying to get some workaround via Oracle SR. Keeping my fingers crossed.
Ok – thanks for checking 🙁
Cheers,
Mike
Same problem here: after upgrading from 12.1.0.2 to 19c critical datapump jobs take 2-3x the amount of time than before. Drilldown showed that reason was that indexes were created with PARALLEL 1 and no parallel slave processes.
See SR for more information. Currently waiting for feedback from Development.
SR 3-26429350311 : DataPump Import jobs take 2 times longer in 19c PDB compared to 12.1.0.2 nonCDB
Regards,
Martin
Martin,
Bill will check and get back to you.
Cheers,
Mike
I have similar SR (3-26384234171 : Oracle impdp – import indexes with parallel > 1 in 19c) raised and again waiting for the feedback from development. I hope they can give a patch quickly to revert the behaviour back to pre 12.2 version. The new method just isn’t fit for large tables. And then there are other problems related to increased temp tablespace usage, etc.
Unfortunately this is way more complicated then you would think.
It is discussed actively internally – but this is not something which will happen within the next weeks for sure.
Cheers,
Mike
Sorry, but this is a no brainer situation. The current behavior does not work for big tables: buffer contention will kill performance. This parallel 1 approach is even worse considering big tables are normally partitioned, improving real parallel access.
This sounds like a good idea with a poor implementation. Glad it is being discussed actively, hope to see it improved.
Hi Marcelo,
I know – but I fear there won’t be any change before 23c. Only w/a is to create indexfiles instead and run them the way you’d prefer.
Cheers,
Mike
Hi Mike,
Thanks for hosting this blog. It’s one of the more informative Oracle blogs I follow.
I am currently slogging through a 20TB E-Business Suite database upgrade, platform migration, and character set conversion from 12c to 19c via export/import. The index build on several large tables (billions of rows) is dragging the migration down by a good 24 hours. 99% of the indexes are finished, and ~4 of the 72 Data Pump Workers are running single-threaded index builds on the large tables.
Having skimmed through this thread, I will try the EXCLUDE=INDEX:”LIKE ‘MY_INDEX_PREFIX%'” clause and rebuild the large indexes manually next time. Maybe that will cut our upgrade window down by a day.
Thanks,
Shad
Hi Shad,
it will – and this was the main discussion topic during the Data Pump Virtual Classroom Seminar.
Mike
Hi Mike,
I would like to use impdp to import only index with parallel with the following parameters. Is that correct?
dumpfile=DUMP:Testdump%L.dmp
schemas=TEST
INCLUDE=index
CONTENT=METADATA_ONLY
PARALLEL=8
METRICS=Y
LOGTIME=ALL
What happen if an index already exist?
Thanks!
Hi Jonathan,
you most likely receive an ORA-31684 when the index exists already.
And “PARALLEL=8” will allow several workers to build multiple indexes in parallel.
An alternative to have more control is an indexfile where you have one or more SQL files generated as test files by Data Pump, and then you tune it per need.
Cheers
Mike