Here is a new capability that might be interesting to anybody who is performing a migration using Data Pump. Previously, Data Pump would create indexes one at a time, specifying the PARALLEL keyword for the CREATE INDEX statement to invoke parallel query for index creation. We used to recommend a workaround to create indexes in parallel, which involved a three-step process of importing without indexes, then creating a SQLFILE of the CREATE INDEX statements, and breaking that file into multiple windows.
Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1) instead of creating a single index using parallel query processes. This is enabled by the patch for bug 18793090, superseded by patch 21539301, which is available as a backport for 22.214.171.124 Exadata BP 9, 126.96.36.199.3, or 188.8.131.52. If you need it for another platform, that can of course be requested. The number of indexes created will be based on the PARALLEL parameter.
Here is an example of the effects of this patch on a toy example that I created using our hands-on lab VM environment. I created a table in the SYSTEM schema with 4 columns and 14 indexes, and then inserted a couple of dozen rows into the table. Then I exported the SYSTEM schema from 184.108.40.206 and imported into a 220.127.116.11 PDB with PARALLEL=4, both with and without the patch.
Normal (unpatched) behavior:
;;; Import: Release 18.104.22.168.0 - Production on Thu Apr 9 14:38:50 2015 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. ;;; Connected to: Oracle Database 12c Enterprise Edition Release 22.214.171.124.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options 09-APR-15 14:39:08.602: Startup took 2 seconds 09-APR-15 14:39:12.841: Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded 09-APR-15 14:39:13.417: Starting "SYSTEM"."SYS_IMPORT_FULL_01": system/********@pdb1 directory=temp dumpfile=exp_test.dmp logfile=imp_test1.log logtime=all metrics=yes parallel=4 09-APR-15 14:39:13.605: Processing object type SCHEMA_EXPORT/USER 09-APR-15 14:39:14.454: Completed 1 USER objects in 1 seconds 09-APR-15 14:39:14.470: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT 09-APR-15 14:39:14.541: Completed 5 SYSTEM_GRANT objects in 0 seconds 09-APR-15 14:39:14.596: Processing object type SCHEMA_EXPORT/ROLE_GRANT 09-APR-15 14:39:14.655: Completed 2 ROLE_GRANT objects in 0 seconds 09-APR-15 14:39:14.690: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE 09-APR-15 14:39:14.728: Completed 1 DEFAULT_ROLE objects in 0 seconds 09-APR-15 14:39:14.746: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA 09-APR-15 14:39:15.275: Completed 1 PROCACT_SCHEMA objects in 1 seconds 09-APR-15 14:39:15.377: Processing object type SCHEMA_EXPORT/TABLE/TABLE 09-APR-15 14:39:15.626: Completed 1 TABLE objects in 0 seconds 09-APR-15 14:39:15.673: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA 09-APR-15 14:39:16.031: . . imported "SYSTEM"."TAB1" 6.375 KB 12 rows in 1 seconds 09-APR-15 14:39:16.096: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX 09-APR-15 14:39:20.740: Completed 14 INDEX objects in 4 seconds
;;; Import: Release 126.96.36.199.0 - Production on Thu Apr 9 15:05:19 2015 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. ;;; Connected to: Oracle Database 12c Enterprise Edition Release 188.8.131.52.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options 09-APR-15 15:05:22.590: Startup took 0 seconds 09-APR-15 15:05:23.175: Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded 09-APR-15 15:05:23.613: Starting "SYSTEM"."SYS_IMPORT_FULL_01": system/********@pdb1 directory=temp dumpfile=exp_test.dmp logfile=imp_test3.log logtime=all metrics=yes parallel=4 09-APR-15 15:05:23.699: Processing object type SCHEMA_EXPORT/USER 09-APR-15 15:05:23.862: Completed 1 USER objects in 0 seconds 09-APR-15 15:05:23.882: Processing object type SCHEMA_EXPORT/SYSTEM_GRANT 09-APR-15 15:05:23.937: Completed 5 SYSTEM_GRANT objects in 0 seconds 09-APR-15 15:05:23.993: Processing object type SCHEMA_EXPORT/ROLE_GRANT 09-APR-15 15:05:24.071: Completed 2 ROLE_GRANT objects in 1 seconds 09-APR-15 15:05:24.096: Processing object type SCHEMA_EXPORT/DEFAULT_ROLE 09-APR-15 15:05:24.180: Completed 1 DEFAULT_ROLE objects in 0 seconds 09-APR-15 15:05:24.216: Processing object type SCHEMA_EXPORT/PRE_SCHEMA/PROCACT_SCHEMA 09-APR-15 15:05:24.378: Completed 1 PROCACT_SCHEMA objects in 0 seconds 09-APR-15 15:05:24.460: Processing object type SCHEMA_EXPORT/TABLE/TABLE 09-APR-15 15:05:24.665: Completed 1 TABLE objects in 0 seconds 09-APR-15 15:05:24.782: Processing object type SCHEMA_EXPORT/TABLE/TABLE_DATA 09-APR-15 15:05:25.096: . . imported "SYSTEM"."TAB1" 6.375 KB 12 rows in 1 seconds 09-APR-15 15:05:26.291: Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX 09-APR-15 15:05:26.809: Startup took 4 seconds 09-APR-15 15:05:26.896: Startup took 5 seconds 09-APR-15 15:05:27.138: Startup took 4 seconds 09-APR-15 15:05:28.398: Completed 14 INDEX objects in 3 seconds
A few of things are noteworthy here:
- The indexes took 4.644 seconds without the patch and 3.302 seconds with the patch. So, the effect is significant even on a toy example.
- Those messages about startup taking X seconds are because we did not need the parallel workers for the table data. The startup time probably isn’t correct; that’s just the time between the start of the job and the first use of the worker
- If you apply this patch, we will no longer use PX processes (formerly known as PQ slaves) to create indexes
- Using NETWORK_LINK the patch has no effect [12-Dec-2018]
We got this patch for Linux and applied the same, unfortunately it didnt help and index creation was still serial and PQ processes were being used.
Below SR has all the details
SR 3-10966759491 : 18793090 patch needed on database version 184.108.40.206
I was very happy to see this improvement and I downloaded Patch 18793090. Unfortunately it had some issues and Oracle has now remove Patch 18793090 and any information about it.
That patch provided a very important enhancement.
Have a look at SR 3-11193675871 if you can. I’ll email you the information from the SR if it helps. Maybe you can help me get through to the right person over at Oracle.
The original patch has been superseded by a broader patch that enables parallel import of other indexes (such as bitmap indexes) as well. Look for the patch for bug 21539301. The page for patch 18793090 was already updated with this information.
Even after the patch 22273229 , I’m seeing a single index getting created using parallel query processes.
Here is our scenario. We have broken down the schema export/import into separate jobs. So during import time, we are running multiple IMPDP commands at the same time.
So if we add PARALLEL 8 to each of the IMPDP commands, and then during index creation, each index is getting created using PX sessions, it overwhelms the system.
Naveen, pelase ensure that the patch has been applied to the database (by running datapatch) in addition to installing it in the Oracle Home. The most common reason for the behavior you describe is that the patch was installed with opatch, but datapatch was not run on the database.
Hi Mike, are pleased to greet you commented that you install the patch that substituted for the version 220.127.116.11, 22273229.
The installation was carried out on an AIX platform, today at dawn execute the process and you will comment the results.
Thank you for sharing your knowledge.
I was importing a 700 Gb dumpfile ,and after applying this patch I was able to save 2 hours ( from 6 to 4 ) for the full import. Not sure if the parallelism is completely out . For tables I prefer PX processes , since I have larger tables and for indexes/constraints this is good.
there are cases when the fix is a big improvement (many smaller indexes) vs cases where the fix doesn’t help (small number of indexes but all very large).
Thanks for your feedback and for sharing your experience!
Does it mean that if, for instance, I have a big table (1Tib) with multiple indexes over it and a couple of not big tables with indexes, then better to run IMPDP without patch for for huge table and indexes, then patch the system and finally run IMPDP for other tables and indexes with patch?
And you say “Through extensive performance testing we found that it is faster to create multiple indexes in parallel (using a parallel degree of 1)”
Is it true for multiple indexes that are based on one table?
there’s unfortunately no clear YES/NO answer to your question. It depends a lot.
For instance, at a recent project we went without the patch on purpose as we’ve had only a few massive indexes – and the parallel-in-one-index approach went faster than having several indexes built in parallel. It really really really depends on the situation.
Thank you for this post, it helps! Just wanted to point out that – I noticed that when there are very large indexes being imported, the parallelism automatically changed to less # as compared to when it was importing a bunch of smaller indexes.
Seems this patch does not work with NETWORK_LINK indexes import. There is still one worker.
yes, I think this is a general limitation. Workaround would be to exclude indexes and build them via a script.
With NETWORK_LINK the indexes are been created serially, and with PARALLEL 1. A big import I have done recently took 25 hours. 7 hours to process data (all tables started at the same time, and with parallel impdp workers, as searched into ASH) and 18 hours to process indexes (one before the other, with PARALLEL 1).
This is the worst case. Any workaround to this without creating and editing a SQLFILE ?
what are your source and destination versions?
I still prefer the old approach when it comes to large index where indexes are getting created sequentially using parallel clause. It’s much faster to do parallel scans.
Apart from performance issue, one big issue with multiple large indexes getting created simultaneously is excessive use of temporary tablespace. You will need to allocate temp space equivalent to size of top n index (n => value set for parallel) to safely ensure the index creation will succeed.
Looks like in 19c (actually 12.2 onward) this simultaneous index creation is the default behaviour. Is there are parameter where I can get the old behaviour back on 19c?
as far as I am aware, there is no way to turn this off or revert. The only thing I haven’t tried is to disable the fix 22273229 with an _fix_control=”22273229:0″ setting in the SPFILE.
Unfortunately, this bug no. is not visible in v$system_fix_control in 19c. Looks like they have just baked this feature from 12.2 onward and was installable as a patch only on 18.104.22.168 and 12.1.
I don’t think this is going to work as visibility of the patch in v$system_fix_control is one of the pre-requisite to using “_fix_control”.
Since it’s not visible in 22.214.171.124 as well even though the patch is applied I don’t think this method will work in 126.96.36.199 either but at least the patch could be rolled back.
./opatch lsinventory|grep -i applied|grep -i 22273229
Patch 22273229 : applied on Sat Jun 15 11:49:52 GMT 2019
SQL> select * from v$system_fix_control where bugno=22273229;
no rows selected
I am trying to get some workaround via Oracle SR. Keeping my fingers crossed.
Ok – thanks for checking 🙁
Same problem here: after upgrading from 188.8.131.52 to 19c critical datapump jobs take 2-3x the amount of time than before. Drilldown showed that reason was that indexes were created with PARALLEL 1 and no parallel slave processes.
See SR for more information. Currently waiting for feedback from Development.
SR 3-26429350311 : DataPump Import jobs take 2 times longer in 19c PDB compared to 184.108.40.206 nonCDB
Bill will check and get back to you.
I have similar SR (3-26384234171 : Oracle impdp – import indexes with parallel > 1 in 19c) raised and again waiting for the feedback from development. I hope they can give a patch quickly to revert the behaviour back to pre 12.2 version. The new method just isn’t fit for large tables. And then there are other problems related to increased temp tablespace usage, etc.
Unfortunately this is way more complicated then you would think.
It is discussed actively internally – but this is not something which will happen within the next weeks for sure.
Sorry, but this is a no brainer situation. The current behavior does not work for big tables: buffer contention will kill performance. This parallel 1 approach is even worse considering big tables are normally partitioned, improving real parallel access.
This sounds like a good idea with a poor implementation. Glad it is being discussed actively, hope to see it improved.
I know – but I fear there won’t be any change before 23c. Only w/a is to create indexfiles instead and run them the way you’d prefer.
Thanks for hosting this blog. It’s one of the more informative Oracle blogs I follow.
I am currently slogging through a 20TB E-Business Suite database upgrade, platform migration, and character set conversion from 12c to 19c via export/import. The index build on several large tables (billions of rows) is dragging the migration down by a good 24 hours. 99% of the indexes are finished, and ~4 of the 72 Data Pump Workers are running single-threaded index builds on the large tables.
Having skimmed through this thread, I will try the EXCLUDE=INDEX:”LIKE ‘MY_INDEX_PREFIX%'” clause and rebuild the large indexes manually next time. Maybe that will cut our upgrade window down by a day.
it will – and this was the main discussion topic during the Data Pump Virtual Classroom Seminar.
I would like to use impdp to import only index with parallel with the following parameters. Is that correct?
What happen if an index already exist?
you most likely receive an ORA-31684 when the index exists already.
And “PARALLEL=8” will allow several workers to build multiple indexes in parallel.
An alternative to have more control is an indexfile where you have one or more SQL files generated as test files by Data Pump, and then you tune it per need.
It seems that the parallelisation doesn’t work for Indexes that are created for constraints. My import script has parallel set to 16. The Indexes have a degree of 32. But only one worker is active and creates the constraint-indexes one after the other. Each index is created with a degree of 1.
Is there a way to speed up the constraint-index creation?
let me check that. Could you send me your par and especially the log file please?
Here is the import par file. Can I send you more info directly? I don’t want to clutter the comment section with just my problem.
## create or replace directory DP_TRANS_ISY as ‘/nas/trans/DMP/CISY’;
## export ORACLE_PDB_SID=ISYUTF8; impdp parfile=imp_schema_mvr_ISYMV_SCHEMA.par
USERID = ‘/ as sysdba’
directory = DP_TRANS_ISY
dumpfile = exp_schema_mvr_ISYMV_SCHEMA%U.dmp
logfile = imp_schema_mvr_ISYMV_SCHEMA.log
logtime = all
parallel = 32
# full = y
# content = data_only
exclude = statistics
metrics = y
Here is the interesting part of the log file where you can see that the indexes and functional indexes have been imported by all of the dp workers, while the constraints and foreign key constraints have been imported only by worker 1:
01-MAR-23 13:17:28.043: W-6 Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX
01-MAR-23 14:04:20.206: W-9 Completed 149 INDEX objects in 2811 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 1 3 INDEX objects in 2447 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 2 5 INDEX objects in 2624 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 3 4 INDEX objects in 2543 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 4 3 INDEX objects in 2404 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 5 3 INDEX objects in 2566 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 6 2 INDEX objects in 2461 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 7 7 INDEX objects in 2581 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 8 19 INDEX objects in 2623 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 9 2 INDEX objects in 2373 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 10 12 INDEX objects in 2443 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 11 26 INDEX objects in 2811 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 12 11 INDEX objects in 2632 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 13 7 INDEX objects in 2507 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 14 31 INDEX objects in 2643 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 15 9 INDEX objects in 2599 seconds
01-MAR-23 14:04:20.206: W-9 Completed by worker 16 5 INDEX objects in 2504 seconds
01-MAR-23 14:04:20.218: W-9 Processing object type SCHEMA_EXPORT/TABLE/INDEX/FUNCTIONAL_INDEX/INDEX
01-MAR-23 14:19:35.181: W-1 Completed 2 INDEX objects in 914 seconds
01-MAR-23 14:19:35.181: W-1 Completed by worker 4 1 INDEX objects in 914 seconds
01-MAR-23 14:19:35.181: W-1 Completed by worker 10 1 INDEX objects in 43 seconds
01-MAR-23 14:19:35.184: W-1 Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/CONSTRAINT
01-MAR-23 17:22:50.136: W-6 Completed 75 CONSTRAINT objects in 10994 seconds
01-MAR-23 17:22:50.136: W-6 Completed by worker 1 75 CONSTRAINT objects in 10994 seconds
01-MAR-23 17:22:50.138: W-6 Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT
01-MAR-23 17:22:51.440: W-16 Completed 12 REF_CONSTRAINT objects in 0 seconds
01-MAR-23 17:22:51.440: W-16 Completed by worker 1 12 REF_CONSTRAINT objects in 0 seconds
sorry for the VERY late reply. We discussed this internally weeks ago already but I did not follow up yet with you. Please always feel free to mail me directly since we worked together for such a long time.
“Constraint indexes (primary keys, anyway, which are implemented as unique indexes) are not built at the same point as the other indexes, and in fact Data Pump isn’t the entity directing creation of those indexes. PK indexes are also not excluded if you EXCLUDE=INDEX, because they are table objects, not index objects.
We do import other constraints in parallel, as we do indexes, since 220.127.116.11. But I think that PK constraints are not done that way because they are created when the table is built. That’s my recollection, but if there is a logfile showing what he is seeing, that might help.”
I shared your log with Bill and Roy. If something strange is shown, I will follow up with you via email.