What does datapatch do when you have different components in PDBs?

Over the last weeks, literally weeks, I have spent a lot of time on several patching topics. And in order to clean up my inbox a bit, I will dump what I learned into several blog posts. This one today is about an interesting case I came across when a customer ran into problems with the runtime of datapatch in an ExaCC environment. Hence, today’s topic is: What does datapatch do when you have different components in PDBs?

Annotation

If you are on 19.16.0 or higher, then you can safely ignore this blog post since the fix we discuss below has been included from 19.16.0 onward.

Different components?

In a Multitenant environment you can choose various strategies for database components. When you create your CDB to start with, the DBCA offers you some choices as soon as you enter the “Advanced Configuration” and “Custom Database” path.

Many customers choose the following strategy:

  1. Have all components in CDB$ROOT
  2. Have only the minimum set of components in PDB$SEED

This strategy leads to the situation that a fresh PDB has only a minimal set of components. But potentially required components can be installed afterwards when needed. And it also allows you to plug in any non-CDB since all the components exist in CDB$ROOT, and hence no conflicts will be raised.

Now let us compare the component sets in CDB_REGISTRY. Ensure in every Multitenant environment that _exclude_seed_cdb_view=false, as otherwise you won’t see the components in PDB$SEED.
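A minimal sketch of how to change this underscore parameter (assuming your release allows changing it dynamically – otherwise set it in the spfile and restart; and as with every underscore parameter, check with Oracle Support before touching it in production):

SQL> alter system set "_exclude_seed_cdb_view" = false scope=both;

With that in place, you will find an output such as this one: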

SQL> select con_id, comp_id, comp_name from cdb_registry order by 1,2;

CON_ID COMP_ID	  COMP_NAME
------ ---------- --------------------------------------------------
     1 CATALOG    Oracle Database Catalog Views
     1 CATJAVA    Oracle Database Java Packages
     1 CATPROC    Oracle Database Packages and Types
     1 JAVAVM     JServer JAVA Virtual Machine
     1 OLS        Oracle Label Security
     1 ORDIM      Oracle Multimedia
     1 OWM        Oracle Workspace Manager
     1 RAC        Oracle Real Application Clusters
     1 XDB        Oracle XML Database
     1 XML        Oracle XDK
     2 CATALOG    Oracle Database Catalog Views
     2 CATPROC    Oracle Database Packages and Types
     2 RAC        Oracle Real Application Clusters
     2 XDB        Oracle XML Database
     3 CATALOG    Oracle Database Catalog Views
     3 CATPROC    Oracle Database Packages and Types
     3 RAC        Oracle Real Application Clusters
     3 XDB        Oracle XML Database
     4 CATALOG    Oracle Database Catalog Views
     4 CATPROC    Oracle Database Packages and Types
     4 RAC        Oracle Real Application Clusters
     4 XDB        Oracle XML Database

The CDB$ROOT has more components than PDB$SEED and the PDBs.

What does this have to do with datapatch?

Now you may ask yourself: How is this related to patching, and especially datapatch?

And I confess I wasn’t aware either that there is (currently) a relation. The customer case I was involved in for several weeks started as a complaint about datapatch runtimes. With the change from 19.12.0 to 19.13.0 and the application of several one-off patches, some of them related to Data Pump, the patching took significantly longer. For the Data Pump topic you may read another blog post in the following days.

But when we started digging into the log files something I was not aware of became clear:

datapatch takes a different route when you have different components in PDBs compared to CDB$ROOT.

And this different route leads to a loss of parallel capabilities. I know, this sounds a bit weird, but read on if you are interested in the gory details. And a comment upfront: the datapatch team is already working on a solution. It is just not as simple as it may look at first sight.

The log structure of datapatch

Let me start this journey with the log files. I can tell you: datapatch writes a lot of logs. In an environment with many databases you can easily accumulate some GBs over time on your machine. You will find all the datapatch logs under: $ORACLE_BASE/cfgtoollogs/sqlpatch

I know that some of you may now log on to your database servers and check the size of this directory right away. But keep in mind before wiping it out: when you open an SR complaining about patch runtimes, it is essential to have the logs. You may complain all day long about slow datapatch – but the details are all in here. Hence, in such cases you may need to zip the entire directory and upload it to MOS.
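If you want to check how much space these logs occupy, and package them for an SR, something along these lines will do (just a sketch – adjust the paths and the zip file name to your environment):

$ du -sh $ORACLE_BASE/cfgtoollogs/sqlpatch
$ cd $ORACLE_BASE/cfgtoollogs
$ zip -r sqlpatch_logs.zip sqlpatch       # attach the resulting zip to your SR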

The structure of this subdirectory looks like this:

$  find . -maxdepth 1 -printf '%TY-%Tm-%Td %TH:%TM %f\n'
2022-01-19 23:16 .
2019-04-18 23:27 sqlpatch_26498_2019_04_18_23_27_00
2022-01-19 23:16 sqlpatch_history.txt
2019-04-18 23:27 29314339
2019-04-18 23:59 sqlpatch_30226_2019_04_18_23_59_24
2019-04-28 00:51 sqlpatch_27456_2019_04_28_00_50_15
2019-04-28 00:50 29517242
2019-10-16 19:42 sqlpatch_8582_2019_10_16_19_40_28
2019-10-16 19:40 30125133
2019-10-16 20:08 sqlpatch_14343_2019_10_16_20_07_36
2019-10-16 20:07 30138470
2019-10-16 20:11 sqlpatch_15340_2019_10_16_20_09_24
2020-01-21 21:23 sqlpatch_16010_2020_01_21_21_20_48
2020-01-21 21:21 30557433
2020-01-21 21:44 sqlpatch_20635_2020_01_21_21_43_12
2020-01-21 21:43 30593149
2020-01-21 21:48 sqlpatch_21868_2020_01_21_21_47_13
2020-04-14 23:50 sqlpatch_12047_2020_04_14_23_49_37
2020-04-14 23:50 30886680
2020-04-14 23:53 sqlpatch_13154_2020_04_14_23_51_50
2020-04-15 00:17 sqlpatch_18226_2020_04_15_00_14_23
2020-04-15 00:16 30869156
2020-07-15 10:29 sqlpatch_13501_2020_07_15_10_25_31
2020-07-15 10:27 31281355
2020-07-15 10:55 sqlpatch_21542_2020_07_15_10_54_08
2020-07-15 10:55 31312468
2020-07-15 11:02 sqlpatch_23127_2020_07_15_11_00_12
2020-10-21 11:33 sqlpatch_9132_2020_10_21_11_32_20
2020-10-21 11:33 31741641
2020-10-21 11:35 sqlpatch_10552_2020_10_21_11_34_42
2020-10-21 11:56 sqlpatch_15467_2020_10_21_11_52_34
2020-10-21 11:55 31771877
2021-01-20 00:33 sqlpatch_11395_2021_01_20_00_33_39
2021-01-20 00:33 32228578
2021-01-20 00:35 sqlpatch_12610_2021_01_20_00_35_03
2021-01-20 01:04 sqlpatch_19749_2021_01_20_01_00_14
2021-01-20 01:00 32218454
2021-04-21 14:43 sqlpatch_2302_2021_04_21_14_40_26
2021-04-21 14:40 32545013
2021-04-21 15:09 sqlpatch_7267_2021_04_21_15_09_24
2021-04-21 15:09 32507738
2021-04-21 15:15 sqlpatch_8871_2021_04_21_15_14_28
2021-08-09 16:06 sqlpatch_16234_2021_08_09_16_04_00
2021-08-09 16:04 32904851
2021-12-15 23:32 sqlpatch_25729_2021_12_15_23_29_47
2021-12-15 23:30 33192793
2021-12-16 00:10 sqlpatch_30410_2021_12_16_00_09_28
2021-12-16 00:09 33192694
2021-12-16 09:08 sqlpatch_13274_2021_12_16_09_08_23
2021-12-16 09:08 33261817
2021-12-16 09:13 sqlpatch_14945_2021_12_16_09_13_21
2022-01-19 21:48 sqlpatch_29286_2022_01_19_21_46_51
2022-01-19 21:47 33515361
2022-01-19 22:14 sqlpatch_485_2022_01_19_22_13_40
2022-01-19 22:14 33561310
2022-01-19 22:25 sqlpatch_4600_2022_01_19_22_25_31
2022-01-19 22:25 33587128
2022-01-19 22:31 sqlpatch_6012_2022_01_19_22_30_32
2022-01-19 23:18 sqlpatch_12220_2022_01_19_23_16_38

This is pretty wild – but you may recognize subdirectories with only a number (e.g. 33587128) and subdirectories starting with “sqlpatch” (e.g. sqlpatch_12220_2022_01_19_23_16_38). The above listing tells you the datapatch history in my 19c environment back to April 2019. Subdirectories having “sqlpatch” in the name contain the datapatch execution information for a specific database:

├── sqlpatch_485_2022_01_19_22_13_40
    ├── install1.sql
    ├── sqlpatch_autorecomp_CDBROOT.log
    ├── sqlpatch_autorecomp_PDBSEED.log
    ├── sqlpatch_catcon_0.log
    ├── sqlpatch_catcon__catcon_485.lst
    ├── sqlpatch_debug.log
    ├── sqlpatch_invocation.log
    ├── sqlpatch_progress.json
    └── sqlpatch_summary.json

The number-only subdirectories contain a subdirectory with the patch ID, and in there you will find the logs for each container in case it is a Multitenant environment, e.g.:

[CDB2] oracle@hol:/u01/app/oracle/cfgtoollogs/sqlpatch/33515361/24589353
$ ls
33515361_apply_CDB2_CDBROOT_2022Jan19_21_47_29.log
33515361_apply_CDB2_PDBSEED_2022Jan19_21_47_54.log
33515361_apply_UP19_2022Jan19_23_17_14.log
33515361_ru_apply_CDB2_CDBROOT_2022Jan19_21_47_28.log
33515361_ru_apply_CDB2_PDBSEED_2022Jan19_21_47_53.log
33515361_ru_apply_UP19_2022Jan19_23_17_13.log

The important log files in our case are the sqlpatch_invocation.log and the installn.sql files.

In the sqlpatch_invocation.log we will find the patches to be rolled back and applied, but most importantly also the actual patching queue. And this patching queue tells us whether the containers in a Multitenant environment will be patched in parallel, or just serially.

Patching containers in parallel, or just serially?

What we found in this customer case were different patching queues noted in the sqlpatch_invocation.log.

Fast datapatch runs looked like this:

[2022-01-13 18:41:59] Checking prereqs for queue entry for PDBs CDB$ROOT,PDB$SEED,MYPDB1

While slow runs had such queues in the sqlpatch_invocation.log:

[2022-01-13 18:40:16] Checking prereqs for queue entry for PDBs CDB$ROOT,PDB$SEED
[2022-01-13 18:40:59] Checking prereqs for queue entry for PDBs MYPDB2

You may wonder now: What is the problem?

But at the same time you will find either a single install1.sql in the directory (when you have only a single patch queue) or multiple files, e.g. install1.sql and install2.sql.

So without even looking at your sqlpatch_invocation.log, you can diagnose it yourself (a quick check is sketched right after this list):

  1. You have only a single install1.sql in the directory ==> same components in all containers
  2. You have an install1.sql and an install2.sql in the directory ==> the PDB(s) have fewer components than CDB$ROOT
  3. You have more than two installn.sql files in the directory ==> the PDBs have different component sets among each other, and fewer components than CDB$ROOT
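A quick way to check this on your own system – purely a sketch, assuming the directory layout shown above – is to count the installn.sql files of the most recent datapatch invocation:

$ cd $ORACLE_BASE/cfgtoollogs/sqlpatch
$ latest=$(ls -dt sqlpatch_*/ | head -1)      # newest invocation directory
$ ls ${latest}install*.sql | wc -l            # 1 = one queue, more = split queues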

But why is this of any importance?

Normally you would expect datapatch to act in parallel in a Multitenant environment. And normally, it does work in parallel. So in case (1.) above you will find that datapatch runs the required actions at first in CDB$ROOT, and then in the PDBs, including PDB$SEED, in parallel. This would lead for example to:

--worker0--CDB$ROOT
     |
     |--worker0---PDB$SEED
     |--worker1---MYPDB1
     |--worker2---MYPDB2

This means, once worker 0 is done with CDB$ROOT, it can progress with PDB$SEED while the additional workers take care of the PDBs at the same time. And in case you ask yourself how many workers may be started: this depends on the number of cores. The parallel degree is determined at first by the underlying catcon.pm. But datapatch may throttle it since on systems with a lot of CPUs and many PDBs you may otherwise end up with too many parallel workers. In AutoUpgrade we do a similar thing and prevent exceeding the number of available processes.
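Just to illustrate the throttling idea – this is not the actual datapatch or catcon.pm code, merely a sketch of the principle with a hypothetical number of containers:

$ cpus=$(nproc)
$ pdbs=3                                      # hypothetical number of containers to patch
$ workers=$(( cpus < pdbs ? cpus : pdbs ))    # cap the workers by CPUs and containers
$ echo "illustrative worker cap: $workers"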

Back to reality – what did we get with different components in the PDBs compared to CDB$ROOT?

--worker0--CDB$ROOT
      |
      |--worker0---PDB$SEED---MYPDB1---MYPDB2
      |--worker1---

This looks … not parallel but serial.

So this is what happens when you have PDBs whose component set is not the same as in CDB$ROOT.

Is this a bug?

Of course, the first question you may have is whether this is a bug.

It isn’t. But it clearly isn’t ideal.
Hence, let me clarify why this happens.

The reason for this behavior is a different component set in the PDBs compared to CDB$ROOT. When datapatch needs to apply the SQL and PL/SQL changes from an RU or RUR, it looks for components. The idea of this check is simply that datapatch shouldn’t waste time running, for instance, an SDO script when you don’t have SDO in your database.

Hence, generally, databases with fewer options will not only be upgraded faster but also patched faster since potentially fewer scripts will be run.

Let us check the sqlpatch_invocation.log again since it will tell you the facts:

CDBSLOW 
  'patch_string' => 'IR: 32876380/24269510 29213893/24262223 29309073/24433914 33027539/24390892 33144001/24351857 33502645/24472857 IA: 33192694/24421575 29213893/24384541 29309073/24493062 33721661/24575622 RI: 33192793/24462514 19.12.0.0.0 Release_Update 210716141810 NO: ',
    'pdbs' => [
      'CDB$ROOT',
      'PDB$SEED'
    ]
  'patch_string' => 'IR: 32876380/24269510 29213893/24262223 29309073/24433914 33027539/24390892 33144001/24351857 33502645/24472857 IA: 33192694/24421575 29213893/24384541 29309073/24493062 33721661/24575622 RI: 33192793/24462514 19.12.0.0.0 Release_Update 210716141810 NO: ',
    'pdbs' => [
      'PDB$SEED'
    ]
  'patch_string' => 'IR: 32876380/24269510 29213893/24262223 29309073/24433914 33027539/24390892 33144001/24351857 33502645/24472857 IA: 33192694/24421575 29213893/24384541 29309073/24493062 33721661/24575622 RI: 33192793/24462514 19.12.0.0.0 Release_Update 210716141810 NO: CONTEXT,DV,OLS,SDO,XOQ',
    'pdbs' => [
      'MYPDB1'
    ]

This is from a slow run where everything happens serially. The indicator is the PDB missing at the end of the PDB list of the first patch_string entry – only CDB$ROOT and PDB$SEED are listed there. And in the entry for MYPDB1 you see the flag “NO: CONTEXT,DV,OLS,SDO,XOQ”, which means that these components exist in CDB$ROOT and PDB$SEED but not in MYPDB1.

Just compare this to a fast run where things get parallelized instead:

CDBFAST
    'patch_string' => 'IR: 32876380/24269510 29213893/24262223 29309073/24433914 33027539/24390892 33144001/24351857 33502645/24472857 IA: 33192694/24421575 29213893/24384541 29309073/24493062 33721661/24575622 RI: 33192793/24462514 19.12.0.0.0 Release_Update 210716141810 NO: ',
    'pdbs' => [
      'CDB$ROOT',
      'PDB$SEED',
      'MYPDB1'
    ]
    'patch_string' => 'IR: 32876380/24269510 29213893/24262223 29309073/24433914 33027539/24390892 33144001/24351857 33502645/24472857 IA: 33192694/24421575 29213893/24384541 29309073/24493062 33721661/24575622 RI: 33192793/24462514 19.12.0.0.0 Release_Update 210716141810 NO: ',
    'pdbs' => [
      'PDB$SEED'
    ]
    'patch_string' => 'IR: 32876380/24269510 29213893/24262223 29309073/24433914 33027539/24390892 33144001/24351857 33502645/24472857 IA: 33192694/24421575 29213893/24384541 29309073/24493062 33721661/24575622 RI: 33192793/24462514 19.12.0.0.0 Release_Update 210716141810 NO: ',
    'pdbs' => [
      'MYPDB2'
    ]

There is an empty string in the “NO: ” flag. The components are all the same everywhere, nothing needs to be excluded.
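If you want to check your own environment quickly, grepping the invocation log for the queue entries and the patch_string lines is a good start (a sketch – the file locations are the ones shown earlier in this post):

$ grep -E "Checking prereqs for queue entry|'patch_string'" \
      $ORACLE_BASE/cfgtoollogs/sqlpatch/sqlpatch_*/sqlpatch_invocation.log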

The topic is tracked, and will be fixed, under the non-public BUG 33089192 – MOVE SET_FILE_METADATA FROM CHECK_QUEUE_PREREQS. The datapatch team is working on it but I can’t give you an estimate.

If you are on 19.16.0 or higher, then you can safely ignore this since the fix has been included from 19.16.0 onward.

Workaround

How can you workaround this behavior?

When you use opatchauto, there is no workaround since opatchauto is an orchestration tool which does not allow you to skip any step. And even calling datapatch after a regular opatch run does not help since it will build up its queue the same way.

So basically, the real solution, until datapatch chooses a better strategy, will be to have the same components installed in all containers. And I know that this isn’t really a workaround we may discuss here.
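If you really consider going down that route, a query like this one may serve as a starting point (a sketch only – it lists, per container, the components which exist in CDB$ROOT but not in that container, and it requires _exclude_seed_cdb_view=false as discussed above):

SQL> select p.con_id, r.comp_id
       from (select distinct comp_id from cdb_registry where con_id = 1) r,
            (select distinct con_id from cdb_registry where con_id > 1) p
      where not exists (select 1 from cdb_registry x
                        where x.con_id = p.con_id and x.comp_id = r.comp_id)
      order by 1, 2;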

In consequence, there is no real workaround available.

Does this affect ordinary patches as well?

You most likely won’t see this pattern with ordinary patches since those typically don’t access various components. I saw it only with RUs and RURs.

Summary

The topic I’m writing about is only of interest for you when you have a Multitenant environment. In case you have PDBs with a different component set in CDB_REGISTRY than CDB$ROOT, datapatch will build up independent patch queues to avoid running patching scripts for components which don’t exist in one or many PDBs. In such cases, unfortunately, datapatch executes the patching serially, one container after another, instead of handling many PDBs in parallel with independent workers. Hence, you won’t benefit from the parallel capabilities of your system, and the datapatch run may take longer.

This effect becomes most visible in single tenant environments where you’d expect PDB$SEED and the PDB to be treated in parallel but where they instead get patched one after another. CDB$ROOT always gets patched first – no parallel activities are expected there.

Datapatch will be enhanced in the future to accommodate such situations and leverage the parallel capabilities.

Further Links and Information

–Mike
