Can you select a PDB’s character set?

Can you select a PDB's character set?Since Oracle Database 12.2.0.1 a Multitenant container database can host PDBs with different character sets. This was a huge restriction in Oracle Database 12.1., at least outside the US and Canada. Here in Germany alone you may find plenty of different database character sets, for instance WE8ISO8859P1, WE8ISO8859P9, WE8ISO8859P15, WE8MSWIN1252, maybe still some old WE8DEC – and of course the unicode ones UTF8 and AL32UTF8. But the question is: Can you select a PDB’s character set when you provision a new PDB?

Mixing PDB character sets in Oracle Database 12.2.

One of the limiting restrictions in Oracle 12.1 Multitenant got lifted with Oracle 12.2: You can have PDBs with different character sets in one container database deployment. The requirement: You must create the container database with  character set AL32UTF8. This means the database character set in CDB$ROOT and therefore as well in PDB$SEED will be AL32UTF8. Once this is the case you can plugin PDBs with all sorts of character sets.

But when you provision a new PDB it will be provisioned as a clone of PDB$SEED – and of course will have the character set AL32UTF8 as well. There may be very rare situations where you want a fresh PDB with a different character set.

Changing the PDB’s character set

Can you select a PDB's character set?The tool to change a PDB’s character set is the Data Migration Assistant for Unicode (DMU). DMU is installed in all Oracle releases since Oracle 11.2.0.4 by default. Just check the $ORACLE_HOME/dmu subdirectory and start the DMU with sh dmu.sh. And find more information about the DMU here: DMU – Tips and Tricks.

But as the name indicates, the DMU is meant to change a database’s (or PDB’s) character set to Unicode. There is a option to use DMU to change to another character set with csrepair.plb. But this does not help here as one of the requirements to use csrepair.plb says: “Only repairing from single-byte to single-byte character sets or multi-byte to multi-byte character sets is allowed as no conversion of CLOB data will be attempted.”

Furthermore the old (and now unsupported) ALTER DATABASE CHARACTER SET INTERNAL_USE <new NLS_CHARACTERSET> seems to work fine at first sight, but actually doesn’t change any CLOB data. And you’ll have plenty of CLOB data in the PDB’s dictionary. Don’t use it. Your PDB won’t be in healthy state afterwards anymore.

Basically this means you can’t change a PDB’s character set from a non-unicode character set to a unicode character set which is the requirement for mixing different character sets within a single Multitenant deployment.

Potential Workaround?

Actually the only potential workaround I can think of is:

Summary

You can’t select a PDB’s character set during creation. It will always provisioned as a clone of PDB$SEED and therefore have a multi-byte character set (AL32UTF8) when you have a CDB environment being able to host different character sets in PDBs.

And as you can’t change a PDB’s character set from single-byte to multi-byte afterwards either your only workaround is to provide your own “seed” PDBs with the character set of your choice and provision new PDBs as a clone of it.

Thanks (as always) to Sergiusz Wolicki for helping me with my questions!

–Mike

Share this: