Can you use the UEK7 Linux kernel, or may you get some trouble?

Writing about a topic I don’t know much about is … no fun 🙂 But let me share this with you because one of my most trusted customers, Peter Lehmann, tested it and got trapped. The available information was not crystal clear, and Peter’s database did not start anymore when HUGEPAGES were enabled. So, the question is: Can you use the UEK7 Linux kernel, or may you get some trouble?

Where it all starts

Peter set up a new cluster with Oracle Linux 8.8 using a UEK7 kernel. As soon as he enabled HUGEPAGES, the database hung in MOUNT status with no sign of what was blocking the startup. Deactivating HUGEPAGES led to immediate relief: the database started up nicely as intended. But enabling HUGEPAGES again led to the same situation. Booting with the Red Hat Compatible Kernel (RHCK) worked fine, too. But this wasn’t the desired solution.

I know that Peter is precise, and he reads our docs carefully. He came across the section “Supported Oracle Linux 8 distributions for x86-64” where it says:

Minimum supported versions:

  1. Oracle Linux 8.1 with the Unbreakable Enterprise Kernel 6: 5.4.17-2011.0.7.el8uek.x86_64 or later
  2. Oracle Linux 8 with the Red Hat Compatible Kernel: 4.18.0-80.el8.x86_64 or later

Note:

Oracle recommends that you update Oracle Linux to the latest available version and release level.

Now, “Minimum supported versions” together with “Unbreakable Enterprise Kernel 6: 5.4.17-2011.0.7.el8uek.x86_64 or later” read to him (and I confess, to me too) as “everything newer”. And this would obviously cover UEK7 kernels, too.

At this point, Oracle Support pushed back, stating that this statement applies to UEK6 kernels only and does not cover UEK7 kernels. Those are in fact not supported (yet). I had heard something similar before but couldn’t pin it down.

 

Testing it

Peter then tested again and installed UEK6 (5.4.17-2136.325.5.el8uek.x86_64). He booted with this kernel, and the database started nicely as expected. Changing the boot kernel to UEK7 (5.15.0-101.103.2.1.el8uek, 5.15.0-200.131.27.el8uek) did not allow the database to start up using HUGEPAGES.

After some additional back and forth with Oracle Support, Peter pointed to my colleague Simon Coter’s blog post “Oracle Linux and Unbreakable Linux Kernel Releases”, which lists all the various kernels in a table and displays UEK7 as being supported for OL 8.8.

But … the table does not say anything about the database releases being certified with it.

In this case, MOS matters. You’d think … right? But MOS still (as of today, Nov 27, 2023) only uses the “or later” phrase, which leaves a lot of room for interpretation. Simon then added that Oracle Database 19c on OL9 is indeed supported with a UEK7 kernel. In fact, this information is visible on MOS, too:

Oracle Database 19c is certified on Oracle Linux 9 Update 0+
Minimum RU:
19.19

Minimum kernel versions:

  • Oracle Linux 9 with the Unbreakable Enterprise Kernel 7: 5.15.0-1.43.4.2.el9uek.x86_64 or later,
    or
  • Oracle Linux 9 with the Red Hat Compatible kernel: 5.14.0-70.22.1.0.2.el9_0.x86_64 or later

 

Or later?

What I learned is that “or later” does not mean “any kernel”. It means “within this UEK version”. Hence, when it says “UEK6 … or later”, it means “from this exact UEK6 version on, and any other UEK6 kernel released after it”.
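
To make this reading of “or later” concrete, here is a minimal Python sketch. It assumes that the UEK release can be inferred from the base version in the kernel string (5.4.17 for UEK6 and 5.15.0 for UEK7, as in the minimums quoted above) and that the build numbers compare numerically from left to right. The function names are made up for illustration, the el8/el9 suffix is deliberately ignored, and the sketch is no substitute for the certification information on MOS.

```python
import re

# Assumption: the UEK release is identified by the upstream base version,
# as in the minimum versions quoted from the documentation and MOS.
UEK_SERIES = {"5.4.17": "UEK6", "5.15.0": "UEK7"}


def parse_uek(release):
    """Split e.g. '5.4.17-2011.0.7.el8uek.x86_64' into (series, build tuple)."""
    m = re.match(r"^(\d+\.\d+\.\d+)-([\d.]+)\.el\d+uek", release)
    if m is None:
        raise ValueError(f"not a UEK release string: {release!r}")
    base, build = m.groups()
    return UEK_SERIES.get(base), tuple(int(part) for part in build.split("."))


def meets_minimum(running, minimum):
    """True only if 'running' is in the same UEK series as 'minimum'
    and its build number is equal or newer."""
    run_series, run_build = parse_uek(running)
    min_series, min_build = parse_uek(minimum)
    if run_series is None or run_series != min_series:
        return False  # a newer UEK series does not count as "or later"
    return run_build >= min_build


UEK6_MINIMUM = "5.4.17-2011.0.7.el8uek.x86_64"
print(meets_minimum("5.4.17-2136.325.5.el8uek.x86_64", UEK6_MINIMUM))
print(meets_minimum("5.15.0-101.103.2.1.el8uek.x86_64", UEK6_MINIMUM))
```

With the UEK6 minimum from the documentation, the UEK6 kernel Peter tested passes the check while his UEK7 kernel does not, which matches what Oracle Support stated.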

Lesson learned. And I am not the only one who wishes the information on MOS were a bit clearer and more precise to avoid such pitfalls.

And please don’t forget to check the release notes as well before you start the installation.

 

Additional Information (Jan 17, 2024)

Thanks to my colleague Kamil Budinsky from ACS/CSS in Oracle Slovakia, who did some tests and research, I would like to update this blog post with some useful information and findings. Thanks, Kamil, for sharing this with me!

I also tested the UEK7 kernel with a 19c database on OEL8 in my VirtualBox laptop setup. Single instance 12.1, 19c and 21c databases with datafiles on a filesystem all worked fine with the OEL8 UEK7 kernel. All were using huge pages; I’m using USE_LARGE_PAGES=AUTO_ONLY on my test machines.
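
Whether huge pages are actually in use on a host can be checked from the HugePages_* counters in /proc/meminfo. The following Python sketch is only an illustration: the function names are made up, and the sample text is invented rather than taken from Kamil’s machines.

```python
def hugepage_counters(meminfo_text):
    """Return the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        # HugePages_Total, _Free, _Rsvd and _Surp are plain page counts
        # without a unit; 'Hugepagesize' does not match the prefix.
        if key.startswith("HugePages_"):
            counters[key] = int(value.strip())
    return counters


def read_hugepages(path="/proc/meminfo"):
    """Read the counters from the running Linux system."""
    with open(path) as fh:
        return hugepage_counters(fh.read())


# Hypothetical sample, not real output from the systems described here.
SAMPLE = """MemTotal:       16000000 kB
HugePages_Total:    1024
HugePages_Free:      512
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""

counters = hugepage_counters(SAMPLE)
in_use = counters["HugePages_Total"] - counters["HugePages_Free"]
print(f"huge pages configured: {counters['HugePages_Total']}, in use: {in_use}")
```

If HugePages_Total is 0, no huge pages are configured, and an instance with USE_LARGE_PAGES=AUTO_ONLY cannot be using them.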

I then installed UEK7 on my 19c OEL8 RAC cluster, and soon after the 19c RAC database was started, the virtual machine rebooted.

The problem with UEK7 kernel support lies in the minimum versions and in dependent features and products such as ACFS, AFD and ASMLib.

ACFS/AFD support is published in ACFS and AFD Support On OS Platforms (Certification Matrix) (Doc ID 1369107.1). It lists the UEK7 5.15.0 kernel as supported with OEL8, starting with the 19.19 RU:

Oracle Linux 8 – All Updates, 5.15.0-6.80.3.1 and later UEK 5.15.0 kernels – 19.19.0.0.230418ACFSRU

So if ACFS/AFD is supported with the UEK7 5.15.0 kernel, the database should be as well, or not?

I spent hours trying to find out why I was getting a crash under the UEK7 kernel with RAC. There was nothing useful in the VirtualBox log files, the grid alert.log file, or the ocssd logs. Then I found the /var/crash directories with a vmcore-dmesg.txt file, in which I saw

[  159.215691] BUG: unable to handle page fault for address: ffffe873ca00d034
[  159.215709] #PF: supervisor write access in kernel mode
[  159.215719] #PF: error_code(0x0003) - permissions violation

and the stacktrace containing afd modules:

[  159.216060]  afdq_request_drop+0x127/0x140 [oracleafd]
[  159.216074]  afdq_batch_collect.isra.0+0x256/0x2a0 [oracleafd]
[  159.216438]  afdq_batch_reap+0x81/0x230 [oracleafd]
[  159.217018]  afdc_io+0x629/0x1000 [oracleafd]
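
A quick way to spot such frames in a vmcore-dmesg.txt file is to filter the stack trace by module name. The Python sketch below assumes the frame layout shown in the excerpt above (timestamp, symbol+offset/size, module in square brackets); the function name is made up, and frames in other layouts (for example those prefixed with “? ”) are simply skipped.

```python
import re

# Matches frames such as:
#   [  159.216060]  afdq_request_drop+0x127/0x140 [oracleafd]
FRAME = re.compile(
    r"^\[\s*[\d.]+\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]"
)


def module_frames(dmesg_text, module="oracleafd"):
    """Return the symbols of all stack frames that belong to 'module'."""
    frames = []
    for line in dmesg_text.splitlines():
        m = FRAME.match(line)
        if m and m.group(2) == module:
            frames.append(m.group(1))
    return frames


# Lines copied from the excerpt above.
EXCERPT = """\
[  159.215691] BUG: unable to handle page fault for address: ffffe873ca00d034
[  159.216060]  afdq_request_drop+0x127/0x140 [oracleafd]
[  159.216074]  afdq_batch_collect.isra.0+0x256/0x2a0 [oracleafd]
[  159.216438]  afdq_batch_reap+0x81/0x230 [oracleafd]
[  159.217018]  afdc_io+0x629/0x1000 [oracleafd]
"""

for symbol in module_frames(EXCERPT):
    print(symbol)
```

The first symbol returned is the innermost frame in the module, which is a useful search term on MOS.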

Searching our MOS for “afdq_request_drop 5.15.0” I found two bugs:

  • Bug 35455593 : 23CBETA2: SERVER REBOOT USING DBCA OR RMAN RESTORE
  • Bug 33677393 – Build OL8 UEK7 Compatible ACFS, AFD Modules (kernel 5.15.0-6.80.3.1 and later)(Doc ID 33677393.8)

The cause may be that it worked with the 5.15.0-0.30.19 kernel but failed with later 5.15.0 versions as soon as huge pages were used.

New AFD drivers will be built under bug 33677393.

Let’s hope that they’ll be included in the January RU.

Until it is fixed, if we are using AFD with RAC, we should use the UEK6 kernel, or UEK7 without huge pages, which is not recommended for a large SGA. I don’t know if Peter is also using AFD, but a cluster is mentioned, so it may be ASM with AFD [Mike: Yes, this is the case].

If he was using the ASMLib driver: it was included in UEK kernels up to UEK6. In UEK7 it is no longer included, as oracleasm-support will use the new io_uring kernel interface instead of the oracleasm kernel module.

And maybe a final confirmation that my current 19c grid install supports the UEK7 kernel:

[root@r191 ~]# find /u01/app/19.0.0.0/grid -name "oracleafd.ko"
/u01/app/19.0.0.0/grid/usm/install/Oracle/EL7UEK/x86_64/4.1.12-112.16.4/4.1.12-112.16.4-x86_64/bin/oracleafd.ko
/u01/app/19.0.0.0/grid/usm/install/Oracle/EL8UEK/x86_64/5.4.17-2011.0.7/5.4.17-2011.0.7-x86_64/bin/oracleafd.ko
/u01/app/19.0.0.0/grid/usm/install/Oracle/EL8UEK/x86_64/5.15.0-0.30.19/5.15.0-0.30.19-x86_64/bin/oracleafd.ko

Best regards, Kamil
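
The find listing in Kamil’s note can be turned into a short list of platform and kernel pairs. This Python sketch assumes the directory layout visible in that listing (…/usm/install/Oracle/&lt;platform&gt;/&lt;arch&gt;/&lt;kernel&gt;/…); the function name is made up for illustration, and other grid homes may be laid out differently.

```python
from pathlib import PurePosixPath


def shipped_afd_kernels(find_output):
    """Extract (platform, kernel version) pairs from oracleafd.ko paths."""
    result = []
    for line in find_output.splitlines():
        line = line.strip()
        if not line.endswith("oracleafd.ko"):
            continue  # skips the shell prompt line and anything unrelated
        parts = PurePosixPath(line).parts
        if "install" not in parts:
            continue
        # Assumed layout: install/Oracle/<platform>/<arch>/<kernel>/...
        i = parts.index("install")
        result.append((parts[i + 2], parts[i + 4]))
    return result


# Paths copied from the listing above.
LISTING = """\
/u01/app/19.0.0.0/grid/usm/install/Oracle/EL7UEK/x86_64/4.1.12-112.16.4/4.1.12-112.16.4-x86_64/bin/oracleafd.ko
/u01/app/19.0.0.0/grid/usm/install/Oracle/EL8UEK/x86_64/5.4.17-2011.0.7/5.4.17-2011.0.7-x86_64/bin/oracleafd.ko
/u01/app/19.0.0.0/grid/usm/install/Oracle/EL8UEK/x86_64/5.15.0-0.30.19/5.15.0-0.30.19-x86_64/bin/oracleafd.ko
"""

for platform, kernel in shipped_afd_kernels(LISTING):
    print(platform, kernel)
```

For this grid home the only 5.15.0 driver build is 5.15.0-0.30.19, the kernel version that Kamil names as the one that worked.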

 

Additional Information (Jan 30, 2024)

Following an internal discussion I’ve been told today that there is actually an enhancement fix for 19.21 and 19.22 available which prevents the above issue, and allows you to use UEK7U2 kernels.

Please note that officially there is no OL9 certification yet as far as I can see from Oracle Support Document 1369107.1 – ACFS and AFD Support On OS Platforms (Certification Matrix).

 

Additional Information (Feb 9, 2024)

I think I need to rewrite this blog post sooner or later if I add more “Additional Information” sections. It starts looking a bit wild. But I hope you scrolled down far enough to find this section. With 19.22.0, all seems to be fine if you read:

This note covers the issue when you apply 19.22 ACFSRU Patch 35983839 on OL8 and OL9 UEK7R2 kernels, and the patch fails to install with this error:

AFD-620: AFD is not supported on this operating system version: '5.15.0-202.135.2.el9uek.x86_64'

See the MOS note for the simple workaround to rename the old kernel, and then the patch can be installed flawlessly. Thanks to Rainer Moser for pointing me to this note, and for pushing the SR forward. Rainer confirmed that all works nicely now with 19.22 on OL9.

In addition, Rainer also pointed me to a clarifying change in the ACFS AFD Support Matrix (Doc ID 1369107.1) in Note 1:

  • Enh 35599173 – SLES15SP5 SUPPORT FOR USM (ACFS AND AFD) – KERNEL 5.14.21-150500.53.2
  • Enh 35988503 – [OL8, RHEL8] – ACFS, AFD – RHEL8.9,OL8.9 RHCK SUPPORT (4.18.0-513.EL8.X86_64 AND LATER OL8.9 RHCK AND RHEL8.9 KERNELS)
  • Enh 35697907 – [OL9, RHEL9] – 19C ACFS,AFD – OL9 UEK7, OL9.2 RHCK, RHEL9.2 SUPPORT (5.15.0-6.80.3.1 – 5.15.0-99 UEK7 AND 5.14.0-284.11.1 AND LATER OL9.2 RHCK AND RHEL9 KERNELS)
  • Enh 36028280 – [OL9, RHEL9] – ACFS,AFD – RHEL9.3, OL9.3 RHCK SUPPORT (KERNEL VERSION 5.14.0-362.8.1.EL9_3.X86_64 AND LATER OL9.3 RHCK AND RHEL9.3 KERNELS)
  • Enh 35983839 – [OL8, OL9] – ACFS, AFD – UEK7U2 SUPPORT (V5.15.0-201.135.6 AND LATER UEK7 KERNELS)
  • Enh 36114443 – [OL7,OL8] – ACFS, AFD – UEK6U3 SUPPORT (5.4.17-2136.327.1 AND LATER UEK6U3 KERNE

You see that OL8/9 and kernel information has been added to clarify more where and when you need a certain enhancement fix.

 

Further Links and Information

–Mike
