intel / intel-data-center-gpu-driver-for-openshift
Intel Data Center GPU Drivers for Red Hat OpenShift Container Platform
License: Apache License 2.0
Includes OOT Intel GPU driver release for RHOCP 4.13.10
Checklist:
The Kernel Application Binary Interface (kABI) is a set of in-kernel symbols used by drivers and other kernel modules. Currently, the general idea is to rebuild and test the Intel GPU driver container image whenever the kernel version associated with a particular OCP z stream changes. This is the safest approach. Unfortunately, it requires continuous rebuild and test efforts that can be facilitated by automation but still carry a non-zero cost. It may be possible to reduce rebuild efforts based on the theory that no rebuild is required if the kernel ABI does not change across all z streams in a particular OCP minor version X.Y.
Assuming that the driver is using the list of stable symbols for which Red Hat guarantees ABI compatibility, consider the following.
Based on RHEL KB,
The kernel-abi-stablelists packages contain reference files, /lib/modules/kabi-/kabi_stablelist_, listing interfaces provided by the kernel that are considered to be stable by Red Hat engineering. Such interfaces are safe for long-term use by third-party loadable device drivers, as well as for other purposes.
With Red Hat Enterprise Linux 7 and 8, the stablelist is valid for the particular major release. This means that once a symbol has been introduced into kABI for a particular major release, it will not be removed, nor will its meaning be changed during that kernel major release complete life cycle.
With Red Hat Enterprise Linux 9, each minor release will have a unique stablelist that is valid throughout the minor release lifecycle. For more information on this, please refer to the following knowledgebase article:
Red Hat Enterprise Linux 9 kABI Policy
Red Hat recommends recompiling kernel modules against every minor release of Red Hat Enterprise Linux.
Based on this other KB, an OCP minor version always uses a certain minor RHEL version.
RHCOS/OCP Versions | RHEL Versions |
---|---|
4.11 | RHEL 8.6 |
4.12 | RHEL 8.6 |
4.13 | RHEL 9.2 |
It would be reasonable to conclude that for OCP 4.12 based on RHEL 8.6, only one driver container image is required to support all OCP 4.12.z versions as long as the kernel ABI stays the same. Similarly, all z streams for OCP 4.13 based on RHEL 9.2 would require a single driver container image.
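The rebuild decision described above can be sketched as a comparison of the RHEL minor release embedded in two kernel release strings. This is only a minimal sketch: the helper names `rhel_minor` and `same_rhel_minor` are hypothetical, and it assumes kernel release strings carry a tag of the form `.el<major>_<minor>.`, as in the `4.18.0-372.46.1.el8_6.x86_64` example used elsewhere in this document.

```python
import re
from typing import Tuple

# Assumption: RHEL kernel release strings embed a tag like ".el8_6." that
# names the RHEL major and minor release the kernel belongs to.
_EL_TAG = re.compile(r"\.el(\d+)_(\d+)\.")


def rhel_minor(kernel_release: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a kernel release string."""
    match = _EL_TAG.search(kernel_release)
    if match is None:
        raise ValueError(f"no RHEL minor tag found in {kernel_release!r}")
    return int(match.group(1)), int(match.group(2))


def same_rhel_minor(old_release: str, new_release: str) -> bool:
    """True if both kernels belong to the same RHEL minor release."""
    return rhel_minor(old_release) == rhel_minor(new_release)
```

For example, `rhel_minor("4.18.0-372.46.1.el8_6.x86_64")` returns `(8, 6)`. A matching minor release is only a hint that a rebuild might be skipped; it does not prove that the kernel ABI is unchanged.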
The goal is to understand the pros, cons and the potential risk of this approach. Theoretically, it is possible to use the same driver container with different kernel versions as long as the kernel ABI remains stable. It is important to note that:
in very rare and special circumstances, a symbol in a kABI stablelist needs to be changed. For example, Red Hat could introduce kABI breakage when a critical security issue cannot be resolved without breaking kABI. Red Hat will inform the partners if such a situation should occur.
In general, even if rebuilds are avoided, it is reasonable to retest the existing driver container when the kernel version changes using automation to ensure compatibility and functionality.
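The assumption that the driver only uses stable symbols could be checked with a sketch like the one below. It compares the symbols a module requires against a stablelist. Both input formats are assumptions: the stablelist is taken to be a text file with bracketed section headers and one symbol per line, and the required symbols are taken from `modprobe --dump-modversions` style output (one `CRC symbol` pair per line, available only when the module is built with symbol versioning). All function names are hypothetical.

```python
from typing import List, Set


def parse_stablelist(text: str) -> Set[str]:
    """Collect symbol names, skipping blank lines, [section] headers and comments."""
    symbols = set()
    for line in text.splitlines():
        name = line.strip()
        if not name or name.startswith("[") or name.startswith("#"):
            continue
        symbols.add(name)
    return symbols


def parse_modversions(text: str) -> Set[str]:
    """Collect symbol names from lines shaped like '0x<crc> <symbol>'."""
    symbols = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("0x"):
            symbols.add(fields[1])
    return symbols


def symbols_outside_stablelist(required: Set[str], stable: Set[str]) -> List[str]:
    """Return required symbols that the stablelist does not cover."""
    return sorted(required - stable)
```

An empty result would support reusing the driver container across z streams. Any symbol in the result is one for which no ABI stability is promised, so retesting or rebuilding remains the safer choice.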
The dockerfile copies unnecessary files and directories to the final UBI minimal based driver container image. The goal is to copy only the necessary ko files and firmware binaries to keep the driver container image size as small as possible.
Initial analysis of the current driver container image shows we can safely copy all *.ko and *.ko.xz files under /lib/modules/4.18.0-372.46.1.el8_6.x86_64/. The following ko files would be copied. Note: *.ko.xz files are not shown below since there are several. The xz extension indicates a compressed ko file, so it takes less space than a traditional ko file.
For firmware binaries, the proposed solution is to copy only dg2* firmware binaries since those are the only binaries that are needed for Intel Data Center GPU Flex Series. We will continue to copy the copyright license in /firmware/i915/license/ and the firmware binaries in /firmware/i915/.
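The selection rules proposed above can be expressed as a small filter. This is a sketch under stated assumptions, not the actual build logic: the function name `should_copy` is hypothetical, and paths are assumed to be absolute POSIX paths inside the builder image.

```python
import posixpath
from fnmatch import fnmatchcase


def should_copy(path: str, kernel_release: str) -> bool:
    """Decide whether a builder-image file belongs in the final image."""
    name = posixpath.basename(path)
    modules_root = f"/lib/modules/{kernel_release}/"

    # Kernel modules: plain and xz-compressed ko files for this kernel only.
    if path.startswith(modules_root):
        return name.endswith(".ko") or name.endswith(".ko.xz")

    # The license directory is always kept.
    if path.startswith("/firmware/i915/license/"):
        return True

    # Firmware: only dg2* binaries directly under /firmware/i915/.
    if posixpath.dirname(path) == "/firmware/i915":
        return fnmatchcase(name, "dg2*")

    return False
```

With this filter, a firmware file such as `/firmware/i915/dg2_guc_70.9.1.bin` is kept, while firmware for other platforms and non-module files under /lib/modules are dropped.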
How to handle Red Hat UBI Base Image Security/CVE Vulnerability for OOT driver container Image.
Following the suggestion from the KMM Operator project, the Red Hat UBI-minimal base image is used to package the Intel Data Center GPU driver container image.
During the Red Hat certification process, a CVE vulnerability was found in this base image. The vulnerability comes from the curl package and is tracked as CVE-2023-23916.
To resolve this vulnerability and pass Red Hat certification, we have to rebuild the image on a newer UBI-minimal base image that includes the curl package with the CVE fix.
However, this CVE raises the following potential problems that deserve attention.
Is the UBI-minimal base image good and safe enough for the OOT driver container image?
The safest image is one that includes only the necessary packages; every unnecessary package brings potential vulnerability risk to the image. In this case, the Intel Data Center GPU driver container dockerfile clearly does not use the curl package at all. So is it possible to have an OOT driver-specific base image that includes only the packages needed by OOT driver container images? Or a minimal base image on which OOT driver container users install the packages they need? To insmod the OOT driver module, the container has to run with elevated privileges, so any potential vulnerability might be very dangerous for the whole cluster.
Do all published OOT driver container images need to be updated and go through the certification and publishing process again?
If a vulnerability is found in the base image, do we really need to rebuild the image, go through the certification process, and publish a new version of the driver container image? In terms of certification and publishing, this would be a huge effort across all published and certified images. Imagine that all published OOT driver container images are based on this vulnerable base image and need this effort.
Upgrading the driver container image is also not an easy task; the related feature is still under development in the KMM project.
Do we have some way to relieve people of this effort?
According to the April 19 KMM upstream meeting, in the future we may even be able to package only the kernel modules, without a base image. That might be the best solution to this issue.
The Intel GPU i915 driver has a large dependency tree. It depends on several in-tree drivers present in /lib/modules/(KERNELRELEASE)/. In the scenario where an out of tree driver depends on in-tree driver(s), the user needs to copy in-tree drivers into /opt/lib/modules/(KERNELRELEASE)/ rather than leveraging the in-tree drivers directly on the host. This is because everything is done in /opt including modprobe -d /opt (done explicitly by KMM) as well as depmod -b /opt (done in dockerfile).
The goal is to determine if we can avoid copying in-tree drivers. Only out of tree drivers should be a part of the driver container image.
This is a complex problem where a solution may not be feasible. If we continue to use /opt to avoid tainting the default /lib directory, it would require copying in-tree drivers from the host by using a modules.dep file (this file would contain the dependency list and would be generated after placing the out of tree driver modules in /lib). From a KMM perspective, it would receive a driver container with exclusively out of tree drivers and then use the supplied modules.dep to copy over the necessary in-tree drivers to successfully load the out of tree driver. This solution could involve host mounting and symlinks.
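The modules.dep idea above can be sketched as follows. The code assumes the usual modules.dep layout of one `module: dependency dependency ...` entry per line with paths relative to the kernel's module directory, and it returns the in-tree modules that would have to be copied for a given set of out of tree modules. The function names are hypothetical and this is not how KMM works today.

```python
from typing import Dict, Iterable, List


def parse_modules_dep(text: str) -> Dict[str, List[str]]:
    """Map each module path to the list of module paths it depends on."""
    deps = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        target, _, rest = line.partition(":")
        deps[target.strip()] = rest.split()
    return deps


def in_tree_deps_to_copy(deps: Dict[str, List[str]],
                         oot_modules: Iterable[str]) -> List[str]:
    """Return every dependency of the out of tree modules that is not itself out of tree."""
    oot = set(oot_modules)
    needed = set()
    seen = set()
    stack = list(oot)
    while stack:
        module = stack.pop()
        if module in seen:
            continue
        seen.add(module)
        for dep in deps.get(module, []):
            if dep not in oot:
                needed.add(dep)
            stack.append(dep)  # follow dependencies of dependencies too
    return sorted(needed)
```

The returned list is what a loader would have to make available under /opt/lib/modules/(KERNELRELEASE)/, whether by copying, host mounting or symlinks.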
There are new RHEL 9.2 based GPU drivers to provision Intel GPU Flex and Max Series. Good news: the new drivers no longer have an incompatibility with the ast driver. On RHEL 8.6 based OCP 4.12, the ast driver needed to be unloaded or blacklisted (via a machine config, which triggers a reboot) prior to loading the out of tree GPU drivers.
Challenges:
The in-tree i915 and intel_vsec drivers have to be unloaded prior to loading the out of tree drivers. KMM can only unload one in-tree driver as of now, but we now have a use case for unloading more than one in-tree driver. Short-term potential solution: unload intel_vsec outside of KMM, most likely using a machine config.
Once the out of tree drivers are loaded, unloading them is difficult because they are always in use by a GUI subcomponent, i.e. the framebuffer. The exact root cause has not been determined, but once the out of tree drivers are loaded, the GPU is actively used by a component in the system that prevents the drivers from being unloaded. More exploration is needed to find the root cause due to the complexity. The lsof command was used to determine what was using the driver but did not provide any additional information.
2 components have changed:
KMM has a feature available on version 1.1.1 that can be used to unload 1 in-tree driver.
We can use this feature to unload in-tree i915. We cannot unload more than one kmod. We now have a use case to unload more than 1 in-tree driver. This includes i915 and intel_vsec for now and potentially cse in future.
3 Main Drivers for GPU: i915, intel_vsec (this is a prerequisite for i915), CSE (MEI)
Out of tree drivers behavior: Loading i915 driver will load the intel_vsec driver. Unloading i915 will unload intel_vsec.
In-tree driver behavior: Loading i915 does not load intel_vsec. Unloading i915 does not unload intel_vsec.
RHEL 9.2 OCP 4.13 has a new kernel based on 5.14.z upstream kernel. This is a huge jump from RHEL 8.6 based OCP 4.12 which used 4.18.z upstream kernel.
There are in-tree i915 and intel_vsec drivers in RHEL 9.2 (not loaded by default; they are only loaded by the kernel when it detects the GPU card via its PCI device ID). These two in-tree drivers do not support Intel GPU Flex or Max Series. The in-tree i915 driver provides display support functionality for Intel Client Arc GPUs. As a result, customers will notice the following message in dmesg:
sh-5.1# dmesg | grep graphics
[ 12.385679] i915 0000:33:00.0: Your graphics device 56c0 is not properly supported by the driver in this
[ 478.732896] i915 0000:33:00.0: Your graphics device 56c0 is not properly supported by the driver in this
Intel® Data Center GPU Flex 170 -> PCI ID is 56c0.
If the in-tree intel_vsec driver is not unloaded prior to loading the out of tree i915 driver, unknown symbol errors are observed in dmesg.
[ 3238.466900] compat: loading out-of-tree module taints kernel.
[ 3238.466931] compat: module verification failed: signature and/or required key missing - tainting kernel
[ 3238.468361] COMPAT BACKPORTED INIT
[ 3238.468362] Loading modules backported from I915-23.6.37
[ 3238.468363] Backport generated by backports.git I915_23.6.37_PSB_230425.49
[ 3239.444973] i915: Unknown symbol intel_vsec_register (err -2)
[ 3271.091366] i915: Unknown symbol intel_vsec_register (err -2)
[ 3317.364301] i915: Unknown symbol intel_vsec_register (err -2)
[ 3376.362727] i915: Unknown symbol intel_vsec_register (err -2)
When we unload the in-tree intel_vsec driver and do nothing else different, the above issue is not observed.
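When retesting with automation, the unknown symbol errors shown above could be detected by scanning dmesg text. The following is a minimal sketch that only assumes the `<module>: Unknown symbol <name> (err <n>)` line shape seen in the log above; the function name is hypothetical.

```python
import re
from typing import Dict, List

# Matches lines such as:
# "[ 3239.444973] i915: Unknown symbol intel_vsec_register (err -2)"
_UNKNOWN_SYMBOL = re.compile(r"(\S+): Unknown symbol (\S+) \(err (-?\d+)\)")


def unknown_symbols(dmesg_text: str) -> Dict[str, List[str]]:
    """Group unresolved symbol names by the module that failed to load."""
    found = {}
    for match in _UNKNOWN_SYMBOL.finditer(dmesg_text):
        module, symbol = match.group(1), match.group(2)
        found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}
```

Repeated log lines for the same symbol collapse into a single entry, so the result names each missing symbol once per module.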
When you delete the KMM module CR, it unloads the out of tree i915 driver via a PreStop Hook, but it does not reload the in-tree i915 driver. This is by KMM design. Essentially, the kernel is tainted. When KMM tries to clean up, it is unable to unload the out of tree i915 driver as it says it is in use.
We are also unable to manually unload the out of tree i915 or intel_vsec driver.
sh-5.1# modprobe -rv intel_vsec
modprobe: FATAL: Module intel_vsec is in use.
sh-5.1# modprobe -rv i915
modprobe: FATAL: Module i915 is in use.
lsmod output after the out of tree drivers are loaded; keep an eye on the use count, which is the 3rd column.
sh-5.1# lsmod | grep i915
i915 3977216 4
intel_vsec 20480 1 i915
intel_gtt 24576 1 i915
compat 24576 2 intel_vsec,i915
video 61440 1 i915
drm_display_helper 172032 2 compat,i915
cec 61440 2 drm_display_helper,i915
i2c_algo_bit 16384 2 ast,i915
drm_kms_helper 192512 5 ast,drm_display_helper,i915
drm 581632 7 drm_kms_helper,compat,ast,drm_shmem_helper,drm_display_helper,i915
sh-5.1# lsmod | grep intel_vsec
intel_vsec 20480 1 i915
compat 24576 2 intel_vsec,i915
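To make the use counts easier to reason about, the lsmod output can be parsed and compared against the listed dependent modules. The sketch below is a heuristic, not a root-cause tool: it assumes that a use count higher than the number of modules named in the 4th column points to holders that are not kernel modules. The function names are hypothetical.

```python
from typing import Dict, List


def parse_lsmod(text: str) -> Dict[str, dict]:
    """Parse 'name size use_count [user,user,...]' lines from lsmod output."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a module row.
        if len(fields) < 3 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[fields[0]] = {
            "size": int(fields[1]),
            "use_count": int(fields[2]),
            "users": users,
        }
    return modules


def held_by_non_modules(modules: Dict[str, dict]) -> List[str]:
    """Modules whose use count exceeds the number of listed dependent modules."""
    return sorted(
        name for name, info in modules.items()
        if info["use_count"] > len(info["users"])
    )
```

Applied to the output above, i915 has a use count of 4 with no dependent modules listed, which is consistent with modprobe reporting that the module is in use even though no other module depends on it.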
Documenting a dependency diagram for the out of tree GPU drivers has been noted as a future exercise.
Summary:
The build argument $KERNEL_FULL_VERSION was understood to be populated automatically by KMM in the dockerfile. It was determined that $KERNEL_FULL_VERSION is not populated in the dockerfile but is present in the KMM module. Currently, only ${KERNEL_VERSION} is available as a default build argument automatically populated by KMM in the dockerfile. An issue has been created on KMM upstream to ensure that KMM provides $KERNEL_FULL_VERSION as a default build argument to avoid user confusion and maintain consistency.
Impact:
Analysis shows that $KERNEL_FULL_VERSION is an empty string when the variable is echoed in the dockerfile. Since the variable is empty, the copy instruction becomes COPY --from=builder /lib/modules/ /opt/lib/modules/. This copies everything under /lib/modules in the builder image into the final image.
We also noticed this in the build log: [Warning] one or more build args were not consumed: [KERNEL_VERSION].
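The effect of the empty build argument can be illustrated with a small simulation of the variable expansion. This is only an illustration: it assumes the copy source is written as `/lib/modules/${KERNEL_FULL_VERSION}/` in the dockerfile, which is inferred from the resulting instruction above, and the function name is hypothetical.

```python
import posixpath
from string import Template
from typing import Dict

# Assumed shape of the COPY source path in the dockerfile.
_COPY_SOURCE = Template("/lib/modules/${KERNEL_FULL_VERSION}/")


def expanded_copy_source(build_args: Dict[str, str]) -> str:
    """Expand the copy source the way an unset build argument would: as an empty string."""
    value = build_args.get("KERNEL_FULL_VERSION", "")
    expanded = _COPY_SOURCE.substitute(KERNEL_FULL_VERSION=value)
    # Collapse the doubled slash left behind by the empty value.
    return posixpath.normpath(expanded)
```

With no `KERNEL_FULL_VERSION` supplied, the source collapses to `/lib/modules`, i.e. the whole modules tree, which matches the behavior described above. With a value such as `4.18.0-372.46.1.el8_6.x86_64`, only that kernel's directory is selected.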
Short-term solution:
Add this in the 2nd stage of the dockerfile:
ARG KERNEL_VERSION
ARG KERNEL_FULL_VERSION=${KERNEL_VERSION}
Use ${KERNEL_FULL_VERSION} as needed in the 2nd stage.
Long-term solution:
KMM will add and automatically populate $KERNEL_FULL_VERSION as a default build argument in the dockerfile. With this, users can use $KERNEL_FULL_VERSION as a default build argument in both the dockerfile and the KMM module. This would be available in a later KMM release, most likely KMM 1.1 or later.
Once this is available, we will make the following change:
Before:
ARG KERNEL_VERSION
ARG KERNEL_FULL_VERSION=${KERNEL_VERSION}
After:
ARG KERNEL_FULL_VERSION
Includes OOT Intel GPU driver release for RHOCP 4.14.0 and above
Checklist:
MEI warnings were observed in dmesg when out of tree i915 driver is loaded on RHEL 9.2 based OCP 4.13.10. Note, MEI is the original name of the driver. The out of tree driver equivalent is called CSE. Refer to dmesg output below. The goal of this issue is to understand if these warnings are expected, what is the meaning behind the warnings, and what is the root cause and potential solution.
This behavior may be expected as we are not unloading the in-tree CSE (aka MEI) driver. As a result, the out of tree MEI driver is never loaded and thus the in-tree MEI is potentially trying to use out of tree FW to initialize. This is an incompatibility.
A potential solution is to unload the in-tree MEI driver and see if the out of tree MEI driver loads successfully and this error goes away. This solution needs to be tested. This is another potential use case for KMM to unload more than one in-tree driver.
### dmesg output:
[ 3567.884499] intel_vsec 0000:38:00.0: enabling device (0140 -> 0142)
[ 3567.885148] intel_vsec 0000:3d:00.0: enabling device (0140 -> 0142)
[ 3569.550055] [drm] I915 BACKPORTED INIT
[ 3569.551787] i915 0000:37:00.0: [drm] GT count: 1, enabled: 1
[ 3569.551829] clipped [mem 0x000a0000-0x000bffff] to [mem 0x00100000-0x000bffff] for e820 entry [mem 0x0009f000-0x000fffff]
[ 3569.551840] clipped [mem 0x000c8000-0x000cffff] to [mem 0x00100000-0x000cffff] for e820 entry [mem 0x0009f000-0x000fffff]
[ 3569.553764] i915 0000:37:00.0: [drm] Using Transparent Hugepages
[ 3569.556825] i915 0000:37:00.0: [drm] Local memory IO size: 0x000000013cc00000
[ 3569.556827] i915 0000:37:00.0: [drm] Local memory available: 0x000000013cc00000
[ 3569.562817] i915 0000:37:00.0: [drm] GT0: GuC firmware i915/dg2_guc_70.9.1.bin version 70.9.1
[ 3569.562821] i915 0000:37:00.0: [drm] GT0: HuC firmware i915/dg2_huc_7.10.3_gsc.bin version 7.10.3
[ 3569.576980] i915 0000:37:00.0: [drm] GT0: GUC: submission enabled
[ 3569.576983] i915 0000:37:00.0: [drm] GT0: GUC: SLPC enabled
[ 3569.577276] i915 0000:37:00.0: [drm] GT0: GUC: RC enabled
[ 3569.590333] i915 0000:37:00.0: GT0: local0 bcs'0.0 clear bandwidth:74358 MB/s
[ 3569.591180] [drm] Initialized i915 1.6.0 20201103 for 0000:37:00.0 on minor 1
[ 3569.623597] i915 0000:3c:00.0: [drm] GT count: 1, enabled: 1
[ 3569.623600] mei_gsc i915.mei-gscfi.14080: FW not ready: resetting: dev_state = 2 pxp = 0
[ 3569.623617] clipped [mem 0x000a0000-0x000bffff] to [mem 0x00100000-0x000bffff] for e820 entry [mem 0x0009f000-0x000fffff]
[ 3569.623621] clipped [mem 0x000c8000-0x000cffff] to [mem 0x00100000-0x000cffff] for e820 entry [mem 0x0009f000-0x000fffff]
[ 3569.623642] mei_gsc i915.mei-gscfi.14080: unexpected reset: dev_state = ENABLED fw status = 00000345 84670000 00000000 00000000 E0020002 00000000
[ 3569.624324] mei_gsc i915.mei-gsc.14080: FW not ready: resetting: dev_state = 2 pxp = 2
[ 3569.624349] mei_gsc i915.mei-gsc.14080: unexpected reset: dev_state = ENABLED fw status = 00000345 84670000 00000000 00000000 E0020002 00000000