StarlingX: Uprevision the kernel to v6.6

Storyboard: 2011000

We propose aligning the StarlingX kernel to the latest long-term support (LTS) release, which is v6.6 as of this writing, to let StarlingX take advantage of upstream improvements made to the Linux kernel since the release of StarlingX's current kernel based on v5.10.

Problem description

StarlingX currently ships with kernel version 5.10, while the latest LTS release available as of this writing is version 6.6 according to kernel.org's Active Kernel Releases page. We propose aligning the StarlingX kernel to v6.6, as made available by the Yocto Project's linux-yocto repository.

The intent of this uprevision is to let StarlingX take advantage of upstream improvements to the kernel that have been made since the release of the current StarlingX kernel, with the stability and extended maintenance of an LTS release.

Updated device drivers inherited from the v6.6 kernel will let StarlingX:

  • improve support for hardware platforms currently supported by StarlingX, such as Intel Sapphire Rapids, with respect to features such as EDAC (Error Detection and Correction) and hardware performance monitor unit (PMU) counters (for perf),
  • support newer platforms and peripheral hardware; examples of the latter include storage controllers managed by device drivers such as mpt3sas, mpi3mr and megaraid_sas, and
  • take advantage of newer in-tree device drivers (such as ice, i40e and iavf) which introduce features that have so far been provided by their out-of-tree counterparts in StarlingX.

In addition, the v6.6 kernel offers newer features that are expected to improve performance, such as the Earliest Eligible Virtual Deadline First (EEVDF) scheduler and memory folios.

The v6.6 kernel also includes bug fixes that may not have been backported to the v5.10 kernel, and these fixes are expected to improve stability.

Use Cases

With the move to the v6.6 kernel, StarlingX end users and deployers will be able to use StarlingX on newer platforms that require more recent device drivers and (potentially) recent core kernel features.

We identify the following potential impacts to StarlingX's stakeholders with this kernel uprevision:

  • End users: The kernel uprevision is expected to be transparent to the current use cases of end users, thanks to the kernel's stable user-space ABI.
  • Deployers: There may be slight impacts to StarlingX deployers, due to potential changes to the kernel's configuration interfaces such as the kernel command line, sysctl options, and the procfs and sysfs file systems, in case deployers are customizing the kernel's run-time settings.
  • End users and deployers: Given that the kernel development community does not maintain a stable in-tree kernel API, the kernel uprevision will likely require the recompilation and redeployment of kernel modules that may be in use by end users and/or deployers for their downstream projects.
  • Developers: Higher-level components in StarlingX are not expected to be impacted, but as with the impacts affecting Deployers, the proposed kernel uprevision may require modifications to the StarlingX installer, Ansible playbooks and other system configuration tooling.
  • Developers: StarlingX kernel maintainers' development workflows are expected to be improved with the proposed kernel uprevision by migrating the kernel packages' changelog files and the kernel configuration from patch files to regular files in the StarlingX kernel packaging repository (starlingx/kernel).
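
For the deployer impact noted above, one way to spot customized sysctl options that no longer exist on the new kernel is to check for their entries under procfs. The following is a minimal sketch, not part of the proposal: the helper names and the list of customized options are hypothetical, and the dotted-name-to-path mapping assumes that no path component contains a dot.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name (e.g. 'vm.swappiness') to its procfs path."""
    # Assumption: no component of the name contains a dot itself. This holds
    # for most keys, but not for entries named after dotted interface names.
    return Path(root).joinpath(*name.split("."))


def missing_sysctls(names, root: str = "/proc/sys"):
    """Return the sysctl names that have no entry under `root`."""
    return [n for n in names if not sysctl_path(n, root).exists()]


if __name__ == "__main__":
    # Hypothetical list of options customized by a deployer.
    customized = ["kernel.sched_rt_runtime_us", "vm.swappiness"]
    for name in missing_sysctls(customized):
        print(f"not present on this kernel: {name}")
```

Running the script on a node booted with the new kernel lists only the options that would need attention; an empty output means every listed option still has a procfs entry, not that its semantics are unchanged.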

Proposed change

This specification proposes:

  • To uprevision the StarlingX kernel to Linux kernel LTS release 6.6 for the StarlingX 10.0 release.

  • To continue the use of the Yocto Project's linux-yocto repository as the upstream source code repository and to base the StarlingX kernel's source code on the v6.6/standard/base branch for the standard kernel and the v6.6/standard/preempt-rt/base branch for the low-latency kernel.

  • To utilize the debian directory in the bullseye-backports branch of the Debian kernel packaging repository, which, despite targeting kernel v6.1, was observed to build the v6.6 kernel with some modifications.

  • To streamline the maintenance of the StarlingX kernel by simplifying changes to the kernel configuration and changelog files.

    As of this writing, the StarlingX kernel's changelog and configuration are maintained as patch files that make the necessary changes to the changelog and kernel configuration files acquired from Debian's kernel packaging repository. These patch files are stored in the kernel/kernel-{rt,std}/debian/deb_patches directories of the starlingx/kernel repository. (Example patch file)

    With this specification, we propose making the StarlingX kernel configuration and changelog standalone files in the kernel/kernel-{rt,std}/debian/source directories of the starlingx/kernel repository. Such a change avoids unnecessary churn in the deb_patches directory and makes it easier to update the StarlingX kernel's changelog and configuration.
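
The branch and directory layout proposed above can be summarized in a small lookup. This is only an illustrative sketch: the branch names and directory pattern are taken from this specification, while the flavor keys ("std", "rt") and the function names are hypothetical and simply mirror the kernel-std and kernel-rt package names.

```python
# Branch names as proposed in this specification.
LINUX_YOCTO_BRANCHES = {
    "std": "v6.6/standard/base",            # standard kernel
    "rt": "v6.6/standard/preempt-rt/base",  # low-latency (PREEMPT_RT) kernel
}


def upstream_branch(flavor: str) -> str:
    """Return the linux-yocto branch for a StarlingX kernel flavor."""
    try:
        return LINUX_YOCTO_BRANCHES[flavor]
    except KeyError:
        raise ValueError(f"unknown kernel flavor: {flavor!r}") from None


def packaging_source_dir(flavor: str) -> str:
    """Return the proposed directory for the standalone config and changelog."""
    upstream_branch(flavor)  # validates the flavor name
    return f"kernel/kernel-{flavor}/debian/source"
```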

Alternatives

We have thought of a number of alternatives to this proposal:

  1. Do not uprevision the StarlingX kernel (i.e., stay with the v5.10 kernel)

    Given that the v5.10 kernel and the v6.6 kernel currently have the same planned end-of-life (EOL) time frame (December, 2026) according to kernel.org's Active Kernel Releases page as of this writing, not uprevisioning the StarlingX kernel could be considered as an alternative.

    The concern with this alternative is that the v5.10 kernel has started to show its age with respect to hardware support, which will hold back the StarlingX community from being able to deploy StarlingX on newer hardware platforms or to make use of newer peripheral hardware. The StarlingX community could potentially continue to backport hardware support from newer kernel versions to the v5.10 kernel, but the cost for such maintenance would be prohibitive.

    In addition, with the increasing divergence of the mainline kernel from the v5.10 kernel, it is possible that fixes for bugs affecting the StarlingX community will not be backported to the v5.10 kernel by upstream maintainers.

  2. Opt for another LTS kernel release (such as v5.15 or v6.1)

    Other LTS kernel releases that could be considered for this proposal are v5.15 and v6.1, both of which have the same EOL time frame (December, 2026) as the v6.6 kernel release chosen for this proposal. Given the older code bases of the v5.15 and v6.1 releases, we see no advantage in using them other than to avoid some undesirable aspect of the v6.6 kernel, and we are not aware of any such aspect.

    Furthermore, the v6.6 kernel's more recent code base will allow StarlingX to reduce its dependency on out-of-tree device drivers, by making use of their in-tree counterparts, which will be covered by another specification proposal.

  3. Opt for an upstream other than the linux-yocto repository

    The final aspect that could be considered is the use of the linux-yocto project as the upstream kernel source code repository. It is possible, for example, to use the linux-stable team's releases for the standard kernel and the linux-rt team's repositories for the PREEMPT_RT kernel.

    For this topic, we would like to refer the reader to the relevant subsection of the specification proposal for the 5.10 kernel uprevision approved for the StarlingX 6.0 release: https://docs.starlingx.io/specs/specs/stx-6.0/approved/os_2008921_kernel_v510.html#alternatives

Data model impact

None

REST API impact

None

Security impact

Given the nature of a kernel uprevision, we would like to note that there could be a difficult-to-quantify impact on security, as a newer kernel code base could inadvertently include recently introduced bugs with security implications.

We would also like to note that staying with older kernel releases is not a viable approach to avoid negative impacts on security, because security-related bug-fixes may not always be backported to older kernel releases due to a variety of reasons.

Updating to the most recent LTS kernel releases is generally recommended by the kernel development community when security is considered.

Other end user impact

None

Performance Impact

Given the scope of a kernel uprevision activity, it is difficult to predict whether there will be a significant negative performance impact on use cases. We would also like to note that the v6.6 kernel incorporates a new scheduler (EEVDF) and that there is a possibility that the new scheduler will impact certain workloads.

To guard against performance regressions, there will be work items for carrying out performance tests with cyclictest and a Kubernetes-based network performance benchmark, and significant degradations in performance are expected to be resolved during the implementation of this proposal.
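
A regression check of the kind described above could compare a latency figure measured on the v5.10 baseline with the same figure measured on the v6.6 candidate. The sketch below is an assumption-laden illustration: the function name, the use of maximum latency in microseconds (as reported by cyclictest), and the 10% tolerance are all hypothetical and are not acceptance criteria defined by this specification.

```python
def regressed(baseline_us: float, candidate_us: float, tolerance: float = 0.10) -> bool:
    """Return True if the candidate latency exceeds the baseline by more than
    `tolerance`, expressed as a fraction of the baseline."""
    if baseline_us <= 0:
        raise ValueError("baseline latency must be positive")
    if tolerance < 0:
        raise ValueError("tolerance must not be negative")
    return candidate_us > baseline_us * (1.0 + tolerance)


if __name__ == "__main__":
    # Hypothetical maximum latencies in microseconds, for illustration only.
    baseline_max_us = 20.0   # measured on the v5.10-based kernel
    candidate_max_us = 25.0  # measured on the v6.6-based kernel
    if regressed(baseline_max_us, candidate_max_us):
        print("maximum latency regressed beyond the tolerance")
```

A relative tolerance keeps the check independent of the absolute latency of a given hardware platform; the appropriate value would have to be agreed on per workload.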

Other deployer impact

None

Developer impact

(Please see the Use Cases section.)

Upgrade impact

None

Implementation

Assignee(s)

Primary assignee:

  • Li Zhou (lzhou2)

Other contributors:

  • Jiping Ma (jma11)

Repos Impacted

The following is a preliminary list:

  • starlingx/kernel
  • starlingx/tools

Work Items

The following work items are expected to be carried out, with the understanding that the Storyboard story will be updated as more work items are found to be necessary.

  • Uprevision the StarlingX standard kernel (kernel-std) to v6.6.y.
  • Uprevision the StarlingX PREEMPT_RT kernel (kernel-rt) to v6.6.y.
  • Adapt/upgrade the out-of-tree kernel drivers utilized by StarlingX to work with the v6.6 kernel. (Parts of this work item may be eliminated in case a separate specification proposal on migration from out-of-tree to in-tree device drivers is approved.)
  • Carry out performance testing to guard against performance regressions.
  • Carry out regression testing for some of the StarlingX kernel-related bugs that were fixed in the past, because the bug-fix patches encountered contextual changes while being forward-ported from the v5.10 kernel to the v6.6 kernel.
  • Carry out basic sanity tests for the out-of-tree drivers that need adaptations for the v6.6 kernel.

Dependencies

Interactions with other proposals

We would like to note that the StarlingX kernel's packaging will need further updates due to a forthcoming StarlingX specification proposal to update the StarlingX base distribution from Debian 11 (bullseye) to Debian 12 (bookworm). With the approval of the latter proposal, the kernel packaging updates are expected to involve:

  • Changing the Debian kernel packaging repository base branch from bullseye-backports to bookworm-backports, and
  • Migration from Debian 11's gcc-10-based build toolchain to Debian 12's gcc-12-based build toolchain.

Finally, as a side effect of this kernel uprevision, it will additionally be possible to migrate StarlingX to in-tree versions of certain device drivers, which will be the subject of a separate StarlingX specification proposal.

Interactions with past work

This proposal builds on past work on removing the hard-coded ABI name (i.e., "5.10.0-6") from the kernel package names and from numerous locations in the packaging, which allows the StarlingX community to avoid the need to change the ABI name for every major kernel version change. The commits in question can be seen at: https://review.opendev.org/q/topic:%22rm-abiname%22

Testing

Given the nature of this uprevisioning activity, sanity and regression tests will be carried out, which are expected to include:

  • Verification of basic kernel features using test suites such as the LTP (Linux Test Project) test suite.
  • Deployment of StarlingX on multiple types of nodes (e.g., All-in-One Simplex/Duplex, standard, distributed cloud/sub-cloud) on a variety of hardware platforms, using the low-latency and the standard installation profiles.
  • Performance tests (such as cyclictest and a Kubernetes-based network performance benchmark) to ensure that the new kernel's performance is acceptable.
  • Regression tests for bugs that had been fixed by patching the StarlingX kernel, if the patches in question are affected by the changes between the v5.10 and v6.6 kernels.
  • Basic sanity tests for the out-of-tree drivers that needed adaptations for the v6.6 kernel.
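
As an illustration of a basic sanity check for a rebuilt out-of-tree module, one can verify that the module's vermagic string names the target kernel release. This is a minimal sketch under stated assumptions: the vermagic string is assumed to have been obtained separately (for example with modinfo -F vermagic), the release string with uname -r, and the example values shown are hypothetical. A matching release is necessary but not sufficient for the module to load.

```python
def vermagic_release(vermagic: str) -> str:
    """Return the kernel release, which is the first whitespace-separated
    field of a module's vermagic string."""
    fields = vermagic.split()
    if not fields:
        raise ValueError("empty vermagic string")
    return fields[0]


def built_for(vermagic: str, kernel_release: str) -> bool:
    """Return True if the module was built for the given kernel release."""
    return vermagic_release(vermagic) == kernel_release


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    vermagic = "6.6.0-1-amd64 SMP preempt mod_unload modversions"
    print(built_for(vermagic, "6.6.0-1-amd64"))
```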

Documentation Impact

At a minimum, the documentation will need to be updated to refer to the new kernel version.

Documentation pages with references to the kernel's interfaces (such as command line arguments, sysctl options, dmesg output), if any, will need to be updated for the new kernel as well.

References

(None)

History

Revisions

Release Name    Description
stx-10.0        Introduced