diff --git a/doc/source/specs/stx-8.0/approved/containerization-2010368-containerization-components-refresh.rst b/doc/source/specs/stx-8.0/approved/containerization-2010368-containerization-components-refresh.rst new file mode 100644 index 0000000..7a06f2d --- /dev/null +++ b/doc/source/specs/stx-8.0/approved/containerization-2010368-containerization-components-refresh.rst @@ -0,0 +1,273 @@ +.. + This work is licensed under a Creative Commons Attribution 3.0 Unported + License. http://creativecommons.org/licenses/by/3.0/legalcode + +=================================== +Containerization Components Refresh +=================================== + +Storyboard: +https://storyboard.openstack.org/#!/story/2010368 + +This story covers the refresh of the containerization components in StarlingX, +including Kubernetes but also the other supporting components such as +containerd and the various components which run as Kubernetes plugins. + +Problem description +=================== + +The containerization components used by StarlingX are starting to become dated +and in need of a refresh. This would include Kubernetes itself, as well as the +components it relies on (containerd, runc) and the plugins that extend its +behaviour (SR-IOV device plugin, Calico, Multus). + + +Use Cases +--------- + +* Deployer wants to upgrade to a fully-supported version of Kubernetes on a + running StarlingX system with minimal impact to running applications. +* QA tester wants to install and test all intermediate versions of Kubernetes + to ensure that they are stable and functional. + +Proposed change +=============== + +At a high level, the components to be updated are as follows: + +* Kubernetes +* Containerd/Crictl/Runc + +We are hoping to drop the Docker container runtime, so we are not planning on +upversioning it as part of this feature. The sections below document each of +the components in more detail. 
+ +Overview +-------- + +The upgrade from K8s 1.24 to 1.26 requires an incremental upgrade to each minor +release. We do not have the ability to prescribe when the customer will +perform the incremental upgrades and so we need to be able to run for extended +periods (days to weeks) on any of the intermediate versions. + +Following the existing implementation, all supported versions of Kubernetes +shall be packaged onto the system into separate versioned installation paths, +and they will all show up in the output of *system kube-version-list*. +The separate runtime version may then be chosen based on the specific host +requirements as part of the K8s upgrade procedure. + +Newly-installed systems would default to the latest version of K8s. + +In addition to the Kubernetes upgrade we also want to upgrade the various +other components related to containers. In order to de-risk and simplify the +Kubernetes upgrades, it is proposed that we upgrade the various containerization +components as follows: + +#. Upgrade Kubernetes to 1.25 using the *system kube-upgrade-start* procedure + discussed above. As part of this, upgrade the containerized components + (Calico, Multus, the SR-IOV CNI, and the SR-IOV device plugin) via the + *system kube-upgrade-networking* step of the existing Kubernetes upgrade. + +#. Upgrade Kubernetes to 1.26 using the *system kube-upgrade-start* procedure. + Everything else stays the same version. + + + +Deb-Based Components +-------------------- + +The Deb-based components will be updated as part of the upgrade to StarlingX 8.0. +This assumes that the new versions of the containerized components can work +with the existing version of the Deb-based components. This will need to be +validated. + +Containerd +^^^^^^^^^^ + +Containerd is the container runtime that we use. The upstream project is +https://github.com/containerd/containerd and we're currently running version +1.4.12. + +As of K8s 1.26 we need to move to containerd 1.6. 
The Debian package for +containerd 1.6 comes from "bookworm" and would require us to bring in a newer +glibc and newer python3, which is not something we want to do at this point. + +As a workaround, we are pulling in the prebuilt binaries from the containerd +GitHub project, which work fine with our existing packages. + +We also package "runc", which comes from the upstream project at +https://github.com/opencontainers/runc; we currently use version +1.0.2. As documented at +https://kubernetes.io/blog/2022/12/09/kubernetes-v1-26-release K8s 1.26 will +no longer support containerd 1.5 and lower. Accordingly, we are upgrading runc +to 1.1.7 to align more closely with what the upstream K8s versions +are using. The Debian package for runc comes from "bookworm" and would require +us to bring in a newer glibc and newer python3, which is not something we want +to do at this point. As a workaround, we are pulling in the prebuilt binaries +from the runc GitHub project, which work fine with our existing packages. + + + +Alternatives +------------ + +It would be possible to use the existing mechanism to upgrade Kubernetes via a +series of software patches (multiple patches per Kubernetes version). However, +this process is complex and comes with significant overhead for patch management +and delivery. + + +Data model impact +----------------- + +None + + +REST API impact +--------------- + +None + + +Security impact +--------------- + +None + +Other end user impact +--------------------- + +None + +Performance Impact +------------------ + +No significant change expected. + +Other deployer impact +--------------------- + +None + +Developer impact +---------------- + +No significant change. + +Upgrade impact +-------------- + +As part of upgrading to K8s 1.25, the "master" taint on standalone controllers +will change to a "control-plane" taint. Applications need to support +tolerations for both if they want to run on controller nodes. 
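A minimal sketch of tolerations that cover both taints, written as the Python dictionaries an application might serialize into its pod spec. It assumes the standard upstream taint keys ``node-role.kubernetes.io/master`` and ``node-role.kubernetes.io/control-plane`` with the ``NoSchedule`` effect; the helper name ``controller_tolerations`` is hypothetical and is not part of StarlingX.

```python
import json

# Assumed upstream taint keys; verify against the taints on your controllers.
MASTER_TAINT = "node-role.kubernetes.io/master"
CONTROL_PLANE_TAINT = "node-role.kubernetes.io/control-plane"


def controller_tolerations(effect="NoSchedule"):
    """Return pod-spec tolerations for both the old and the new taint key."""
    return [
        {"key": key, "operator": "Exists", "effect": effect}
        for key in (MASTER_TAINT, CONTROL_PLANE_TAINT)
    ]


if __name__ == "__main__":
    # Print the fragment that would sit under "spec" in a pod template.
    print(json.dumps({"tolerations": controller_tolerations()}, indent=2))
```

Carrying both entries lets the same manifest schedule onto controllers before and after the 1.25 upgrade step.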
Applications can +use the "control-plane" label and don't need to support the "master" label any +more. + + +Implementation +============== + +Assignee(s) +----------- + +Primary assignee: + Chris Friesen (cbf123) + +Other contributors: + Jim Gauld (jgauld) + + +Repos Impacted +-------------- + +* config +* integ +* stx-puppet +* ansible-playbooks + +Work Items +---------- + +* Add K8s 1.25 and 1.26 to the load, with appropriate versions of golang + +* Update containerd and runc + +* Validate platform apps, stx-openstack, etc., against Kubernetes 1.24 with updated + containerd/runc. + +* Validate the newer versions of the containerized components (Calico, Multus, + SR-IOV CNI, SR-IOV Device Plugin) against Kubernetes 1.25, ensuring that + basic functionality works. + +* Once validated, change the installation default to 1.25 with the newer + containerized components. + +* Validate platform apps, stx-openstack, etc., against Kubernetes 1.26 + +* Once validated, change the installation default to 1.26 + +* Update the customer documentation describing the procedure for upgrading + Kubernetes. Add release notes that point to the upstream Kubernetes release + notes so that customers can validate their own applications for + compatibility with newer releases of Kubernetes. + + +Dependencies +============ + +None + + +Testing +======= + +Kubernetes upgrades from 1.24 to 1.26 must be tested in the following +StarlingX configurations: + +* AIO-SX +* AIO-DX +* Standard with controller storage +* Standard with dedicated storage +* Distributed cloud + +The testing can be performed on hardware or virtual environments. Sanity testing +must be performed on each intermediate Kubernetes version. 
+ + +Documentation Impact +==================== + +The existing Kubernetes upgrade documentation will need to be updated to +reflect the fact that there will no longer be software patching involved. + +The Release Notes will need to be updated to reflect the requirement to +upgrade to Kubernetes 1.26 as part of STX 8.0. + +The config API reference will also need updates. + +References +========== + +N/A + + +History +======= + +.. list-table:: Revisions + :header-rows: 1 + + * - Release Name + - Description + * - STX-8.0 + - Introduced