integ/kubernetes
Jim Gauld 05bbc77057 Improve shutdown of containerd
This update is intended to prevent nodes from crashing while
powering off during a graceful shutdown (or reboot). It improves
the timing and shutdown of containerd.service.

The containerd shutdown script stops all containers via
'crictl stop' with a 5-second timeout, then stops all pods
via 'crictl stopp'. This cleans up lingering /pause sandbox
containers.

This modifies the arguments to xargs and crictl so that xargs
handles the parallelism instead of passing batches of IDs to
crictl, because crictl appears to perform the stop operations
serially.

The number of stops run in parallel is engineered to 10.
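
The xargs-based parallelism described above can be sketched roughly
as follows. This is a minimal illustration, not the actual contents of
the shutdown script: the helper name stop_in_parallel is hypothetical,
NPAR=10 is taken from the test plan, and the use of GNU xargs options
(-r, -P, -n) is an assumption about the target environment.

```shell
#!/bin/bash
# Assumed value: number of stop operations run in parallel.
NPAR=10

# Hypothetical helper: read one ID per line from stdin and run
# "<command> <ID>" once per ID, with at most NPAR commands at a time.
#   -r   : do nothing when stdin is empty (GNU xargs)
#   -n 1 : pass a single ID to each invocation, so no batching
#   -P   : let xargs run up to NPAR invocations concurrently
stop_in_parallel() {
    xargs -r -P "${NPAR}" -n 1 "$@"
}

# Intended use on a node with crictl and a running containerd:
#   crictl ps -q   | stop_in_parallel crictl stop --timeout 5
#   crictl pods -q | stop_in_parallel crictl stopp
```

Passing one ID per invocation moves the concurrency into xargs, which
matters if crictl processes a list of IDs one after another.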

Engineering the number of parallel stops in relation to
shutdown timings under stress load will be addressed in a
subsequent update. The engineering TC should align with
customer requirements.

When testing containerd shutdown under the stress of multiple
pods writing to a shared PersistentVolume, even the new parallel
shutdown code is not sufficient to complete the shutdown within
the default 90-second timeout. Additional changes will be needed
to enable clean shutdown under those circumstances.

Partial-Bug: 2043069

Test plan:
- PASS - build-image, install and boot up on AIO-SX
- PASS - perform reboot and verify /var/log/daemon.log
         has new k8s-container-cleanup.sh logs
         for 'Stopping all pods' and 'Stopping all containers',
         and that drbd stops after containerd.
- FAIL - verify containerd shutdown works under stress with
         the new parallel stop pods parameter NPAR=10.
         The stress load uses ReadWriteMany PVC, and multiple
         pods, each writing to the shared PVC.
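
The log verification step in the test plan could be scripted along
these lines. This is only a sketch: the function name
check_cleanup_logs is hypothetical, and it assumes the two messages
appear in the log exactly as quoted in the test plan.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the log file contains both
# cleanup messages quoted in the test plan.
# $1: log file to check (defaults to /var/log/daemon.log)
check_cleanup_logs() {
    local log="${1:-/var/log/daemon.log}"
    grep -q 'Stopping all containers' "${log}" &&
        grep -q 'Stopping all pods' "${log}"
}

# Intended use after a reboot:
#   check_cleanup_logs && echo "cleanup logs found"
```

This checks only that the messages are present. Confirming that drbd
stops after containerd still requires comparing the log timestamps.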

Change-Id: Ibfc0a474a40344a629b3f0780449906a9c6b03ba
Signed-off-by: Jim Gauld <James.Gauld@windriver.com>
2023-11-09 12:12:48 -05:00
chartmuseum Upversion chartmuseum from 0.12.0 to 0.13.0 2023-08-08 09:59:54 -03:00
cni debian-pkg: Uprev cni plugins 2023-04-14 20:03:39 +00:00
containerd Improve shutdown of containerd 2023-11-09 12:12:48 -05:00
crictl/debian Fix lint errors identified by Zuul pylint job 2023-03-15 12:07:17 +00:00
docker-distribution Debian: docker-registry: CVE-2023-2253 2023-06-24 15:12:32 +08:00
etcd Update debian package versions to use git commits 2023-03-01 11:27:50 -05:00
helm Enforce Helm charts uniqueness 2023-10-06 12:12:07 -03:00
k8s-cni-cache-cleanup Update k8s-cni-cache-cleanup ver based on git 2023-02-21 21:19:18 +00:00
k8s-pod-recovery Update k8s-pod-recovery pkg ver based on git 2023-02-22 15:57:27 +00:00
kubernetes-1.18.1/centos/files Remove kubernetes 1.18, 1.19, 1.20 pkgs 2022-02-23 15:24:03 +00:00
kubernetes-1.21.8 Fix lint errors identified by Zuul pylint job 2023-03-15 12:07:17 +00:00
kubernetes-1.22.5 Fix lint errors identified by Zuul pylint job 2023-03-15 12:07:17 +00:00
kubernetes-1.23.1 Fix lint errors identified by Zuul pylint job 2023-03-15 12:07:17 +00:00
kubernetes-1.24.4/debian Add sriov-fec-system namespace to the platform infra list in kubelet 2023-08-31 11:07:43 -03:00
kubernetes-1.25.3/debian Add sriov-fec-system namespace to the platform infra list in kubelet 2023-08-31 11:07:43 -03:00
kubernetes-1.26.1/debian tox: fixed warnings 2023-09-06 17:54:55 -03:00
kubernetes-1.27.5/debian Add kubernetes 1.27.5 patches 2023-09-08 13:03:16 -04:00
kubernetes-unversioned Update kubelet.kubeconfig environment variable 2023-07-17 17:58:48 -04:00
n3000 cengn reference removal 2023-09-14 09:56:20 -04:00
plugins Fix for dwz compression error in isolcpus-device-plugin 2023-09-06 09:45:39 -04:00
runc/debian Upgrade runc to 1.1.7 2023-05-29 07:26:53 -04:00