StarlingX System Configuration Management
Go to file
Saba Touheed Mujawar 4c42927040 Add retry robustness for Kubernetes upgrade control plane
In the case of a rare intermittent failure behaviour during the
upgrading control plane step where puppet hits timeout first before
the upgrade is completed or kubeadm hits its own Upgrade Manifest
timeout (at 5m).

This change will retry running the process by
reporting failure to conductor when puppet manifest apply fails.
Since it is using RPC to send messages with options, we don't get
the return code directly and hence, cannot use a retry decorator.
So we use the sysinv report callback feature to handle the
success/failure path.

TEST PLAN:
PASS: Perform simplex and duplex k8s upgrade successfully.
PASS: Install iso successfully.
PASS: Manually send STOP signal to pause the process so that
      puppet manifest timeout and check whether retry code works
      and in retry attempts the upgrade completes.
PASS: Manually decrease the puppet timeout to very low number
      and verify that code retries 2 times and updates failure
      state
PASS: Perform orchestrated k8s upgrade, Manually send STOP
      signal to pause the kubeadm process during step
      upgrading-first-master and perform system kube-upgrade-abort.
      Verify that upgrade-aborted successfully and also verify
      that code does not try the retry mechanism for
      k8s upgrade control-plane as it is not in desired
      KUBE_UPGRADING_FIRST_MASTER or KUBE_UPGRADING_SECOND_MASTER
      state
PASS: Perform manual k8s upgrade, for k8s upgrade control-plane
      failure perform manual upgrade-abort successfully.
      Perform Orchestrated k8s upgrade, for k8s upgrade control-plane
      failure after retries nfv aborts automatically.

Closes-Bug: 2056326

Depends-on: https://review.opendev.org/c/starlingx/nfv/+/912806
            https://review.opendev.org/c/starlingx/stx-puppet/+/911945
            https://review.opendev.org/c/starlingx/integ/+/913422

Change-Id: I5dc3b87530be89d623b40da650b7ff04c69f1cc5
Signed-off-by: Saba Touheed Mujawar <sabatouheed.mujawar@windriver.com>
2024-03-19 08:49:36 -04:00
api-ref/source Improve kube-rootca-get-id API and error handling 2023-11-24 09:16:48 -05:00
config-gate Update debian package versions to use git commits 2023-02-10 20:11:06 +00:00
controllerconfig Fix upgrade-script not expecting additional parameter 2024-03-11 13:48:22 -03:00
devstack Deprecate old policy engine and restrict access 2022-08-10 11:18:38 -03:00
doc Fix tsconfig/root constraints file in tox.ini 2024-03-04 22:22:31 +00:00
releasenotes Remove host hardware sysinv profile 2021-10-18 18:01:40 -03:00
storageconfig Remove the use of the mgmt_ip field in host table 2023-11-01 10:30:21 -04:00
sysinv Add retry robustness for Kubernetes upgrade control plane 2024-03-19 08:49:36 -04:00
tmp/patch-scripts/EXAMPLE_SYSINV/scripts StarlingX open source release updates 2018-05-31 07:35:52 -07:00
tools/docker/images Enable kubernetes SCTPSupport feature 2019-09-03 19:23:05 +00:00
tsconfig Upgrade changes to support MGMT FQDN 2024-03-05 12:42:21 -03:00
workerconfig Remove the use of the mgmt_ip field in host table 2023-11-01 10:30:21 -04:00
.gitignore Minor zuul and tox file cleanup after manifest re-org 2019-09-06 15:40:37 -05:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:42 +00:00
.yamllint clear yamllint errors under stx-config 2018-09-12 21:11:57 +08:00
.zuul.yaml Update controllerconfig tox environment for debian 2023-05-31 15:25:25 +00:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:35:52 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:35:52 -07:00
README.rst starlingx/config README improvement 2023-07-19 12:18:04 -03:00
bindep.txt py3: Add py39 gate for sysinv 2021-08-27 08:39:06 -04:00
centos_build_layer.cfg Build layering, add layer build config file 2019-10-15 12:29:05 +08:00
centos_dev_wheels.inc Config file changes to add 'tsconfig' after relocation from 'update' 2019-09-05 11:51:05 -04:00
centos_iso_image.inc Merge sysinv_fpga_agent with sysinv_agent 2022-10-03 14:12:28 -04:00
centos_pkg_dirs Merge sysinv_fpga_agent with sysinv_agent 2022-10-03 14:12:28 -04:00
centos_pkg_dirs_containers Config file changes for packages relocated to repo 'openstack-armada-app' 2019-09-05 10:42:00 -04:00
centos_stable_wheels.inc Config file changes to add 'tsconfig' after relocation from 'update' 2019-09-05 11:51:05 -04:00
debian_build_layer.cfg Add debian_build_layer.cfg file 2021-10-05 14:50:08 -04:00
debian_iso_image.inc Setup debian build directory and ipsec-auth package 2024-01-26 09:46:14 -03:00
debian_pkg_dirs Setup debian build directory and ipsec-auth package 2024-01-26 09:46:14 -03:00
debian_stable_wheels.inc debian: Add sysinv wheel to the build 2022-11-21 13:33:24 +00:00
test-requirements.txt Calling an additional shell lint command from zuul 2021-06-03 17:35:50 -05:00
tox.ini Fix tsconfig/root constraints file in tox.ini 2024-03-04 22:22:31 +00:00

README.rst

config

The starlingx/config repository handles the StarlingX configuration management services.

Its key component is the System Inventory Service (Sysinv), which provides the system command-line interface (CLI)1.

This repository is not intended to be developed standalone, but rather as part of the StarlingX Source System, which is defined by the StarlingX manifest2.

References


  1. https://docs.starlingx.io/cli_ref/system.html↩︎

  2. https://opendev.org/starlingx/manifest.git↩︎