Set timeout for KubeHostUpgradeControlPlaneStep to 420s

This updates the KubeHostUpgradeControlPlaneStep timeout from 600s
to 420s. The value is intentionally engineered to be larger than
the time sysinv needs to complete the Kubernetes upgrade
control-plane step, including retries and allowing for failure.

The timeout is engineered using the following equation, which
accounts for retries, hitting the kubeadm upgrade timeout on
each try, and some buffer for the sysinv report callback
mechanism.

nfv_timeout = ImageDownloadTime +
              retries * (UpgradeControlPlaneTimeout + buffer)

Following are the engineered parameters:

ImageDownloadTime = 0s (images are pre-pulled before this step)
UpgradeControlPlaneTimeout = 3 minutes
buffer = 30s
retries = 2

Result:
Engineered puppet timeout for upgrade control-plane:
= UpgradeControlPlaneTimeout + buffer = 3*60s + 30s = 210s

Engineered NFV timeout:
= 0s + 2 * (180s + 30s) = 420s
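
The arithmetic above can be sketched in Python as a quick sanity
check. The constant and function names below are hypothetical and
exist only for illustration; they are not taken from the nfv or
sysinv code.

```python
# Illustrative sketch only: these names are assumptions, not identifiers
# from the nfv or sysinv source.
IMAGE_DOWNLOAD_TIME_SECS = 0                  # images are pre-pulled before this step
UPGRADE_CONTROL_PLANE_TIMEOUT_SECS = 3 * 60   # kubeadm upgrade timeout per try
BUFFER_SECS = 30                              # slack for the sysinv report callback
RETRIES = 2


def puppet_timeout_secs(upgrade_timeout=UPGRADE_CONTROL_PLANE_TIMEOUT_SECS,
                        buffer=BUFFER_SECS):
    """Time budget for a single upgrade control-plane try."""
    return upgrade_timeout + buffer


def nfv_timeout_secs(image_download=IMAGE_DOWNLOAD_TIME_SECS,
                     retries=RETRIES,
                     upgrade_timeout=UPGRADE_CONTROL_PLANE_TIMEOUT_SECS,
                     buffer=BUFFER_SECS):
    """Overall budget: image download plus every try hitting its timeout."""
    return image_download + retries * puppet_timeout_secs(upgrade_timeout, buffer)


if __name__ == "__main__":
    print(puppet_timeout_secs())  # 210
    print(nfv_timeout_secs())     # 420
```

Keeping the per-try budget in its own function makes it easy to
re-derive the NFV value if the retry count or buffer changes.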

Test Plan:
PASS: Perform an orchestrated k8s upgrade and manually stop the kubeadm
      process during the k8s upgrade control-plane step. Check the logs
      to verify the puppet timeout, and verify that sysinv attempts its
      retry mechanism before the nfv timeout expires.

Partial-Bug: 2056326

Change-Id: I73ab8ea7cd7fc3816372260983c4b54a02cdcc4c
Signed-off-by: Saba Touheed Mujawar <sabatouheed.mujawar@windriver.com>
Saba Touheed Mujawar 2024-03-13 12:54:40 -04:00
parent bac2f0a09e
commit 4cbb31454d
1 changed file with 1 addition and 1 deletion

@@ -4711,7 +4711,7 @@ class KubeHostUpgradeControlPlaneStep(AbstractKubeHostUpgradeStep):
     """
     def __init__(self, host, to_version, force, target_state, target_failure_state,
-                 timeout_in_secs=600):
+                 timeout_in_secs=420):
         super(KubeHostUpgradeControlPlaneStep, self).__init__(
             host,
             to_version,