2210c71216
When Multi-Node Failure Avoidance (MNFA) occurs, maintenance commands the Heartbeat Agent to slow down by a factor of 4. The rate recovery following an MNFA is not occurring. Update https://review.opendev.org/#/c/701057 made a condition check change that introduced this issue by requiring mnfa_timeout to be non-zero before an attempt is made to recover the heartbeat period following MNFA recovery. This update switches that condition check to use the more specific mnfa_backoff state tracker and, because MNFA is a global maintenance mode feature rather than a node-specific feature, moves the recovery check code from the node-level fsm into a mnfa_recovery_handler called in the main select loop. Test Plan: PASS: Verify MNFA handling/recovery with mnfa_timeout!=0 that expires. PASS: Verify MNFA handling/recovery when mnfa_timeout!=0 but before the timeout expires. PASS: Verify MNFA handling/recovery when mnfa_timeout=0. PASS: Verify MNFA backoff rate recovery over mtcAgent process restart. PASS: Verify MNFA backoff rate is sent to hbsAgent if hbsAgent restarts while MNFA is active. Change-Id: I8da5a000ab503692c7cfa620233ed8aa772c50f8 Closes-Bug: #1893212 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com> |
||
---|---|---|
Makefile | ||
nodeClass.cpp | ||
nodeClass.h | ||
nodeCmds.h |