Add pxeboot mtcAlive alarm to fault management

This update introduces a new maintenance group alarm: 200.003.

The new alarm has minor severity and is management affecting when
asserted. It is treated as management affecting for the upgrade case
because the pxeboot network is needed to upgrade a node.
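
The management-affecting rule described above can be sketched as follows. This is only an illustration, assuming that Management_Affecting_Severity names the lowest severity at which an alarm counts as management affecting and that "none" means never. The is_management_affecting helper and the SEVERITY_RANK table are hypothetical names, not part of the fault management code.

```python
# Hypothetical ranking of alarm severities, lowest to highest.
SEVERITY_RANK = {"warning": 1, "minor": 2, "major": 3, "critical": 4}


def is_management_affecting(severity: str, threshold: str) -> bool:
    """Return True if an alarm at `severity` meets or exceeds `threshold`.

    Assumption: a threshold of "none" means the alarm is never
    management affecting.
    """
    if threshold == "none":
        return False
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[threshold]


# Alarm 200.003 is raised at minor severity with a warning threshold.
print(is_management_affecting("minor", "warning"))
```

Under these assumptions, a minor alarm with a warning threshold is reported as management affecting, which matches the intent stated for the upgrade case.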

The alarm represents a pxeboot network communication/messaging failure
between the mtcAgent process on the active controller and the mtcClient
that runs on each node.

Test Plan:

PASS: Verify alarm attributes
PASS: - code of 200.003
PASS: - assertion cause text
PASS: - proposed repair action text
PASS: - suppression option
PASS: - does not inhibit other alarms
PASS: - effect of assertion on upgrade healthcheck
PASS: Verify ability to assert and clear
PASS: Verify fm logging for the above assertion and clear
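
The attribute checks in the test plan could be automated along these lines. This is a rough sketch, not the procedure actually used for this change: the field values are copied from the 200.003 definition in this commit, while the check_attributes helper and the EXPECTED table are hypothetical.

```python
# Selected alarm 200.003 fields, copied from the definition in this change.
ALARM_200_003 = {
    "Type": "Alarm",
    "Severity": "minor",
    "Inhibit_Alarms": False,
    "Alarm_Type": "communication",
    "Service_Affecting": False,
    "Suppression": False,
    "Management_Affecting_Severity": "warning",
}


def check_attributes(alarm: dict, expected: dict) -> list:
    """Return the names of fields whose values differ from `expected`."""
    return [key for key, value in expected.items() if alarm.get(key) != value]


# Hypothetical expectations mirroring a few test plan items.
EXPECTED = {
    "Severity": "minor",
    "Inhibit_Alarms": False,
    "Management_Affecting_Severity": "warning",
}

mismatches = check_attributes(ALARM_200_003, EXPECTED)
print(mismatches or "all attributes match")
```

Returning the list of mismatched field names, rather than a single boolean, makes it easier to see which attribute failed.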

Story: 2010940
Task: 49789
Change-Id: I507d30213674c5b1e24fcfebe15c6a87bad74358
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Eric MacDonald 2024-03-28 16:26:26 +00:00
parent e0c1825635
commit aef702fb01
1 changed file with 16 additions and 0 deletions

@@ -566,6 +566,22 @@
    Degrade_Affecting_Severity: none
    Context: starlingx
200.003:
    Type: Alarm
    Description: <hostname> pxeboot network communication failure.
    Entity_Instance_ID: host=<hostname>
    Severity: minor
    Proposed_Repair_Action: Administratively Lock and Unlock host to recover. If problem persists, contact next level of support.
    Maintenance_Action: none
    Inhibit_Alarms: False
    Alarm_Type: communication
    Probable_Cause: unknown
    Service_Affecting: False
    Suppression: False
    Management_Affecting_Severity: warning
    Degrade_Affecting_Severity: none
    Context: starlingx
200.004:
    Type: Alarm
    Description: |-