Add alarm when a node is tainted.

Currently no alarm is raised when a node is
tainted. While a node is tainted, some of its
services are down and the system is not in a
healthy state. This alarm notifies the user
that the node is tainted.

Test Plan:
PASSED: On a DX system, lock the inactive
controller and verify that the taint alarm is
raised.
PASSED: On a DX system, unlock the previously
locked controller and verify that the taint
alarm is cleared.

Change-Id: I929ddb45b75f1e4b097b84919f703d458d8fa39e
Signed-off-by: Vanathi.Selvaraju <vanathi.selvaraju@windriver.com>
Vanathi.Selvaraju 2024-01-04 13:30:37 -05:00
parent f8716aa344
commit 115ff4d4bb
2 changed files with 18 additions and 0 deletions

@@ -72,6 +72,7 @@ FM_ALARM_ID_FS_USAGE = ALARM_GROUP_GENERAL + ".104"
FM_ALARM_ID_IMAGE_CONVERSION = ALARM_GROUP_GENERAL + ".105"
FM_ALARM_ID_CONTROLLERS_KERNEL_MISMATCH = ALARM_GROUP_GENERAL + ".120"
FM_ALARM_ID_PROVISIONED_KERNEL_MISMATCH = ALARM_GROUP_GENERAL + ".121"
FM_ALARM_ID_NODE_TAINTED = ALARM_GROUP_GENERAL + ".122"
# Maintenance Log id
FM_LOG_ID_HOST_DISCOVERED = ALARM_GROUP_MAINTENANCE + ".020"

@@ -544,6 +544,23 @@
    Degrade_Affecting_Severity: critical
    Context: starlingx
100.122:
    Type: Alarm
    Description: Node <hostname> tainted.
    Entity_Instance_ID: host=<hostname>
    Severity: minor
    Proposed_Repair_Action: "kubectl taint nodes <hostname> services=disabled:NoExecute-"
    Maintenance_Action: none
    Inhibit_Alarms:
    Alarm_Type: operational-violation
    Probable_Cause: out-of-service
    Service_Affecting: True
    Suppression: False
    Management_Affecting_Severity: warning
    Degrade_Affecting_Severity: none
    Context: starlingx
#---------------------------------------------------------------------------
# MAINTENANCE
#---------------------------------------------------------------------------
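
The following is a minimal sketch, not part of this commit, of how the new alarm ID and the YAML fields above could be tied to a node's taint state. It assumes ALARM_GROUP_GENERAL is "100" (inferred from the 100.122 key) and that the node is given as the parsed JSON of `kubectl get node <hostname> -o json`. The helpers node_is_tainted and desired_alarm are hypothetical names used only for illustration; the code that actually raises and clears the alarm is not shown in this diff.

```python
# Assumption: ALARM_GROUP_GENERAL is "100", inferred from the 100.122 YAML key.
ALARM_GROUP_GENERAL = "100"
FM_ALARM_ID_NODE_TAINTED = ALARM_GROUP_GENERAL + ".122"

# Taint named in Proposed_Repair_Action: services=disabled:NoExecute
SERVICES_DISABLED_TAINT = {
    "key": "services",
    "value": "disabled",
    "effect": "NoExecute",
}


def node_is_tainted(node):
    """Return True if the node carries the services=disabled:NoExecute taint.

    `node` is a dict shaped like a Kubernetes Node object, where taints
    live under spec.taints as a list of {key, value, effect} entries.
    """
    taints = node.get("spec", {}).get("taints") or []
    return any(
        all(taint.get(k) == v for k, v in SERVICES_DISABLED_TAINT.items())
        for taint in taints
    )


def desired_alarm(node):
    """Hypothetical helper: describe the alarm to raise, or None to clear it."""
    if not node_is_tainted(node):
        return None
    hostname = node["metadata"]["name"]
    return {
        "alarm_id": FM_ALARM_ID_NODE_TAINTED,
        "entity_instance_id": "host=" + hostname,
        "severity": "minor",
        "reason_text": "Node %s tainted." % hostname,
    }


if __name__ == "__main__":
    locked = {
        "metadata": {"name": "controller-1"},
        "spec": {"taints": [dict(SERVICES_DISABLED_TAINT)]},
    }
    print(desired_alarm(locked))
```

Returning None for an untainted node mirrors the second test case in the test plan, where unlocking the controller is expected to clear the alarm.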