From efa09aa3db32e8fc4427977b6e2614d37b02cb63 Mon Sep 17 00:00:00 2001
From: Girish Subramanya
Date: Mon, 3 Oct 2022 11:41:36 -0400
Subject: [PATCH] Alarm Hostname controller function has in-service failure
 reported

When compute services remain healthy:
 - listing alarms shall not refer to the obsoleted alarm below
 - 200.012 alarm hostname controller function has an in-service failure

This update deletes the definition of the obsoleted alarm and any
references to it.
200.012 is removed from the events.yaml file.
Any reference to this alarm definition is also updated.
A bug also needs to be raised to track the doc change.

Test Plan:
Verify on a Standard configuration that no alarms are listed for
hostname controller in-service failure.
Exercise the code (removal) changes with the fix applied prior to
ansible bootstrap and host-unlock, and verify no unexpected alarms.
Regression: There is no need to test the alarm referred to here, as it
is obsolete.

Closes-Bug: 1991531
Signed-off-by: Girish Subramanya
Change-Id: I255af68155c5392ea42244b931516f742fa838c3
---
 fm-doc/fm_doc/events.yaml | 15 ---------------
 1 file changed, 15 deletions(-)

diff --git a/fm-doc/fm_doc/events.yaml b/fm-doc/fm_doc/events.yaml
index d15a423f..50c54228 100755
--- a/fm-doc/fm_doc/events.yaml
+++ b/fm-doc/fm_doc/events.yaml
@@ -554,21 +554,6 @@
     Management_Affecting_Severity: none
     Degrade_Affecting_Severity: none
 
-200.012:
-    Type: Alarm
-    Description: controller function has in-service failure while compute services remain healthy.
-    Entity_Instance_ID: host=
-    Severity: major
-    Proposed_Repair_Action: Lock and then Unlock host to recover. Avoid using 'Force Lock' action as that will impact compute services running on this host. If lock action fails then contact next level of support to investigate and recover.
-    Maintenance_Action: "degrade - requires manual action"
-    Inhibit_Alarms: False
-    Alarm_Type: operational-violation
-    Probable_Cause: communication-subsystem-failure
-    Service_Affecting: True
-    Suppression: True
-    Management_Affecting_Severity: warning
-    Degrade_Affecting_Severity: major
-
 200.013:
     Type: Alarm
     Description: compute service of the only available controller is not operational. Auto-recovery is disabled. Degrading host instead.