Revert "Update events.yaml for specific alarms"

This reverts commit baead557fb.

Reason for revert: Change breaks the build. 270.001 alarm has to remain defined because it is still being referenced.
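
The failure mode described above (an alarm ID removed from events.yaml while something still references it) can be illustrated with a small consistency check. This is only a sketch under assumptions: the function names are hypothetical, the list of referenced IDs is supplied by hand, and the project's real build check is not shown in this commit and may work differently.

```python
import re

# Matches a top-level event key such as "270.001:" at the start of a line.
# Assumption: event IDs in events.yaml have the form NNN.NNN followed by a colon.
EVENT_KEY = re.compile(r"^(\d{3}\.\d{3}):[ \t]*$", re.MULTILINE)


def defined_event_ids(events_yaml_text):
    """Return the set of event IDs that appear as top-level keys."""
    return set(EVENT_KEY.findall(events_yaml_text))


def missing_event_ids(events_yaml_text, referenced_ids):
    """Return the referenced IDs that have no definition, sorted."""
    return sorted(set(referenced_ids) - defined_event_ids(events_yaml_text))


if __name__ == "__main__":
    # Hypothetical excerpt in which 270.001 has been removed.
    sample = "270.101:\n    Type: Log\n900.002:\n    Type: Alarm\n"
    # Hypothetical list of IDs that other code still refers to.
    print(missing_event_ids(sample, ["270.001", "900.002"]))
```

A non-empty result would flag IDs, such as 270.001 here, that must stay defined until their last reference is removed.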

Change-Id: I136928c90634e05e3026dddc9443cc2c59203320
Agustin Carranza 2022-08-08 17:04:35 +00:00
parent baead557fb
commit 6ed32029db
1 changed file with 27 additions and 13 deletions

@@ -895,6 +895,21 @@
#---------------------------------------------------------------------------
# VM Compute Services
#---------------------------------------------------------------------------
+270.001:
+Type: Alarm
+Description: "Host <host_name> compute services failure[, reason = <reason_text>]"
+Entity_Instance_ID: host=<host_name>.services=compute
+Severity: critical
+Proposed_Repair_Action: Wait for host services recovery to complete; if problem persists contact next level of support
+Maintenance_Action:
+Inhibit_Alarms:
+Alarm_Type: processing-error
+Probable_Cause: unspecified-reason
+Service_Affecting: True
+Suppression: True
+Management_Affecting_Severity: warning
+Degrade_Affecting_Severity: none
270.101:
Type: Log
Description: "Host <host_name> compute services failure[, reason = <reason_text>]"
@@ -1385,7 +1400,7 @@
OR
system.certificate.k8sRootCA
Severity: major
-Proposed_Repair_Action: Check certificate expiration time. Renew certificate for the entity identified.
+Proposed_Repair_Action: Renew certificate for the entity identified
Maintenance_Action:
Inhibit_Alarms:
Alarm_Type: operational-violation
@@ -1412,7 +1427,7 @@
OR
system.certificate.k8sRootCA
Severity: critical
-Proposed_Repair_Action: Check certificate expiration time. Renew certificate for the entity identified.
+Proposed_Repair_Action: Renew certificate for the entity identified
Maintenance_Action:
Inhibit_Alarms:
Alarm_Type: operational-violation
@@ -2773,8 +2788,7 @@
Description: "Application Apply Failure"
Entity_Instance_ID: k8s_application=<appname>
Severity: major
-Proposed_Repair_Action: "Retry applying the application. Check application is managed by the system application framework.
-If the issue persists, please check system inventory log for cause."
+Proposed_Repair_Action: "Retry applying the application. If the issue persists, please check system inventory log for cause."
Maintenance_Action:
Inhibit_Alarms:
Alarm_Type: processing-error
@@ -3069,10 +3083,10 @@
900.002:
Type: Alarm
-Description: Patch host install failure. Command "sw-patch host-install" failed.
+Description: Patch host install failure.
Entity_Instance_ID: host=<hostname>
Severity: major
-Proposed_Repair_Action: Undo patching operation. Check patch logs on the target host (i.e. /var/log/patching.log)
+Proposed_Repair_Action: Undo patching operation.
Maintenance_Action:
Inhibit_Alarms:
Alarm_Type: environmental
@@ -3084,7 +3098,7 @@
900.003:
Type: Alarm
-Description: A patch with state 'obsolete' in its metadata has been uploaded.
+Description: Obsolete patch in system.
Entity_Instance_ID: host=controller
Severity: warning
Proposed_Repair_Action: Remove and delete obsolete patches.
@@ -3099,7 +3113,7 @@
900.004:
Type: Alarm
-Description: The upgrade and running software version do not match. Command host-upgrade failed.
+Description: Host version mismatch.
Entity_Instance_ID: host=<hostname>
Severity: major
Proposed_Repair_Action: Reinstall host to update applied load.
@@ -3174,7 +3188,7 @@
900.009:
Type: Alarm
-Description: Kubernetes root CA update aborted, certificates may not be fully updated. Command "system kube-rootca-update-abort" has been run.
+Description: Kubernetes root CA update aborted, certificates may not be fully updated.
Entity_Instance_ID: host=controller
Severity: minor
Proposed_Repair_Action: Fully update certificates by a new root CA update.
@@ -3219,7 +3233,7 @@
900.103:
Type: Alarm
-Description: Software patch auto-apply failed. Command "sw-manager patch-strategy apply" failed.
+Description: Software patch auto-apply failed
Entity_Instance_ID: orchestration=sw-patch
Severity: critical
Proposed_Repair_Action: Attempt to apply software patches manually; if problem persists contact next level of support
@@ -3363,7 +3377,7 @@
900.203:
Type: Alarm
-Description: Software upgrade auto-apply failed. Command "sw-manager update-strategy apply" failed
+Description: Software upgrade auto-apply failed
Entity_Instance_ID: orchestration=sw-upgrade
Severity: critical
Proposed_Repair_Action: Attempt to apply software upgrade manually; if problem persists contact next level of support
@@ -3507,7 +3521,7 @@
900.303:
Type: Alarm
-Description: Firmware Update auto-apply failed. Command "sw-manager kube-rootca-update-strategy apply" failed.
+Description: Firmware Update auto-apply failed
Entity_Instance_ID: orchestration=fw-update
Severity: critical
Proposed_Repair_Action: Attempt to apply firmware update manually; if problem persists contact next level of support
@@ -3795,7 +3809,7 @@
900.503:
Type: Alarm
-Description: Kubernetes rootca update auto-apply failed. Command "sw-manager kube-upgrade-strategy apply" failed.
+Description: Kubernetes rootca update auto-apply failed
Entity_Instance_ID: orchestration=kube-rootca-update
Severity: critical
Proposed_Repair_Action: Attempt to apply kubernetes rootca update manually; if problem persists contact next level of support