metal/mtce/src
Jagatguru Prasad Mishra 1210ed450a Failure case handling of LUKS service
The luks-fs-mgr service creates and unseals the LUKS volume used to
store keys/secrets. This change handles the failure case in which this
essential service is inactive. It introduces an alarm, LUKS_ALARM_ID,
which is raised when the service is inactive, implying that there is
an issue creating or unsealing the LUKS volume.

Test Plan:
PASS: build-pkgs -c -p mtce-common
PASS: build-pkgs -c -p mtce
PASS: build-image
PASS: AIO-SX bootstrap with luks volume status active
PASS: AIO-DX bootstrap with luks volume status active
PASS: Standard setup with 2 controllers and 1 compute node with luks
      volume status active. There should not be any alarm and node
      status should be unlocked/enabled/available.
PASS: AIO-DX node enable failure on the controller where luks volume
      is inactive. Node availability should be failed. A critical
      alarm with id 200.016 should be displayed with 'fm alarm-list'
PASS: AIO-SX node enable failure on the controller-0. Node availability
      should be failed. A critical alarm with id 200.016 should be
      displayed with 'fm alarm-list'
PASS: Standard: node enable failure on the node (controller-0,
      controller-1, storage-0, compute-1). Node availability
      should be failed. A critical alarm with id 200.016 should be
      displayed with 'fm alarm-list' for the failed host.
PASS: AIO-DX In service: volume inactive status should be detected
      and a critical alarm should be raised with ID 200.016. Node
      availability should be changed to degraded.
PASS: AIO-SX In service: volume inactive status should be detected
      and a critical alarm should be raised with ID 200.016. Node
      availability should be changed to degraded.
PASS: Standard (2 controllers, 1 storage, 1 compute) In service:
      volume inactive status should be detected and a
      critical alarm should be raised with ID 200.016. Node
      availability should be changed to degraded.
PASS: AIO-DX In service: If volume becomes active and a LUKS alarm
      is active, alarm should be cleared. Node availability should
      be changed to available.
PASS: AIO-SX In service: If volume becomes active and a LUKS alarm is
      active, alarm should be cleared. Node availability should be
      changed to available.
PASS: Standard (2 controllers, 1 storage, 1 compute) In service:
      If volume becomes active and a LUKS alarm is active, alarm
      should be cleared. Node availability should be changed to
      available.
PASS: AIO-SX, AIO-DX, Standard: If intest fails and node availability
      is 'failed', a lock/unlock after fixing the volume issue should
      make the node available.
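
The behaviour exercised by the test plan above can be summarised as a
small decision function. This is only an illustrative sketch: the names
Phase, Availability, LuksOutcome and evaluate_luks are hypothetical and
are not taken from the mtce source. Only the alarm ID 200.016 and the
expected availability states come from the test plan.

```cpp
#include <string>

// Hypothetical types for illustration; not the actual mtce definitions.
enum class Phase { Enable, InService };
enum class Availability { Available, Degraded, Failed };

struct LuksOutcome {
    Availability availability;  // expected node availability
    bool alarm_raised;          // true when critical alarm 200.016 is expected
};

// Alarm ID quoted in the test plan for the LUKS alarm.
const std::string kLuksAlarmId = "200.016";

// Maps the luks-fs-mgr service state to the outcome the test plan expects.
// Assumption: a node that failed during enable only re-enters the Enable
// phase through a lock/unlock, as described in the last test case.
LuksOutcome evaluate_luks(Phase phase, bool luks_active) {
    if (luks_active) {
        // Volume is usable: no alarm (any existing one is cleared).
        return {Availability::Available, false};
    }
    if (phase == Phase::Enable) {
        // Inactive during node enable: enable fails, alarm is raised.
        return {Availability::Failed, true};
    }
    // Inactive while in service: node is degraded, alarm is raised.
    return {Availability::Degraded, true};
}
```

For example, evaluate_luks(Phase::InService, false) yields a degraded
node with the alarm raised, matching the in-service test cases.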

Story: 2010872
Task: 49108

Change-Id: I4621e7c546078c3cc22fe47079ba7725fbea5c8f
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
2023-12-06 00:34:02 -05:00
alarm        Failure case handling of LUKS service                          2023-12-06 00:34:02 -05:00
common       Failure case handling of LUKS service                          2023-12-06 00:34:02 -05:00
fsmon        Fix remaining failing mtce services on Debian                  2022-01-25 12:10:39 -03:00
fsync        Decouple Guest-server/agent from stx-metal                     2018-09-18 17:15:08 -04:00
heartbeat    Remove swerr log in hbsAgent cluster delete                    2021-06-14 19:04:33 -04:00
hostw        Change hostwd emergency log to write to /dev/kmsg              2023-02-01 23:41:14 +00:00
hwmon        Re-enable sensor suppression support in Mtce Hardware Monitor  2022-08-06 00:02:29 +00:00
lmon         Fix failing mtce services on Debian                            2022-01-14 10:50:09 -03:00
maintenance  Failure case handling of LUKS service                          2023-12-06 00:34:02 -05:00
mtclog       Set restricted permissions for mtce logfiles                   2019-07-17 18:19:52 -04:00
pmon         Fix bashate failure in zuul                                    2022-10-06 17:22:12 +00:00
public       Fix mtce build error with gcc-8.2.1                            2020-04-03 14:44:21 +08:00
scripts      Failure case handling of LUKS service                          2023-12-06 00:34:02 -05:00
LICENSE      Decouple Guest-server/agent from stx-metal                     2018-09-18 17:15:08 -04:00
Makefile     Remove Resource Monitor ; aka rmon, from the load              2019-03-19 16:12:38 -04:00