StarlingX High Availability/Process Monitoring/Service Management
Eric MacDonald 23d0d8ab2f Add node locked gate to SM enable
Service Management (SM) sometimes selects and activates services on a
locked controller following a dead office recovery (DOR).

This update adds a node locked check to SM's enable handler. The check
blocks enable if the node locked file is present, much like the existing
goenabled check in the same function blocks enable if goenabled is not
present.

The enable gate file is /etc/mtc/tmp/.node_locked on the local host.

Maintenance manages the presence or absence of this file based on
the node's administrative state.
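
The two gates described above can be sketched as follows. This is a minimal illustration in Python, not SM's actual enable handler, which is not shown here. The function name `enable_allowed` and its parameters are hypothetical, and the goenabled path is passed in by the caller because this text does not state it. Only the node locked path is taken from the commit message.

```python
import os

# Path taken from the commit message; Maintenance creates or removes this
# file based on the node's administrative state.
NODE_LOCKED_FILE = "/etc/mtc/tmp/.node_locked"


def enable_allowed(goenabled_file, node_locked_file=NODE_LOCKED_FILE):
    """Return True only if both enable gates pass.

    Hypothetical helper: the name, signature, and structure are
    illustrative assumptions, not SM's real code.
    """
    # Existing gate: enable is blocked while the goenabled file is absent.
    if not os.path.exists(goenabled_file):
        return False
    # New gate: enable is blocked while the node locked file is present.
    if os.path.exists(node_locked_file):
        return False
    return True
```

Under these assumptions the two checks are mirror images: one gate opens when its file appears, the other opens when its file disappears.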

This update also cleans up some extra whitespace in the changed file.

Test Plan:

PASS: Verify system build.
PASS: Verify AIO DX install.
PASS: Verify Standard DX system install with worker and storage.

For Both 'AIO DX' and 'Standard DX with worker and storage':

PASS: Verify SM does not activate on a locked controller.
PASS: ... DOR case
PASS: ... Uncontrolled Swact case
PASS: Verify Standard DX behavior over DOR with one locked controller
      while the only unlocked controller does not recover.
PASS: Verify behavior after above test case once the only unlocked
      controller does recover.
PASS: Verify lock of the standby controller and its SM logs.
PASS: Verify manually creating the new Nv node locked file on the
      active controller will cause SM to go disabled and shut down all
      services on that controller.
      ... If there is another unlocked controller then verify it
          takes over as an uncontrolled swact.
      ... If there is no unlocked standby controller then verify SM
          remains shut down until the manually created Nv node locked
          file is removed, at which point SM proceeds to activate
          services on that controller again.

Regression:

PASS: Verify controlled swact with unlocked enabled standby.
PASS: Verify uncontrolled swact with unlocked enabled standby.
PASS: Verify standby controller lock/unlock soak loop (10).
PASS: Verify swact loop soak (10).
PASS: Verify no crash or core dumps.

Closes-Bug: 2051578
Change-Id: I0f0e3d199586513ddce484fdcc056e1b2562b45f
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2024-02-14 13:01:03 +00:00
api-ref/source Switch to newer openstackdocstheme and reno versions 2020-06-04 14:27:03 +02:00
devstack Remove sm-watchdog service since NFS is now stable 2022-08-19 19:57:43 +00:00
doc Fix tox-docs failing sphinx 2022-05-31 14:18:44 +00:00
releasenotes Switch to newer openstackdocstheme and reno versions 2020-06-04 14:27:03 +02:00
service-mgmt Add node locked gate to SM enable 2024-02-14 13:01:03 +00:00
service-mgmt-api Update debian package versions to use git commits 2023-02-10 10:14:48 -08:00
service-mgmt-client Update debian package versions to use git commits 2023-02-10 10:14:48 -08:00
service-mgmt-tools Merge "Add admin network support to SM" 2023-02-16 21:02:00 +00:00
stx-ocf-scripts Update debian package versions to use git commits 2023-02-10 10:14:48 -08:00
.gitignore [Doc] OpenStack API Reference Guide 2018-09-27 10:14:44 -07:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:24 +00:00
.zuul.yaml Fix github mirroring for this repo 2023-04-28 12:38:51 -04:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:36:26 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:26 -07:00
README.rst starlingx/ha README improvement 2023-07-19 12:28:24 -03:00
bindep.txt starlingx/ha README improvement 2023-07-19 12:28:24 -03:00
centos_build_layer.cfg Build layering, add layer build config file 2019-10-21 10:53:26 +08:00
centos_dev_wheels.inc Add sm-client-wheels to tarball 2019-11-14 10:55:52 -05:00
centos_iso_image.inc Config file changes to add 'stx-ocf-scripts ' after relocation from 'stx-upstream' 2019-09-04 15:59:21 -04:00
centos_pkg_dirs Remove version from sm folder 2019-09-26 14:11:31 -05:00
centos_stable_wheels.inc Add sm-client-wheels to tarball 2019-11-14 10:55:52 -05:00
debian_build_layer.cfg Add debian_build_layer.cfg file 2021-10-05 14:33:19 -04:00
debian_iso_image.inc Debian: fa: update debian_iso_image.inc 2022-11-16 12:01:26 +08:00
debian_pkg_dirs Add debian_pkg_dirs for ha 2021-10-27 18:59:20 +00:00
github_sync.trigger Verify upload to GitHub mirror with a new commit 2020-02-04 11:54:18 -05:00
pylint.rc Update zuul jobs from python2 to python3 2023-02-07 20:20:57 +00:00
test-requirements.txt Fix zuul errors due to changes in dependencies 2021-04-26 11:41:59 -04:00
tox.ini Fix zuul pep8 failures related to bugbear 2023-02-14 16:47:42 +00:00

README.rst

ha

The starlingx/ha repository handles High Availability services [1].

Its key component is the StarlingX Service Management (SM), which coordinates the StarlingX services.

This repository is not intended to be developed standalone, but rather as part of the StarlingX Source System, which is defined by the StarlingX manifest [2].

References


  1. https://docs.starlingx.io/api-ref/ha

  2. https://opendev.org/starlingx/manifest.git