StarlingX High Availability/Process Monitoring/Service Management
Bin Qian a85ffc695e Shorten rabbit failure recovery delay
In rare cases, when the system runs slowly with significant scheduling
delay, the rabbit disable action times out repeatedly. As a last resort, sm
reboots the impacted controller for recovery once the failure count reaches
MAX_TRANSITION_FAILURES. Because the rabbit service disable timeout is set
to 60 seconds, this results in a significant delay before the recovery reboot.

This change lowers MAX_TRANSITION_FAILURES for the rabbit service from
16 to 5 to shorten the recovery delay after a rabbit failure.
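
As a rough illustration of the delay involved, the sketch below multiplies the
failure limit by the 60-second disable timeout. It assumes every disable
attempt runs for the full timeout and ignores any wait between retries, so it
is only an estimate. The function name and its parameters are hypothetical and
are not taken from the sm source.

```python
# Hypothetical helper for illustration only; not part of the sm code base.
def estimated_reboot_delay(max_transition_failures: int,
                           disable_timeout_s: int = 60) -> int:
    """Estimate the seconds spent in disable timeouts before sm escalates to reboot.

    Assumes each failed disable attempt consumes the full timeout and that
    no extra time passes between attempts.
    """
    return max_transition_failures * disable_timeout_s


old_delay = estimated_reboot_delay(16)  # limit before this change
new_delay = estimated_reboot_delay(5)   # limit after this change

print(f"before: {old_delay} s, after: {new_delay} s")
```

Under these assumptions the time spent waiting on timeouts drops from about
16 minutes to 5 minutes.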

TCs passed:
    Install a DX system
    Observed service group recovery escalate to reboot after 5 forced
    rabbit disable failures.

Closes-bug: 2016168
Signed-off-by: Bin Qian <bin.qian@windriver.com>
Change-Id: I660a64f0e78b6564456eb26245b672d2549f9a3b
2023-05-09 03:48:48 +00:00
api-ref/source Switch to newer openstackdocstheme and reno versions 2020-06-04 14:27:03 +02:00
devstack Remove sm-watchdog service since NFS is now stable 2022-08-19 19:57:43 +00:00
doc Fix tox-docs failing sphinx 2022-05-31 14:18:44 +00:00
releasenotes Switch to newer openstackdocstheme and reno versions 2020-06-04 14:27:03 +02:00
service-mgmt Shorten rabbit failure recovery delay 2023-05-09 03:48:48 +00:00
service-mgmt-api Update debian package versions to use git commits 2023-02-10 10:14:48 -08:00
service-mgmt-client Update debian package versions to use git commits 2023-02-10 10:14:48 -08:00
service-mgmt-tools Merge "Add admin network support to SM" 2023-02-16 21:02:00 +00:00
stx-ocf-scripts Update debian package versions to use git commits 2023-02-10 10:14:48 -08:00
.gitignore [Doc] OpenStack API Reference Guide 2018-09-27 10:14:44 -07:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:24 +00:00
.zuul.yaml Update zuul jobs from python2 to python3 2023-02-07 20:20:57 +00:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:36:26 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:26 -07:00
README.rst Followup opendev cleanup and test jobs 2019-04-21 14:31:33 -05:00
bindep.txt Update zuul jobs from python2 to python3 2023-02-07 20:20:57 +00:00
centos_build_layer.cfg Build layering, add layer build config file 2019-10-21 10:53:26 +08:00
centos_dev_wheels.inc Add sm-client-wheels to tarball 2019-11-14 10:55:52 -05:00
centos_iso_image.inc Config file changes to add 'stx-ocf-scripts ' after relocation from 'stx-upstream' 2019-09-04 15:59:21 -04:00
centos_pkg_dirs Remove version from sm folder 2019-09-26 14:11:31 -05:00
centos_stable_wheels.inc Add sm-client-wheels to tarball 2019-11-14 10:55:52 -05:00
debian_build_layer.cfg Add debian_build_layer.cfg file 2021-10-05 14:33:19 -04:00
debian_iso_image.inc Debian: fa: update debian_iso_image.inc 2022-11-16 12:01:26 +08:00
debian_pkg_dirs Add debian_pkg_dirs for ha 2021-10-27 18:59:20 +00:00
github_sync.trigger Verify upload to GitHub mirror with a new commit 2020-02-04 11:54:18 -05:00
pylint.rc Update zuul jobs from python2 to python3 2023-02-07 20:20:57 +00:00
test-requirements.txt Fix zuul errors due to changes in dependencies 2021-04-26 11:41:59 -04:00
tox.ini Fix zuul pep8 failures related to bugbear 2023-02-14 16:47:42 +00:00

README.rst

ha

StarlingX Service Management