2fc05673d1
The hostwd process supports failure handling for two pmon quorum failure modes.

1. persistent pmon quorum process failure
2. persistent absence of pmon's quorum health report

This update adds a new configuration option and associated implementation required to force a crash dump action for failure mode 2 above.

This means that if the Process Monitor itself gets stalled or stops running for 3 (default config) minutes then the hostwd will trigger a SysRq to force a crash dump.

Test Plan:

PASS: Verify kdump for pmon quorum health report message loss
PASS: Verify no kdump when kdump_on_stall is disabled
PASS: Verify handling when kdump service is not active
PASS: Verify sighup config change detection and handling

Regression:

PASS: Verify softdog timeout handling and logs
PASS: Verify quorum threshold config change and handling
PASS: Verify handling with reboot/reset recovery methods disabled
PASS: Verify enable reboot_on_err config change handling
PASS: Verify reboot/reset actions are ignored while host is locked
PASS: Verify pmon failure recovery handling before threshold reached

Change-Id: Id926447574e02013f83c0170784e2a8f9a46bac1
Partial-Bug: 1894889
Depends-On: https://review.opendev.org/#/c/750806
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
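The stall handling described in the commit message can be pictured with a small decision model. This is only a sketch under stated assumptions, not the hostwd implementation (which lives in the mtce sources). The option name `kdump_on_stall` and the 3 minute default come from the commit message. The class name, the method names, the action strings, and the choice to fall back to logging when the kdump service is not active are hypothetical and used purely for illustration.

```python
from dataclasses import dataclass

# "3 (default config) minutes" from the commit message, expressed in seconds.
DEFAULT_STALL_SECONDS = 3 * 60


@dataclass
class QuorumStallMonitor:
    """Hypothetical model of failure mode 2: no pmon quorum health report."""

    kdump_on_stall: bool = True                   # option name from the test plan
    stall_seconds: float = DEFAULT_STALL_SECONDS  # assumed to be configurable
    last_report: float = 0.0                      # time of the last health report

    def report_received(self, now: float) -> None:
        """Record the arrival of a pmon quorum health report."""
        self.last_report = now

    def action(self, now: float, kdump_active: bool) -> str:
        """Return the action this sketch would take at time `now`."""
        if now - self.last_report < self.stall_seconds:
            return "none"        # reports are still arriving in time
        if not self.kdump_on_stall:
            return "log-only"    # feature disabled: no crash dump
        if not kdump_active:
            return "log-only"    # assumption: without kdump, a crash is pointless
        # The real service would trigger a SysRq here; on Linux, writing "c"
        # to /proc/sysrq-trigger forces a crash. The sketch only names the action.
        return "crash-dump"
```

For example, a monitor that last saw a report at `t = 100` returns `"none"` when queried at `t = 279` and `"crash-dump"` at `t = 280`, provided `kdump_on_stall` is enabled and the kdump service is reported active.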
||
---|---|---|
api-ref/source | ||
bsp-files | ||
devstack | ||
doc | ||
installer | ||
kickstart | ||
mtce | ||
mtce-common | ||
mtce-compute | ||
mtce-control | ||
mtce-storage | ||
releasenotes | ||
tools/rvmc/centos | ||
.gitignore | ||
.gitreview | ||
.zuul.yaml | ||
CONTRIBUTORS.wrs | ||
LICENSE | ||
README.rst | ||
centos_build_layer.cfg | ||
centos_iso_image.inc | ||
centos_pkg_dirs | ||
centos_stable_docker_images.inc | ||
pylint.rc | ||
test-requirements.txt | ||
tox.ini |
README.rst
metal
StarlingX Bare Metal Management