The hostwd process handles two pmon quorum failure modes:
1. persistent pmon quorum process failure
2. persistent absence of pmon's quorum health report
This update adds a new configuration option (kdump_on_stall) and
the implementation needed to force a crash dump for failure
mode 2 above.
If the Process Monitor (pmon) itself stalls or stops running for
3 minutes (the default), hostwd triggers a SysRq to force a
crash dump.
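
For illustration only, the decision described above can be sketched as
follows. This is not the hostwd implementation (hostwd is not written in
Python). The names WatchdogConfig, stall_threshold_secs, stall_action and
force_crash_dump are hypothetical, and the "skip" result is an assumption:
the behaviour when kdump is disabled or inactive is not described here.
Only the 3 minute default, the kdump_on_stall option name and the use of
SysRq come from this change description.

```python
from dataclasses import dataclass

# Standard Linux SysRq interface; writing "c" to it crashes the kernel,
# which produces a crash dump when kdump is configured and active.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


@dataclass
class WatchdogConfig:
    # Hypothetical field name; 180 s mirrors the 3 minute default above.
    stall_threshold_secs: int = 180
    # Option name taken from the test plan in this commit message.
    kdump_on_stall: bool = True


def stall_action(secs_since_last_report: float,
                 cfg: WatchdogConfig,
                 kdump_active: bool) -> str:
    """Decide what to do about a missing pmon quorum health report."""
    if secs_since_last_report < cfg.stall_threshold_secs:
        return "none"          # pmon has not been silent long enough
    if cfg.kdump_on_stall and kdump_active:
        return "crash_dump"    # failure mode 2 with kdump available
    # Assumption: with the option off or kdump inactive, no dump is forced.
    return "skip"


def force_crash_dump(trigger_path: str = SYSRQ_TRIGGER) -> None:
    """Write the SysRq 'c' command. On the real path this crashes the host."""
    with open(trigger_path, "w") as trigger:
        trigger.write("c")
```

A caller would invoke force_crash_dump() only when stall_action() returns
"crash_dump"; the trigger_path parameter exists so the write can be
exercised against a scratch file instead of the live SysRq interface.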
Test Plan:
PASS: Verify kdump for pmon quorum health report message loss
PASS: Verify no kdump when kdump_on_stall is disabled
PASS: Verify handling when kdump service is not active
PASS: Verify sighup config change detection and handling
Regression:
PASS: Verify softdog timeout handling and logs
PASS: Verify quorum threshold config change and handling
PASS: Verify handling with reboot/reset recovery methods disabled
PASS: Verify enable reboot_on_err config change handling
PASS: Verify reboot/reset actions are ignored while host is locked
PASS: Verify pmon failure recovery handling before threshold reached
Change-Id: Id926447574e02013f83c0170784e2a8f9a46bac1
Partial-Bug: 1894889
Depends-On: https://review.opendev.org/#/c/750806
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>