StarlingX Bare Metal and Node Management, Hardware Maintenance
Eric MacDonald 9bf231a286 Fix BMC access loss handling
Recent refactoring of the BMC handler FSM introduced a code change that
prevents the BMC Access alarm from being raised when BMC accessibility
is initially established and then lost.

This update ensures BMC access alarm management is working properly.

This update also implements ping failure debounce so that a single ping
failure does not trigger full reconnection handling. Reconnection handling
now requires 3 ping failures in a row. This adds a minute to ping failure
action handling before the usual 2 minute BMC access failure alarm is
raised. Ping failure logging is also reduced and improved.
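
As an illustration only, the debounce behavior described above can be
sketched in a few lines of Python. This is not the mtce implementation:
the class name PingDebounce, the record() method and the threshold
parameter are hypothetical names chosen for this sketch. Only the
"3 failures in a row" rule comes from the commit message.

```python
class PingDebounce:
    """Counts consecutive ping failures (hypothetical helper, not mtce code)."""

    def __init__(self, threshold=3):
        # Number of consecutive failures required before acting;
        # 3 matches the value stated in the commit message.
        self.threshold = threshold
        self.failures = 0

    def record(self, ping_ok):
        """Record one ping result.

        Returns True only on the ping that reaches the threshold, so the
        caller starts reconnection handling once per failure streak.
        """
        if ping_ok:
            # Any successful ping clears the streak.
            self.failures = 0
            return False
        self.failures += 1
        return self.failures == self.threshold


debounce = PingDebounce()
# One isolated miss, a recovery, then three misses in a row.
results = [debounce.record(ok) for ok in (False, True, False, False, False)]
print(results)
```

In this sketch the isolated miss is absorbed and only the third
consecutive miss reports that reconnection handling should begin.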

Test Plan: for both hwmond and mtcAgent

PASS: Verify BMC access alarm due to bad provisioning (un, pw, ip, type)
PASS: Verify BMC ping failure debounce handling, recovery and logging
PASS: Verify BMC ping persistent failure handling
PASS: Verify BMC ping periodic miss handling
PASS: Verify BMC ping and access failure recovery timing
PASS: Verify BMC ping failure and recovery handling over BMC link pull/plug
PASS: Verify BMC sensor monitoring stops/resumes over ping failure/recovery

Regression:

PASS: Verify IPv6 System Install using provisioned BMCs (wp8-12)
PASS: Verify BMC power-off request handling with BMC ping failing & recovering
PASS: Verify BMC power-on request handling with BMC ping failing & recovering
PASS: Verify BMC reset request handling with BMC ping failing & recovering
PASS: Verify BMC sensor group read failure handling & recovery
PASS: Verify sensor monitoring after ping failure handling & recovery

Change-Id: I74870816930ef6cdb11f987424ffed300ff8affe
Closes-Bug: 1858110
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-01-03 09:34:37 -05:00
api-ref/source Clean up and standardize landing pages 2019-01-09 09:34:38 -08:00
bsp-files ensure output format of 'subprocess.check_output' is same in Python2/3 2019-12-17 01:59:34 +00:00
devstack Add redfish support detection to maintenance 2019-08-19 14:03:37 +00:00
doc Fix the error links for metal docs 2019-07-03 09:20:25 -04:00
installer Configurable Host HTTP/HTTPS Port Binding 2019-02-06 16:04:07 -06:00
inventory Merge "fix the spelling mistakes" 2019-10-07 14:41:20 +00:00
kickstart Add openSUSE OBS Artifacts for Maintenance services 2019-09-20 09:18:54 -05:00
mtce Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
mtce-common Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
mtce-compute Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
mtce-control Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
mtce-storage Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
python-inventoryclient Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
releasenotes Update config for release notes to include project name 2019-02-05 14:14:17 -08:00
.gitignore Update tox.ini files to use stein constraints 2019-06-25 13:20:35 -04:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:33 +00:00
.zuul.yaml Turn off devstack as a zuul job 2019-10-17 12:58:08 -05:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:36:43 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:43 -07:00
README.rst Followup opendev cleanup and test jobs 2019-04-22 16:42:03 +00:00
centos_build_layer.cfg Build layering, add layer build config file 2019-10-15 19:19:45 +08:00
centos_iso_image.inc Remove Resource Monitor ; aka rmon, from the load 2019-03-19 16:12:38 -04:00
centos_pkg_dirs SysInv Decoupling: Create Inventory Service 2018-12-06 13:17:35 -05:00
test-requirements.txt pep8 job enable and fix pep8 reported issue 2018-09-06 09:45:51 +08:00
tox.ini Update tox.ini files to use stein constraints 2019-06-25 13:20:35 -04:00

README.rst

metal

StarlingX Bare Metal Management