StarlingX Bare Metal and Node Management, Hardware Maintenance
Eric MacDonald 1196056612 Disable Redfish BMC audit and improve reinstall failure handling
The Mtce Reinstall Handler can collide with the BMC Redfish
audit, resulting in reinstall failure. The BMC handler's 2 minute
connection audit can collide with other BMC commands.

The reinstall handler, with 4 bmc command operations, is
particularly susceptible.

Two additional bmc communication improvements are implemented:

1. Add 'retry' handling to all BMC requests in the Maintenance
   Reinstall Handler FSM to handle transient command failures.

   Note: Retries already exist for all but the power status
   query and the netboot requests in that handler, and for
   other administrative commands that involve bmc requests.

2. Switch BMC power control command management from 'static' to
   'learned' lists. Some BMCs don't support both the graceful and
   the immediate form of a power command; for example, Graceful
   Restart and Force Restart. To remove the possibility of using
   an unsupported BMC command, this update switches from static to
   learned power command lists, with a log produced if a server is
   missing command support.

   Power commands escalate from graceful to immediate in the
   presence of retries.
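
The learned-list and escalation behavior described above can be
sketched as follows. This is a minimal illustration in Python, not
code taken from the mtce source: the function names, the PREFERRED
grouping, and the retry count are all hypothetical. The command
strings are standard Redfish ResetType values, and the 'allowable'
list stands in for whatever the BMC reports as supported.

```python
import logging
from typing import Callable, Dict, List

log = logging.getLogger("bmc_sketch")

# Hypothetical grouping of Redfish ResetType values per action,
# ordered graceful first, immediate last.
PREFERRED: Dict[str, List[str]] = {
    "reset": ["GracefulRestart", "ForceRestart"],
    "poweroff": ["GracefulShutdown", "ForceOff"],
    "poweron": ["On", "ForceOn"],
}


def learn_power_commands(allowable: List[str]) -> Dict[str, List[str]]:
    """Build per-action command lists from what the BMC says it supports."""
    learned: Dict[str, List[str]] = {}
    for action, candidates in PREFERRED.items():
        learned[action] = [c for c in candidates if c in allowable]
        missing = [c for c in candidates if c not in allowable]
        if missing:
            # Log when a server is missing command support.
            log.warning("BMC lacks %s command support: %s", action, missing)
    return learned


def pick_command(learned: Dict[str, List[str]], action: str, attempt: int) -> str:
    """First attempt uses the graceful command if one was learned;
    any retry escalates to the immediate command."""
    commands = learned[action]
    if not commands:
        raise LookupError(f"no supported command learned for {action}")
    return commands[0] if attempt == 0 else commands[-1]


def send_with_retries(send: Callable[[str], bool],
                      learned: Dict[str, List[str]],
                      action: str,
                      max_retries: int = 3) -> str:
    """Send a power command, retrying on failure. Returns the command
    that succeeded. 'send' is a caller-supplied stand-in for the BMC request."""
    for attempt in range(max_retries + 1):
        command = pick_command(learned, action, attempt)
        if send(command):
            return command
        log.info("%s attempt %d with %s failed", action, attempt, command)
    raise RuntimeError(f"{action} failed after {max_retries} retries")
```

With a BMC that reports only On, ForceOff, ForceRestart and
GracefulShutdown, the sketch learns a single-entry reset list
(ForceRestart), so an unsupported GracefulRestart is never sent,
while a power-off starts with GracefulShutdown and falls back to
ForceOff on the first retry.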

Test Cases:

PASS: Verify bmc handler redfish audit is disabled
PASS: Verify reinstall soak using redfish
PASS: Verify reinstall netboot and power status retry handling
PASS: Verify all power control commands using redfish
PASS: Verify graceful operations are used if available
PASS: Verify immediate operations are used for retries

Regression:

PASS: Verify bmc ping audit success and failure handling

PASS: Verify Reset        Handling soak (redfish and ipmi)
PASS: Verify Power-Off/On Handling soak (redfish and ipmi)
PASS: Verify Reinstall    Handling soak (redfish and ipmi)
PASS: Verify Standard System Install    (redfish and ipmi)
PASS: Verify AIO DX   System Install    (redfish and ipmi)

PASS: Verify this update as a patch

Change-Id: Idb484512ccb1b16e2d0ea9aff4ab7965347b1322
Closes-Bug: 1880578
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-11-16 15:15:22 +00:00
api-ref/source Switch to newer openstackdocstheme and reno versions 2020-06-04 14:32:46 +02:00
bsp-files Revert "Enable 'rcu_nocb_poll' kernel config option" 2020-10-27 13:53:04 +00:00
devstack Security: Handle nospectre_v1 in the bootargs 2020-01-28 18:21:13 -05:00
doc Switch to newer openstackdocstheme and reno versions 2020-06-04 14:32:46 +02:00
installer Remove reference to cgcs-centos-repo 2020-09-18 16:04:44 -04:00
kickstart Drop isolcpu from AIO/worker kickstarts 2020-06-19 02:08:28 -04:00
mtce Disable Redfish BMC audit and improve reinstall failure handling 2020-11-16 15:15:22 +00:00
mtce-common Disable Redfish BMC audit and improve reinstall failure handling 2020-11-16 15:15:22 +00:00
mtce-compute Add auto-versioning to starlingx/metal mtce packages 2020-05-21 15:18:43 -04:00
mtce-control Fix heartbeat messaging when interface is set to 'lo' 2020-06-26 14:16:41 +00:00
mtce-storage Add auto-versioning to starlingx/metal mtce packages 2020-05-21 15:18:43 -04:00
releasenotes Switch to newer openstackdocstheme and reno versions 2020-06-04 14:32:46 +02:00
tools/rvmc/centos Redfish Virtual Media Controller enhancements 2020-08-17 21:14:50 +00:00
.gitignore Update tox.ini files to use stein constraints 2019-06-25 13:20:35 -04:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:33 +00:00
.zuul.yaml Tox and Zuul job for the bandit code scan in starlingx/metal 2020-06-29 08:24:46 +00:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:36:43 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:43 -07:00
README.rst Followup opendev cleanup and test jobs 2019-04-22 16:42:03 +00:00
centos_build_layer.cfg Build layering, add layer build config file 2019-10-15 19:19:45 +08:00
centos_iso_image.inc Remove unused inventory and python-inventoryclient 2020-01-08 14:12:05 -06:00
centos_pkg_dirs rvmc: remove un-used build data 2020-01-16 08:39:54 -08:00
centos_stable_docker_images.inc Utility to install a server via Redfish 2019-12-31 15:34:54 +00:00
pylint.rc Add pylint checks for python files in metal 2020-01-03 13:27:00 -06:00
test-requirements.txt Tox and Zuul job for the bandit code scan in starlingx/metal 2020-06-29 08:24:46 +00:00
tox.ini Use newer flake8 to run on ubuntu-focal Zuul machines 2020-09-09 17:59:49 -04:00

README.rst

metal

StarlingX Bare Metal Management