metal/mtce-common/src/common
Eric MacDonald 14bb67789e Add pxeboot network mtcAlive messaging to Maintenance
The introduction of the new pxeboot network requires maintenance
verify and report on messaging failures over that network.

Towards that, this update introduces periodic mtcAlive messaging
between the mtcAgent and mtcClinet.

Test Plan:

PASS: Verify install and provision each system type with a mix
             of networking modes ; ethernet, bond and vlan
             - AIO SX, AIO DX, AIO DX plus
             - Standard System 2+1
             - Storage System 2+1+1
PASS: Verify feature with physical on management interface
PASS: Verify feature with vlan on management interface
PASS: Verify feature with bonded management interface
PASS: Verify feature with bonded vlans on management interface
PASS: Verify in bonded cases handling with 2, 1 or no slaves found
PASS: Verify mgmt-combined or separate cluster-host network
PASS: Verify mtcClient pxeboot interface address learning
             - for worker and storage nodes       ; dhcp leases file
             - for controller nodes before unlock ; dhcp leases file
             - for controller nodes after unlock  ; static from ifcfg
             - from controller within 10 seconds of process restart
PASS: Verify mtcAgent pxeboot interface address learning from
             dnsmasq.hosts file
PASS: Verify pxeboot mtcAlive initiation, handling, loss detection
             and recovery
PASS: Verify success and failure handling of all new pxeboot ip
             address learning functions ;
             - dhcp - all system node installs.
             - dnsmasq.hosts - active controller for all hosts.
             - interfaces.d - controller's mtcClient pxeboot address.
             - pxeboot req mtcAlive - mtcAgent mtcAlive request message.
PASS: Verify mtcClient pxeboot network 'mtcAlive request' and 'reboot'
             command handling for ethernet, vlan and bond configs.
PASS: Verify mtcAlive sequence number monitoring, out-of-sequence
             detection, handling and logging.
PASS: Verify pxeboot rx socket binding and non-blocking attribute
PASS: Verify mtcAgent handling stress soaking of sustained incoming
             500+ msgs/sec ; batch handling and logging.
PASS: Verify mtcAgent and mtcClient pxeboot tx and rx socket messaging,
             failure recovery handling and logging.
PASS: Verify pxeboot receiver is not setup on the oam interface on
             controller-0 first install until after initial config
             complete.

Regression:

PASS: Verify mtcAgent/mtcClient online and offline state management
PASS: Verify mtcAgent/mtcClient command handling
      - over management network
      - over cluster-host network
PASS: Verify mtcClient interface chain log for all iface types
      - bond    : vlan123 -> pxeboot0 (802.3ad 4) -> enp0s8 and enp0s9
      - vlan    : vlan123 -> enp0s8
      - ethernet: enp0s8
PASS: Verify mtcAgent/mtcClient handling and logging including debug
      logging for standard operations
      - node install and unlock
      - node lock and unlock
      - node reinstall, reboot, reset
PASS: Verify graceful recovery handling of heartbeat loss failure.
      - node reboot
      - management interface down
PASS: Verify systemcontroller and subcloud install with dc-libvirt
PASS: Verify no log flooding, coredumps, memory leaks

Story: 2010940
Task: 49541
Change-Id: Ibc87b85e3e0e07c3b8c40b5291bd3372506fbdfb
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2024-03-28 15:28:27 +00:00
..
Makefile Add redfish support detection to maintenance 2019-08-19 14:03:37 +00:00
alarmUtil.cpp Debian: Make Mtce offline handler more resilient to slow shutdowns 2022-10-24 15:57:43 +00:00
alarmUtil.h Failure case handling of LUKS service 2023-12-06 00:34:02 -05:00
bmcUtil.cpp Mtce: Add ActionInfo extension support for reset operations. 2022-10-13 17:40:05 +00:00
bmcUtil.h Mtce: Add ActionInfo extension support for reset operations. 2022-10-13 17:40:05 +00:00
fitCodes.h Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
hostClass.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostClass.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostUtil.cpp Add support for peer controller reset via mtcClient 2021-01-14 16:44:14 -05:00
hostUtil.h Add support for peer controller reset via mtcClient 2021-01-14 16:44:14 -05:00
httpUtil.cpp Mtce: Fix bmc password fetch error handling 2022-06-01 15:21:05 +00:00
httpUtil.h Add a wait time between http request retries 2024-02-07 20:33:01 +00:00
ipmiUtil.cpp Add support for peer controller reset via mtcClient 2021-01-14 16:44:14 -05:00
ipmiUtil.h Add support for peer controller reset via mtcClient 2021-01-14 16:44:14 -05:00
jsonUtil.cpp Mtce: Add ActionInfo extension support for reset operations. 2022-10-13 17:40:05 +00:00
jsonUtil.h Remove all nova and libvirt files from mtce-common 2019-03-19 15:23:36 -05:00
keyClass.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
keyClass.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
logMacros.h Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
msgClass.cpp Debian: Redfishtool requests fail when IPV4 address has square brackets 2022-10-06 22:21:38 +00:00
msgClass.h Debian: Redfishtool requests fail when IPV4 address has square brackets 2022-10-06 22:21:38 +00:00
nlEvent.cpp Fix heartbeat messaging when interface is set to 'lo' 2020-06-26 14:16:41 +00:00
nlEvent.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
nodeBase.cpp Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
nodeBase.h Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
nodeEvent.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeEvent.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeMacro.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeTimers.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
nodeTimers.h Improve maintenance power/reset control command retry handling 2024-01-25 22:42:26 +00:00
nodeUtil.cpp Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
nodeUtil.h Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
pingUtil.cpp Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
pingUtil.h Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
redfishUtil.cpp Mtce: Add ActionInfo extension support for reset operations. 2022-10-13 17:40:05 +00:00
redfishUtil.h Mtce: Add ActionInfo extension support for reset operations. 2022-10-13 17:40:05 +00:00
regexUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
regexUtil.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
returnCodes.h Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
secretUtil.cpp Mtce: Fix bmc password fetch error handling 2022-06-01 15:21:05 +00:00
secretUtil.h Mtce: Fix bmc password fetch error handling 2022-06-01 15:21:05 +00:00
threadUtil.cpp Improve mtcAgent interrupted thread cleanup 2021-03-15 10:51:16 -04:00
threadUtil.h Debian: Redfishtool requests fail when IPV4 address has square brackets 2022-10-06 22:21:38 +00:00
timeUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
timeUtil.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
tokenUtil.cpp Remove references to ceilometer in maintenance 2019-04-30 14:28:12 -04:00
tokenUtil.h MTCE: reading BMC passwords from Barbican secret storage. 2019-02-14 09:04:46 -05:00