Commit Graph

5 Commits

Author SHA1 Message Date
Eric MacDonald 4c541f50d4 Maintenance Redfish support useability enhancements.
This update is a result of changes made during a suite of
end-to-end provisioning, reprovisioning and deprovisioning
customer exterience testing of the maintenance RedFish support
feature.

1. Force reconnection and password fetch on provisioning changes
2. Force reconnection and password fetch on persistent connection failures
3. Fix redfish protocol learning (string compare) in hardware monitor
4. Improve logging for some typical error paths.

Test Plan:

PASS: Verify handling of reprovisioning BMC between hosts that support
             different protocols.
PASS: Verify handling of reprovisioning ip address to host that leads to a
             different protocol select.
PASS: Verify manual relearn handling to recover from errors that result from
             the above case.
PASS: Verify host BMC deprovisioning handling and cleanup.
PASS: Verify sensor monitoring.
PASS: Verify hwmond sticks with a selected protocol once a sensor model
             has been created using that protocol.
PASS: Verify handling of BMC reprovision - ip address change only
PASS: Verify handling of BMC reprovision - username change only
FAIL: Verify handling of BMC reprovision - password change only
             https://bugs.launchpad.net/starlingx/+bug/1846418

Change-Id: I4bf52a5dc3c97d7794ff623c881dff7886234e79
Closes-Bug: #1846212
Story: 2005861
Task: 36606
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-10-03 11:57:58 -04:00
Eric MacDonald df9343b0cc Add redfish power/reset/reinstall bmc support to maintenance
This update delivers redfish support for Power-On/Off, Reset
and Netboot Reinstall handling to maintenance.

Test Plan: (Testing Continues)

PASS: Verify Redfish Power-Off action handling
PASS: Verify Redfish Power-On action handling
PASS: Verify Redfish Reset action handling
PASS: Verify compute Redfish Reinstall action handling from controller-0
PASS: Verify compute Redfish Reinstall action handling from controller-1
PASS: Verify Redfish Power-Off Action failure handling
PASS: Verify Redfish Power-On action failure handling
PASS: Verify Redfish Reset action failure handling
PASS: Verify Redfish Re-Install action failure handling
PASS: verify Reset progression cycle does not leak memory.
PASS: Verify bmc_handler failure handling does not leak memory.
PASS: Verify Inservice BMC access (ping) failure and recovery handling.
PASS: Verify BMC access failure alarm handling
PASS: Verify BMC provisioning and deprovisioning soak (redfish - wolfpass)
PASS: Verify BMC provisioning and deprovisioning does not leak memory.
PASS: Verify BMC provisioning handling with bad ip and/or bad username
PASS: Verify BMC reprovisioning to same protocol
PASS: Verify BMC reprovisioning from ipmi host to redfish host
PASS: Verify BMC reprovisioning from redfish host to ipmi host
PASS: Verify mixed protocol support in same lab
PASS: Verify mixed server support in same lab
PASS: Verify Large System Install with BMCs provisioned (wp8-12)
PASS: Verify bmc access method (learn,ipmi,redfish) learned from mtc.init
PASS: Verify Swact with BMCs provisioned.
PASS: Verify no segfaults.
PASS: Verify AIO System Install in lab that supports redfish (WC3-6, WP8-12, Dell 720 3-7)
PASS: Verify AIO Simplex Install with Redfish Support (SM1, SM3)
PASS: Verify AIO Duplex Install with Redfish Support (SM 5-6, Dell 720 1-2

Useability:

PASS: Verify handling of reprovisioning BMC between hosts that support
             different protocols.
PASS: Verify handling of reprovisioning ip address to host that leads to a
             different protocol select.
PASS: Verify manual relearn handling to recover from errors that result from
             the above case.
PASS: Verify host BMC deprovisioning handling and cleanup.
PASS: Verify sensor monitoring.
PASS: Verify fault insertion for both protocols and action handling.
PASS: Verify protocol select handover.
PASS: Verify hwmond sticks with a selected protocol once a sensor model
             has been created using that protocol.
PASS: Verify handling of missing bmc_access_method configuration select.
PASS: Verify inservice bmc_access_method service parameter modification handling.

Regression:

PASS: Verify redfish BMC info query logging.
PASS: Verify sensor monitoring and alarming still works.
PASS: Verify all power/reset/netboot commands for IPMI
PASS: Verify reprovisioning soak of Wolfpass servers
PASS: Verify reprovisioning soak of SM servers

Depends-on: https://review.opendev.org/#/c/679178/
Change-Id: I984057e04d7426e37d675cf4d334a4e35419f2e8
Story: 2005861
Task: 35826
Task: 36606
Task: 36467
Task: 36456
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-09-26 15:59:35 -04:00
Eric MacDonald a4238c2a35 Add 50 byte hostname support to maintenance
Hosts with hostnames longer than 31 characters do not
go online (locked-disabled-online) after installation.

This update enhances maintenance messaging to support
up to 50 byte/character hostnames.

System Install:
---------------
PASS: Verify system install
PASS: Verify AIO system install (regression)
PASS: Verify system install with long hostnames, deployment-config.yaml
PASS: Verify mtcAgent process startup/restart logs
PASS: Verify hbsAgent process startup/restart logs (active controller)
PASS: Verify hbsAgent process startup/restart logs (standby controller)
PASS: Verify hwmond process startup/restart logs
PASS: Verify guestAgent process startup/restart logs
PASS: Verify all common maintenance daemons startup/restart logs
PASS: Verify patch applies and removes cleanly

PASS: Verify long hostname Add ; inventory distribution
PASS: Verify short hostname Add ; inventory distribution

Long Hostname Handling:
-----------------------
PASS: Verify host name support for up to 50 and 51 byte hostnames

Heartbeat Monitoring:
---------------------
PASS: Verify cluster-host interface link down handling.
PASS: Verify graceful recovery from host reboot.
PASS: Verify pmond process failure and recovery cycle.

Maintenance Actions:
--------------------
PASS: Verify host install with 50 byte hostname
PASS: Verify host lock
PASS: Verify host unlock
PASS: Verify host reboot
PASS: Verify host reinstall
PASS: Verify host delete (no core dump / all daemon logs)
PASS: Verify host power-off
PASS: Verify host power-on
PASS: Verify BMC State Info
PASS: Verify lock and unlock storage node
PASS: Controller Swact over and Back
PASS: Verify thresholded heartbeat failure handling
PASS: Verify node locked flag file
PASS: Verify no core dumps during testiong

Hardware Monitor:
-----------------
PASS: Verify BMC Provisioning/Reprovisioning/Deprovisioning
PASS: Verify Inventory Add/Delete/Modify
PASS: Verify Sensor Model and Monitoring
PASS: Verify Sensor Model Relearn
PASS: Verify Alarming and Logs
PASS: Verify Sensor Action, Interval modification
PASS: Verify Critical Sensor Action handling (ignore, log, alarm, reset, power cycle)

Guest Agent:
------------
PASS: Verify inventory add and delete

Process Monitor:
----------------
PASS: Verify process monitor logs
PASS: Verify process monitor events into mtcAgent
PASS: Verify process monitor failure alarming and recovery clear.
PASS: Verify process monitor regression script (test-pmon.sh -c restart)
PASS: Verify process monitor regression script (test-pmon.sh -c kill)
PASS: Verify process monitor regression script (test-pmon-action.sh)
PASS: Verify critical process failure handling
PASS: Verify major process failure handling

Collectd Monitoring:
-----------------
PASS: Verify collectd monitoring for long hostname hosts

Regression:
-----------
PASS: Verify mtce daemon sigal handling (test-signals.sh)

Change-Id: If22ab081397ec1e8b24f20aad8c99f8079cb98a5
Closes-Bug: 1824429
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-07-12 12:20:08 +00:00
Alex Kozyrev 506ef3fd7f MTCE: reading BMC passwords from Barbican secret storage.
Use Openstack Barbican API to retrieve BMC passwords stored by SysInv.
See SysInv commit for details on how to write password to Barbican.
MTCE is going to find corresponding secret by host uuid and retrieve
secret payload associated with it. mtcSecretApi_get is used to find
secret reference, based on a hostname. mtcSecretApi_read is used to
read a password using the reference found on a prevoius step.
Also, did a little cleanup and removed old unused token handling code.

Depends-On: I7102a9662f3757c062ab310737f4ba08379d0100
Change-Id: I66011dc95bb69ff536bd5888c08e3987bd666082
Story: 2003108
Task: 27700
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>
2019-02-14 09:04:46 -05:00
Jim Gauld 6a5e10492c Decouple Guest-server/agent from stx-metal
This decouples the build and packaging of guest-server, guest-agent from
mtce, by splitting guest component into stx-nfv repo.

This leaves existing C++ code, scripts, and resource files untouched,
so there is no functional change. Code refactoring is beyond the scope
of this update.

Makefiles were modified to include devel headers directories
/usr/include/mtce-common and /usr/include/mtce-daemon.
This ensures there is no contamination with other system headers.

The cgts-mtce-common package is renamed and split into:
- repo stx-metal: mtce-common, mtce-common-dev
- repo stx-metal: mtce
- repo stx-nfv: mtce-guest
- repo stx-ha: updates package dependencies to mtce-pmon for
  service-mgmt, sm, and sm-api

mtce-common:
- contains common and daemon shared source utility code

mtce-common-dev:
- based on mtce-common, contains devel package required to build
  mtce-guest and mtce
- contains common library archives and headers

mtce:
- contains components: alarm, fsmon, fsync, heartbeat, hostw, hwmon,
  maintenance, mtclog, pmon, public, rmon

mtce-guest:
- contains guest component guest-server, guest-agent

Story: 2002829
Task: 22748

Change-Id: I9c7a9b846fd69fd566b31aa3f12a043c08f19f1f
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
2018-09-18 17:15:08 -04:00