Commit Graph

6 Commits

Author SHA1 Message Date
Eric MacDonald 9ab726b0eb Add support for peer controller reset via mtcClient
This update adds the ability for SM to passively
request the mtcClient to BMC reset its peer controller
as a means to recover a severely loaded active controller.

To do this, the mtcAgent is modified to keep the controllers'
mtcClients updated with the BMC info of their peers.

The mtcClient is modified to audit for the SM signal
and, when it is asserted, to issue a BMC reset of its peer
controller using an ipmitool system call.

The ability to command the peer mtcClient to 'sync'
prior to the BMC reset is implemented but configured
disabled for now.
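
For illustration only, the reset step described above might look like
the following minimal sketch. The PeerBmcInfo structure, both function
names and the password-file approach are assumptions made for this
example and are not taken from the mtcClient source. The ipmitool
options used (-I, -H, -U, -f and 'chassis power reset') are standard
ipmitool options.

```cpp
#include <string>

// Hypothetical container for the peer controller's BMC details; the field
// names are assumptions for this sketch, not the mtcClient's real structure.
struct PeerBmcInfo {
    std::string ip;        // BMC IP address of the peer controller
    std::string username;  // BMC user name
    std::string pw_file;   // path to a file holding the BMC password
};

// Build the ipmitool command line for a power reset of the peer controller.
// -I lanplus selects the IPMI v2.0 LAN interface, -H and -U give the BMC
// host and user, and -f reads the password from a file so that it does not
// appear on the command line.
std::string build_peer_reset_command(const PeerBmcInfo& bmc) {
    return "ipmitool -I lanplus -H " + bmc.ip +
           " -U " + bmc.username +
           " -f " + bmc.pw_file +
           " chassis power reset";
}

// Decide whether a reset should be issued: only when the SM request is
// asserted and the peer's BMC info is fully provisioned.
bool should_reset_peer(bool sm_request_asserted, const PeerBmcInfo& bmc) {
    return sm_request_asserted && !bmc.ip.empty() &&
           !bmc.username.empty() && !bmc.pw_file.empty();
}
```

A caller could pass the resulting string to std::system() once
should_reset_peer() returns true. Real code would also need to validate
the BMC fields before handing them to a shell.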

Change-Id: Ibe4c8aaa3a980cbe5f34c3e22f015698a6453c1a
Partial-Bug: #1895350
Co-Authored-By: Bin.Qian@windriver.com
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-01-14 16:44:14 -05:00
Dean Troyer 83101e95ba Add EXTRALDFLAGS to linker in a number of Makefiles
This allows a DevStack plugin to add its configured STX_INST_DIR
to the linker search path.

Change-Id: I277204cd89767b93eec6c96969fc33d23e04516b
Signed-off-by: Dean Troyer <dtroyer@gmail.com>
2019-02-28 22:34:54 -06:00
Alex Kozyrev 506ef3fd7f MTCE: reading BMC passwords from Barbican secret storage.
Use Openstack Barbican API to retrieve BMC passwords stored by SysInv.
See SysInv commit for details on how to write password to Barbican.
MTCE is going to find corresponding secret by host uuid and retrieve
secret payload associated with it. mtcSecretApi_get is used to find
secret reference, based on a hostname. mtcSecretApi_read is used to
read a password using the reference found in the previous step.
Also, did a little cleanup and removed old unused token handling code.
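
As a rough sketch of the two-step lookup described above, the request
paths could be built as shown below. The helper names are hypothetical
and are not the mtcSecretApi_get / mtcSecretApi_read signatures. The
paths follow the Barbican v1 secrets API (list secrets filtered by
name, then fetch the payload of one secret). The key that the secret is
stored under is left to the caller, and URL encoding and token handling
are omitted.

```cpp
#include <string>

// Step 1: path for listing secrets filtered by name (GET /v1/secrets).
// The response is expected to carry a secret reference URL for each match.
std::string secret_lookup_path(const std::string& secret_name) {
    return "/v1/secrets?name=" + secret_name;
}

// Extract the secret UUID, i.e. the last path segment of a reference URL.
std::string secret_uuid_from_ref(const std::string& secret_ref) {
    std::string ref = secret_ref;
    // Drop any trailing '/' so the last segment is never empty.
    while (!ref.empty() && ref.back() == '/') ref.pop_back();
    const std::string::size_type pos = ref.rfind('/');
    return (pos == std::string::npos) ? ref : ref.substr(pos + 1);
}

// Step 2: path for reading the payload of the secret found in step 1
// (GET /v1/secrets/{uuid}/payload).
std::string secret_payload_path(const std::string& secret_ref) {
    return "/v1/secrets/" + secret_uuid_from_ref(secret_ref) + "/payload";
}
```

Splitting the lookup from the read mirrors the get/read split in the
commit: the reference can be found once and reused for later reads.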

Depends-On: I7102a9662f3757c062ab310737f4ba08379d0100
Change-Id: I66011dc95bb69ff536bd5888c08e3987bd666082
Story: 2003108
Task: 27700
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>
2019-02-14 09:04:46 -05:00
Dean Troyer 58b987239f Set SHELL in Makefiles that use bash constructs
A number of Makefiles use '[[' in their test to set
STATIC_ANALYSIS_TOOL_EXISTS.  Set SHELL=/bin/bash

Change-Id: Ie9536d7cafd518f3e65acf38ac5b30aa7536ea79
Signed-off-by: Dean Troyer <dtroyer@gmail.com>
2018-12-07 14:09:48 -06:00
Eric MacDonald 8a223f395d Mtce: Add heartbeat cluster information for SM query
This is part one of a two-part HA Improvements feature that introduces
the collection of heartbeat health at the system level.

The full feature is intended to provide service management (SM)
with the last 2 seconds of maintenance's heartbeat health view that
is reflective of each controller's connectivity to each host
including its peer controller.

The heartbeat cluster summary information is additional information
for SM to draw on when needing to make a choice of which controller
is healthier, if/when to switch over and to ultimately avoid split
brain scenarios in a two controller system.

Feature Behavior: A common heartbeat cluster data structure is
introduced and published to the sysroot for SM. The heartbeat
service populates and maintains a local copy of this structure
with data that reflects the responsiveness for each monitored
network of all the monitored hosts for the last 20 heartbeat
periods. Mtce sends the current cluster summary to SM upon request.
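
A minimal sketch of how such a per-network history could be kept is
shown below. The HeartbeatHistory class, its fields and its methods are
hypothetical and do not reflect the actual cluster data structure. Only
the depth of 20 heartbeat periods comes from the description above.

```cpp
#include <array>
#include <cstddef>

// Number of heartbeat periods retained, per the feature description.
constexpr std::size_t HISTORY_PERIODS = 20;

// Hypothetical history of one controller's view of one monitored network.
class HeartbeatHistory {
    // One entry per heartbeat period.
    struct Entry {
        unsigned enabled;     // hosts expected to respond in this period
        unsigned responding;  // hosts that actually responded
    };

public:
    // Record one period; once the buffer is full the oldest entry is
    // overwritten, so only the last HISTORY_PERIODS periods are kept.
    void record(unsigned hosts_enabled, unsigned hosts_responding) {
        entries_[next_] = Entry{hosts_enabled, hosts_responding};
        next_ = (next_ + 1) % HISTORY_PERIODS;
        if (count_ < HISTORY_PERIODS) ++count_;
    }

    // Number of periods currently retained (at most HISTORY_PERIODS).
    std::size_t size() const { return count_; }

    // Count the retained periods in which at least one host was missing.
    std::size_t degraded_periods() const {
        std::size_t degraded = 0;
        for (std::size_t i = 0; i < count_; ++i) {
            if (entries_[i].responding < entries_[i].enabled) ++degraded;
        }
        return degraded;
    }

    // True when every retained period had all enabled hosts responding.
    bool all_healthy() const { return degraded_periods() == 0; }

private:
    std::array<Entry, HISTORY_PERIODS> entries_{};
    std::size_t next_ = 0;   // slot the next record() call writes to
    std::size_t count_ = 0;  // number of valid entries
};
```

A fixed-size ring buffer keeps memory use constant regardless of how
long the service runs, which suits a structure that is updated every
heartbeat period and summarized on request.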

General flow of cluster feature wrt hbsAgent:

  hbs_cluster_init: general data init
  hbs_cluster_nums: set controller and network numbers
  forever:

    select:
      hbs_cluster_add / hbs_cluster_del: - add/del hosts from mtcAgent
      hbs_sm_handler -> hbs_cluster_send: - send cluster to SM

    heartbeating:
      hbs_cluster_append: add controller cluster to pulse request
      hbs_cluster_update: get controller cluster data from pulse responses
      hbs_cluster_save: save other controller cluster view in cluster vault
      hbs_cluster_log: log cluster state changes (clog)

Test Plan:

  PASS: Verify compute system install
  PASS: Verify storage system install
  PASS: Verify cluster data ; all members of structure
  PASS: Verify storage-0 state management
  PASS: Verify add of second controller
  PASS: Verify add of storage-0 node
  PASS: Verify behavior over Swact
  PASS: Verify lock/unlock of second controller ; overall behavior
  PASS: Verify lock/unlock of storage-0 ; overall behavior
  PASS: Verify lock/unlock of storage-1 ; overall behavior
  PASS: Verify lock/unlock of compute nodes ; overall behavior
  PASS: Verify heartbeat failure and recovery of compute node
  PASS: Verify heartbeat failure and recovery of storage-0
  PASS: Verify heartbeat failure and recovery of controller
  PASS: Verify delete of controller node
  PASS: Verify delete of storage-0
  PASS: Verify delete of compute node
  PASS: Verify cluster when controller-1 active / controller-0 disabled
  PASS: Verify MNFA and recovery handling
  PASS: Verify handling in presence of multiple failure conditions
  PASS: Verify hbsAgent memory leak soak test with continuous SM query.
  PASS: Verify active controller-1 infra network failure behavior.
  PASS: Verify inactive controller-1 infra network failure behavior.

Change-Id: I4154287f6dcf5249be5ab3180f2752ab47c5da3c
Story: 2003576
Task: 24907
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2018-10-05 22:47:17 +00:00
Jim Gauld 6a5e10492c Decouple Guest-server/agent from stx-metal
This decouples the build and packaging of guest-server and guest-agent
from mtce by splitting the guest component into the stx-nfv repo.

This leaves existing C++ code, scripts, and resource files untouched,
so there is no functional change. Code refactoring is beyond the scope
of this update.

Makefiles were modified to include the devel header directories
/usr/include/mtce-common and /usr/include/mtce-daemon.
This ensures there is no contamination with other system headers.

The cgts-mtce-common package is renamed and split into:
- repo stx-metal: mtce-common, mtce-common-dev
- repo stx-metal: mtce
- repo stx-nfv: mtce-guest
- repo stx-ha: updates package dependencies to mtce-pmon for
  service-mgmt, sm, and sm-api

mtce-common:
- contains common and daemon shared source utility code

mtce-common-dev:
- based on mtce-common, contains devel package required to build
  mtce-guest and mtce
- contains common library archives and headers

mtce:
- contains components: alarm, fsmon, fsync, heartbeat, hostw, hwmon,
  maintenance, mtclog, pmon, public, rmon

mtce-guest:
- contains guest components: guest-server, guest-agent

Story: 2002829
Task: 22748

Change-Id: I9c7a9b846fd69fd566b31aa3f12a043c08f19f1f
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
2018-09-18 17:15:08 -04:00