The mtcAgent process sometimes segfaults while trying to fetch
the bmc password from a failing barbican process.
With that issue fixed the mtcAgent sends the bmc access
credentials to the hardware monitor (hwmond) process which
then segfaults for a similar reason.
In cases where the process does not segfault but also does not
get a bmc password, the mtcAgent will flood its log file.
This update
1. Prevents the segfault case by properly managing acquired
json-c object releases. One improper release was in the mtcAgent and
another in the hardware monitor (hwmond).
The json_object_put object release api should only be called
against objects that were created with very specific apis.
See new comments in the code.
2. Avoids the log flooding error case by performing a password size
check rather than assuming the password is valid following the
secret payload receive stage.
3. Simplifies the secret fsm and its error and retry handling.
4. Deletes useless creation and release of a few unused json
objects in the common jsonUtil and hwmonJson modules.
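The release rule in item 1 can be illustrated with a small stand-in model.
In json-c, json_tokener_parse and the json_object_new_* calls return a
reference the caller owns, while json_object_object_get_ex returns a borrowed
pointer that still belongs to the parent, so only the owned root should be
passed to json_object_put. The sketch below does not use json-c itself; the
Node type and the node_* helpers are hypothetical and only mimic that
ownership contract so the pattern can be checked without the library.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a reference-counted JSON object (NOT json-c).
struct Node {
    int refs;
    std::string text;
    std::map<std::string, Node*> children;  // children are owned by the parent
};

static int g_live_nodes = 0;  // counts live nodes so leaks are visible

// Owning constructor: the caller gets one reference and must node_put() it.
// Plays the role of json_tokener_parse() / json_object_new_*().
Node* node_new(const std::string& text) {
    Node* n = new Node();
    n->refs = 1;
    n->text = text;
    ++g_live_nodes;
    return n;
}

// Release one owned reference; frees the node and its children at zero.
// Plays the role of json_object_put().
void node_put(Node* n) {
    assert(n != nullptr && n->refs > 0);
    if (--n->refs > 0) return;
    for (auto& kv : n->children) node_put(kv.second);
    --g_live_nodes;
    delete n;
}

// Transfers ownership of 'child' to 'parent'; the caller must not put it.
void node_add(Node* parent, const std::string& key, Node* child) {
    auto it = parent->children.find(key);
    if (it != parent->children.end()) node_put(it->second);  // drop old value
    parent->children[key] = child;
}

// Borrowed lookup: no refcount change, so the caller must NOT node_put() it.
// Plays the role of json_object_object_get_ex().
bool node_get_ex(const Node* parent, const std::string& key, Node** out) {
    auto it = parent->children.find(key);
    if (it == parent->children.end()) return false;
    *out = it->second;
    return true;
}

// Correct pattern: copy the value out of the borrowed child, then release
// only the root that this function owns.
std::string parse_and_extract_password(const std::string& payload) {
    Node* root = node_new("");                     // owned: must be released
    node_add(root, "payload", node_new(payload));  // ownership moves to root

    std::string password;
    Node* child = nullptr;
    if (node_get_ex(root, "payload", &child)) {
        password = child->text;  // copy before the root is released
    }
    // No node_put(child): it is borrowed, and releasing it here would lead
    // to a double free when the root releases its children.
    node_put(root);
    return password;
}
```

Calling parse_and_extract_password("s3cret") returns the copied string and
leaves g_live_nodes at zero, which is the same property the soak tests below
check for the real code: no leak and no double release.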
Note: This update temporarily disables sensor and sensorgroup
suppression support for the debian hardware monitor while
a suppression type fix in sysinv is being investigated.
Test Plan:
PASS: Verify success path bmc password secret fetch
PASS: Verify secret reference get error handling
PASS: Verify secret password read error handling
PASS: Verify 24 hr provision/deprov success path soak
PASS: Verify 24 hr provision/deprov error path soak
PASS: Verify no memory leak over success and failure path soaking
PASS: Verify failure handling stress soak with reduced retry delay
PASS: Verify blocking secret fetch success and error handling
PASS: Verify non-blocking secret fetch success and error handling
PASS: Verify secret fetch is set non-blocking
PASS: Verify success and failure path logging
PASS: Verify all of jsonUtil module manages object release properly
PASS: Verify hardware monitor sensor model creation, monitoring,
alarming and relearning. This test requires suppress
disable in order to create sensor groups in debian.
PASS: Verify both ipmi and redfish and switch between them with
just bm_type change.
PASS: Verify all above tests in CentOS
PASS: Verify over 4000 provision/deprovision cycles across both
failure and success path handling with no process
failures
Closes-Bug: 1975520
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: Ibbfdaa1de662290f641d845d3261457904b218ff
Trying to get the BMC password through barbican before
the ping succeeds leads to an early bmc access lost
failure that
1. produces a misleading bmc access lost failure log ;
bmc access had not even been established yet.
2. imposes a retry wait that delays re-establishing
bmc access and therefore overall sensor monitoring.
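The ordering described above, where the secret fetch starts only after the
ping succeeds, might look roughly like the sketch below. The BmcHost struct,
the BmcStage values and bmc_handler_step are hypothetical names made up for
illustration; they are not the mtcAgent handler code.

```cpp
#include <string>
#include <vector>

// Hypothetical stages of establishing bmc access for one host.
enum class BmcStage { PING, SECRET_FETCH, READY };

struct BmcHost {
    std::string hostname;
    BmcStage stage = BmcStage::PING;
    bool ping_ok = false;           // set by the ping monitor
    std::string password;           // filled in once the secret is read
    int secret_requests = 0;        // number of secret fetches started
    std::vector<std::string> logs;  // stands in for the process log
};

// One pass of the handler. 'secret_reply' is what the secret service
// returned on this pass; an empty string means nothing usable yet.
void bmc_handler_step(BmcHost& host, const std::string& secret_reply) {
    switch (host.stage) {
    case BmcStage::PING:
        // Until the ping succeeds there is no secret fetch, no
        // "access lost" log and no retry wait; just keep waiting.
        if (!host.ping_ok) return;
        host.logs.push_back(host.hostname + " bmc ping ok");
        host.stage = BmcStage::SECRET_FETCH;
        host.secret_requests++;
        break;
    case BmcStage::SECRET_FETCH:
        if (secret_reply.empty()) return;  // still waiting for the password
        host.password = secret_reply;
        host.stage = BmcStage::READY;
        break;
    case BmcStage::READY:
        break;
    }
}
```

With ping_ok false, repeated calls leave the host in the PING stage with
zero secret requests and an empty log. The log line is built from the host's
own hostname, in the spirit of the logging change listed below.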
This update also
1. adds hostname to some of the secretUtil API
interfaces so that logs are reported against the
correct host rather than always the current
controller hostname.
2. changes some success path logging to dlogs to
reduce log noise.
3. simplifies a ping ok log
Change-Id: Ib3b7de212294d6dc350ee17d363f4009b3b0dcb0
Story: 2005861
Task: 36595
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Use the OpenStack Barbican API to retrieve BMC passwords stored by SysInv.
See the SysInv commit for details on how the password is written to Barbican.
MTCE finds the corresponding secret by host uuid and retrieves the
secret payload associated with it. mtcSecretApi_get is used to find the
secret reference, based on a hostname. mtcSecretApi_read is used to
read the password using the reference found in the previous step.
Also did a little cleanup and removed old unused token handling code.
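The two-step lookup described above (find the secret reference, then read
the payload through that reference) can be sketched as follows. This message
does not show the signatures of mtcSecretApi_get and mtcSecretApi_read, so
the sketch uses hypothetical helpers and an in-memory FakeSecretStore in
place of the Barbican REST calls. The final empty-payload check is an
assumption about how a caller might validate the result.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the secret store (not Barbican).
struct FakeSecretStore {
    std::map<std::string, std::string> ref_by_host_uuid;  // uuid -> reference
    std::map<std::string, std::string> payload_by_ref;    // reference -> payload
};

enum class FetchResult { OK, NO_REFERENCE, READ_FAILED, EMPTY_PASSWORD };

// Step 1: look up the secret reference for a host.
bool secret_get_reference(const FakeSecretStore& store,
                          const std::string& host_uuid,
                          std::string& reference) {
    auto it = store.ref_by_host_uuid.find(host_uuid);
    if (it == store.ref_by_host_uuid.end()) return false;
    reference = it->second;
    return true;
}

// Step 2: read the payload using the reference found in step 1.
bool secret_read_payload(const FakeSecretStore& store,
                         const std::string& reference,
                         std::string& payload) {
    auto it = store.payload_by_ref.find(reference);
    if (it == store.payload_by_ref.end()) return false;
    payload = it->second;
    return true;
}

// Chains both steps and reports which one failed, so the caller can log a
// single specific error instead of assuming the password is valid.
FetchResult fetch_bmc_password(const FakeSecretStore& store,
                               const std::string& host_uuid,
                               std::string& password) {
    password.clear();
    std::string reference;
    std::string payload;
    if (!secret_get_reference(store, host_uuid, reference))
        return FetchResult::NO_REFERENCE;
    if (!secret_read_payload(store, reference, payload))
        return FetchResult::READ_FAILED;
    if (payload.empty())  // size check: an empty payload is not a password
        return FetchResult::EMPTY_PASSWORD;
    password = payload;
    return FetchResult::OK;
}
```

Returning a distinct result per step keeps the reference lookup and the
payload read independently retryable. The output password is left empty on
every failure path.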
Depends-On: I7102a9662f3757c062ab310737f4ba08379d0100
Change-Id: I66011dc95bb69ff536bd5888c08e3987bd666082
Story: 2003108
Task: 27700
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>