Commit Graph

24 Commits

Author SHA1 Message Date
Eric MacDonald 14bb67789e Add pxeboot network mtcAlive messaging to Maintenance
The introduction of the new pxeboot network requires that maintenance
verify and report on messaging failures over that network.

Towards that, this update introduces periodic mtcAlive messaging
between the mtcAgent and mtcClient.
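
A minimal sketch of the idea, assuming a UDP datagram carrying a
sequence number ; the function name and message format here are
illustrative, not the actual mtce implementation.

    // Periodic pxeboot mtcAlive sender (illustrative sketch).
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <cstdio>
    #include <cstring>

    void send_pxeboot_mtcalive(int sock, const char* peer_ip, int port,
                               unsigned int& seq)
    {
        struct sockaddr_in dst;
        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(port);
        inet_pton(AF_INET, peer_ip, &dst.sin_addr);

        char msg[64];
        // the sequence number enables out-of-sequence detection and logging
        snprintf(msg, sizeof(msg), "mtcAlive seq=%u", seq++);
        sendto(sock, msg, strlen(msg), 0, (struct sockaddr*)&dst, sizeof(dst));
    }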

Test Plan:

PASS: Verify install and provision each system type with a mix
             of networking modes ; ethernet, bond and vlan
             - AIO SX, AIO DX, AIO DX plus
             - Standard System 2+1
             - Storage System 2+1+1
PASS: Verify feature with physical on management interface
PASS: Verify feature with vlan on management interface
PASS: Verify feature with bonded management interface
PASS: Verify feature with bonded vlans on management interface
PASS: Verify in bonded cases handling with 2, 1 or no slaves found
PASS: Verify mgmt-combined or separate cluster-host network
PASS: Verify mtcClient pxeboot interface address learning
             - for worker and storage nodes       ; dhcp leases file
             - for controller nodes before unlock ; dhcp leases file
             - for controller nodes after unlock  ; static from ifcfg
             - from controller within 10 seconds of process restart
PASS: Verify mtcAgent pxeboot interface address learning from
             dnsmasq.hosts file
PASS: Verify pxeboot mtcAlive initiation, handling, loss detection
             and recovery
PASS: Verify success and failure handling of all new pxeboot ip
             address learning functions ;
             - dhcp - all system node installs.
             - dnsmasq.hosts - active controller for all hosts.
             - interfaces.d - controller's mtcClient pxeboot address.
             - pxeboot req mtcAlive - mtcAgent mtcAlive request message.
PASS: Verify mtcClient pxeboot network 'mtcAlive request' and 'reboot'
             command handling for ethernet, vlan and bond configs.
PASS: Verify mtcAlive sequence number monitoring, out-of-sequence
             detection, handling and logging.
PASS: Verify pxeboot rx socket binding and non-blocking attribute
PASS: Verify mtcAgent handling stress soaking of sustained incoming
             500+ msgs/sec ; batch handling and logging.
PASS: Verify mtcAgent and mtcClient pxeboot tx and rx socket messaging,
             failure recovery handling and logging.
PASS: Verify pxeboot receiver is not set up on the oam interface on
             controller-0 first install until after initial config
             complete.

Regression:

PASS: Verify mtcAgent/mtcClient online and offline state management
PASS: Verify mtcAgent/mtcClient command handling
      - over management network
      - over cluster-host network
PASS: Verify mtcClient interface chain log for all iface types
      - bond    : vlan123 -> pxeboot0 (802.3ad 4) -> enp0s8 and enp0s9
      - vlan    : vlan123 -> enp0s8
      - ethernet: enp0s8
PASS: Verify mtcAgent/mtcClient handling and logging including debug
      logging for standard operations
      - node install and unlock
      - node lock and unlock
      - node reinstall, reboot, reset
PASS: Verify graceful recovery handling of heartbeat loss failure.
      - node reboot
      - management interface down
PASS: Verify systemcontroller and subcloud install with dc-libvirt
PASS: Verify no log flooding, coredumps, memory leaks

Story: 2010940
Task: 49541
Change-Id: Ibc87b85e3e0e07c3b8c40b5291bd3372506fbdfb
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2024-03-28 15:28:27 +00:00
Eric MacDonald 191c0aa6a8 Add a wait time between http request retries
Maintenance interfaces with sysinv, sm and the vim using http requests.
Request timeouts have an implicit delay between retries. However,
command failures or outright connection failures don't.

This has only become obvious in mtce's communication with the vim,
where a process startup timing change leads to the vim not being
ready to handle commands by the time the mtcAgent starts sending
them, following a platform services group startup by sm.

This update adds a 10 second http retry wait as a configuration option
to mtc.conf. The mtcAgent loads this value at startup and uses it
in a new HTTP__RETRY_WAIT state of the http request work FSM.

The number of retries remains unchanged. This update is only forcing
a minimum wait time between retries, regardless of cause.
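
A minimal sketch of the added wait state, with hypothetical state and
member names ; the real FSM states and mtc.conf parsing differ.

    #include <ctime>

    enum http_stage { HTTP__SEND, HTTP__RETRY_WAIT };

    struct http_request {
        http_stage stage;
        time_t     wait_until;
        int        retry_wait_secs; // loaded from mtc.conf at startup ; e.g. 10
    };

    // Any retry cause (timeout, command failure, connection failure)
    // now enters the wait state rather than resending immediately.
    void http_retry(http_request& req)
    {
        req.wait_until = time(nullptr) + req.retry_wait_secs;
        req.stage = HTTP__RETRY_WAIT;
    }

    void http_fsm_run(http_request& req)
    {
        if ((req.stage == HTTP__RETRY_WAIT) && (time(nullptr) >= req.wait_until))
            req.stage = HTTP__SEND; // retry count handling is unchanged
    }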

Failure path testing was done using Fault Insertion Testing (FIT).

Test Plan:

PASS: Verify the reported issue is resolved by this update.
PASS: Verify http retry config value load on process startup.
PASS: Verify the updated value is used following a process SIGHUP.
PASS: Verify default value if new mtc.conf config value is not found.
PASS: Verify http connection failure http retry handling.
PASS: Verify http request timeout failure retry handling.
PASS: Verify http request operation failure retry handling.

Regression:

PASS: Build and install ISO - Standard and AIO DX.
PASS: Verify http failures do not fail a lock operation.
PASS: Verify host unlock fails if its http done queue shows failures.
PASS: Verify host swact.
PASS: Verify handling of random and persistent http errors involving
      the need for retries.

Closes-Bug: 2047958
Change-Id: Icc758b0782be2a4f2882efd56f5de1a8dddea490
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2024-02-07 20:33:01 +00:00
Jagatguru Prasad Mishra 1210ed450a Failure case handling of LUKS service
luks-fs-mgr service creates and unseals the LUKS volume used to store
keys/secrets. This change handles the failure case where this essential
service is inactive. It introduces a new alarm, LUKS_ALARM_ID, which is
raised if the service is inactive, implying that there is an issue in
creating or unsealing the LUKS volume.
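
A minimal sketch of such an audit, assuming a systemctl query and
simple alarm wrappers ; the raise_alarm/clear_alarm helpers and the
exact check are illustrative, not the actual implementation.

    #include <cstdlib>

    void raise_alarm(const char* id); // assumed wrappers around the FM API
    void clear_alarm(const char* id);

    void audit_luks_service(void)
    {
        static bool alarmed = false;

        // 'systemctl is-active --quiet' exits 0 only when the unit is active
        bool active = (system("systemctl is-active --quiet luks-fs-mgr") == 0);

        if (!active && !alarmed) {
            raise_alarm("200.016"); // volume create/unseal issue implied
            alarmed = true;
        } else if (active && alarmed) {
            clear_alarm("200.016");
            alarmed = false;
        }
    }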

Test Plan:
PASS" build-pkgs -c -p mtce-common
PASS: build-pkgs -c -p mtce
PASS: build-image
PASS: AIO-SX bootstrap with luks volume status active
PASS: AIO-DX bootstrap with volume status active
PASS: Standard setup with 2 controllers and 1 compute node with luks
      volume status active. There should not be any alarm and node
      status should be unlocked/enabled/available.
PASS: AIO-DX node enable failure on the controller where luks volume
      is inactive. Node availability should be failed. A critical
      alarm with id 200.016 should be displayed with 'fm alarm-list'
PASS: AIO-SX node enable failure on the controller-0. Node availability
      should be failed. A critical alarm with id 200.016 should be
      displayed with 'fm alarm-list'
PASS: Standard- node enable failure on the node (controller-0,
      controller-1, storage-0, compute-1). Node availability
      should be failed. A critical alarm with id 200.016 should be
      displayed with 'fm alarm-list' for the failed host.
PASS: AIO-DX In service volume inactive should be detected and a
      critical alarm should be raised with ID 200.016. Node
      availability should be changed to degraded.
PASS: AIO-SX In service volume inactive status should be detected
      and a critical alarm should be raised with ID 200.016. Node
      availability should be changed to degraded.
PASS: Standard ( 2 controller, 1 storage, 1 compute) In service
      volume inactive status should be detected and a
      critical alarm should be raised with ID 200.016. Node
      availability should be changed to degraded.
PASS: AIO-DX In service: If volume becomes active and a LUKS alarm
      is active, alarm should be cleared. Node availability should
      be changed to available.
PASS: AIO-SX In service: If volume becomes active and a LUKS alarm is
      active, alarm should be cleared. Node availability should be
      changed to available.
PASS: Standard ( 2 controller, 1 storage, 1 compute) In service:
      If volume becomes active and a LUKS alarm is active, alarm
      should be cleared. Node availability should be changed to
      available.
PASS: AIO-SX, AIO-DX, Standard - If the in-test fails and node
      availability is 'failed', then after fixing the volume issue a
      lock/unlock should make the node available.

Story: 2010872
Task: 49108

Change-Id: I4621e7c546078c3cc22fe47079ba7725fbea5c8f
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
2023-12-06 00:34:02 -05:00
Eric MacDonald 79d8644b1e Add bmc reset delay in the reset progression command handler
This update solves two issues involving bmc reset.

Issue #1: A race condition can occur if the mtcAgent finds an
          unlocked-disabled or heartbeat failing node early in
          its startup sequence, say over a swact or an SM service
          restart, and needs to issue a one-time-reset. If at that
          point it has not yet established access to the BMC then
          the one-time-reset request is skipped.

Issue #2: When the issue #1 race condition does not occur, i.e.
          BMC access is established in time, the mtcAgent will
          issue its one-time reset to the node. If this occurs as
          a result of a crashdump then this one-time reset can
          interrupt the collection of the vmcore crashdump file.

This update solves both of these issues by introducing a bmc reset
delay following the detection and in the handling of a failed node
that 'may' need to be reset to recover from being network isolated.

The delay prevents the crashdump from being interrupted and removes
the race condition by giving maintenance more time to establish bmc
access required to send the reset command.

To handle significantly long bmc reset delay values this update
cancels the posted 'in waiting' reset if the target recovers online
before the delay expires.

It is recommended to use a bmc reset delay that is longer than a
typical node reboot time. This is so that in the typical case, where
there is no crashdump happening, we don't reset the node late in its
almost done recovery. The number of seconds remaining in the pending
reset countdown is logged periodically.

It can take upwards of 2-3 minutes for a crashdump to complete.
To avoid the double reboot, in the typical case, the bmc reset delay
is set to 5 minutes which is longer than a typical boot time.
This means that if the node recovers online before the delay expires
then the reset wasn't needed and is cancelled.

However, if the node is truly isolated or the shutdown sequence
hangs then, although the recovery is delayed a bit to accommodate
the crashdump case, the node is still recovered after the bmc reset
delay period. This could lead to a double reboot if the node's
recovery-to-online time is longer than the bmc reset delay.

This update implements this change by adding a new 'reset send wait'
phase to the existing reset progression command handler.

Some consistency driven logging improvements were also implemented.
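
A minimal sketch of the posted reset countdown and its cancellation,
with illustrative names ; the real handling runs inside the reset
progression FSM.

    #include <ctime>

    const int BMC_RESET_DELAY_SECS = 300; // default 5 mins ; > typical reboot

    struct posted_reset {
        bool   posted;
        time_t fire_at;
    };

    void post_bmc_reset(posted_reset& p)
    {
        p.posted = true;
        p.fire_at = time(nullptr) + BMC_RESET_DELAY_SECS;
    }

    // Called on each pass of the failed node handling.
    bool bmc_reset_due(posted_reset& p, bool node_is_online)
    {
        if (!p.posted) return false;
        if (node_is_online) { p.posted = false; return false; } // cancel 'in waiting' reset
        return (time(nullptr) >= p.fire_at); // countdown expired ; send reset
    }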

Test Plan:

PASS: Verify failed node crashdump is not interrupted by bmc reset.
PASS: Verify bmc is accessible after the bmc reset delay.
PASS: Verify handling of a node recovery case where the node does not
      come back before bmc_reset_delay timeout.
PASS: Verify posted reset is cancelled if the node goes online before
      the bmc reset delay and uptime shows less than 5 mins.
PASS: Verify reset is not cancelled if node comes back online without
      reboot before bmc reset delay and still seeing mtcAlive on one
      or more links. Handles the cluster-host only heartbeat loss case.
      The node is still rebooted with the bmc reset delay as backup.
PASS: Verify reset progression command handling, with and
      without reboot ACKs, with and without bmc
PASS: Verify reset delay defaults to 5 minutes
PASS: Verify reset delay change over a manual change and sighup
PASS: Verify bmc reset delay of 0, 10, 60, 120, 300 (default), 500
PASS: Verify host-reset when host is already rebooting
PASS: Verify host-reboot when host is already rebooting
PASS: Verify timing of retries and bmc reset timeout
PASS: Verify posted reset throttled log countdown

Failure Mode Cases:

PASS: Verify recovery handling of failed powered off node
PASS: Verify recovery handling of failed node that never comes online
PASS: Verify recovery handling when bmc is never accessible
PASS: Verify recovery handling cluster-host network heartbeat loss
PASS: Verify recovery handling management network heartbeat loss
PASS: Verify recovery handling both heartbeat loss
PASS: Verify mtcAgent restart handling finding unlocked disabled host

Regression:

PASS: Verify build and DX system install
PASS: Verify lock/unlock (soak 10 loops)
PASS: Verify host-reboot
PASS: Verify host-reset
PASS: Verify host-reinstall
PASS: Verify reboot graceful recovery (force and no force)
PASS: Verify transient heartbeat failure handling
PASS: Verify persistent heartbeat loss handling of mgmt and/or cluster networks
PASS: Verify SM peer reset handling when standby controller is rebooted
PASS: Verify logging and issue debug ability

Closes-Bug: 2042567
Closes-Bug: 2042571
Change-Id: I195661702b0d843d0bac19f3d1ae70195fdec308
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2023-11-02 20:58:00 +00:00
Eric MacDonald 7539d36c3f Prevent mtcClient from sending to uninitialized socket in AIO SX
The mtcClient will perform a socket reinit if it detects a socket
failure. The mtcClient also avoids setting up its controller-1
cluster network socket for the AIO SX system type ; because there
is no controller-1 provisioned.

Most AIO SX systems have the management/cluster networks set to
the 'loopback' interface. However, when an AIO SX system is set up
with its management and cluster networks on physical interfaces,
with or without vlan, the mtcAlive send message utility will try
to send to the uninitialized controller-1 cluster socket. This
leads to a socket error that triggers a socket reinitialization
loop which causes log flooding.

This update adds a check to the mtcAlive send utility to avoid
sending mtcAlive to controller-1 for the AIO SX system type where
there is no controller-1 provisioned ; no send, no error, no flood.

Since this update needed to add a system type check, it also
implements a system type definition rename from CPE to AIO.
Other related definitions and comments were also changed to make
the code base more understandable and maintainable.
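
A minimal sketch of the added guard, with illustrative enum and
function names.

    enum { SYSTEM_TYPE__AIO_SX, SYSTEM_TYPE__AIO_DX, SYSTEM_TYPE__STANDARD };
    enum { CONTROLLER_0, CONTROLLER_1 };

    bool transmit(int sock); // the existing send path (assumed)

    bool send_mtcalive_cluster(int system_type, int controller, int sock)
    {
        // AIO SX has no controller-1 ; skip the uninitialized socket
        if ((system_type == SYSTEM_TYPE__AIO_SX) && (controller == CONTROLLER_1))
            return true; // no send, no error, no reinit flood
        return transmit(sock);
    }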

Test Plan:

PASS: Verify AIO SX with mgmnt/clstr on physical (failure mode)
PASS: Verify AIO SX Install with mgmnt/clstr on 'lo'
PASS: Verify AIO SX Lock msg and ack over mgmnt and clstr
PASS: Verify AIO SX locked-disabled-online state
PASS: Verify mtcClient clstr socket error detect/auto-recovery (fit)
PASS: Verify mtcClient mgmnt socket error detect/auto-recovery (fit)

Regression:

PASS: Verify AIO SX Lock and Unlock (lazy reboot)
PASS: Verify AIO DX and DC install with pv regression and sanity
PASS: Verify Standard system install with pv regression and sanity

Change-Id: I658d33a677febda6c0e3fcb1d7c18e5b76cb3762
Closes-Bug: 1897334
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-04-21 10:20:10 -04:00
Eric MacDonald 031818e55b Add in-service test to clear stale config failure alarm
A configuration failure alarm can get stuck asserted if
that node experiences an uncontrolled reboot that recovers
without a configuration failure.

This update adds an in-service test that audits host health
while there is a configuration failure alarm raised and
clears that alarm if the failure condition goes away. This
could be the result of an in-service manifest that runs and
corrects the configuration, or of the node rebooting and coming
back up in a healthy (properly configured) state.

Also fixed a bug that was clearing the config alarm severity state
when a heartbeat clear event was received.

This update also goes a step further and introduces an
alarms state audit that detects and corrects maintenance
alarm state mismatches.
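
A minimal sketch of the audit logic, assuming simple FM wrappers and
an illustrative config alarm id ; the real audit compares maintenance
state against the active alarm list queried from fm.

    void raise_alarm(const char* id); // assumed wrappers around the FM API
    void clear_alarm(const char* id);

    void config_alarm_audit(bool mtce_raised, bool fm_raised, bool host_healthy)
    {
        if (fm_raised && host_healthy)
            clear_alarm("200.011"); // stale alarm ; failure condition is gone
        else if (mtce_raised && !fm_raised)
            raise_alarm("200.011"); // mismatch ; re-assert maintenance's view
    }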

Test Plan:

PASS: Verify the add handler loads config alarm state
PASS: Verify in-service test clears stale config alarm
PASS: Verify in-service test acts on new config failure
      ... degrade - active controller
      ... fail    - other hosts
PASS: Verify audit fixes mtce alarm state mismatches
PASS: Verify audit handles fm not running case
PASS: Verify audit handling behavior with valid alarm cases
PASS: Verify locked alarm management over process restart
PASS: Verify audit only logs active alarms list changes
PASS: Verify audit runs for both locked/unlocked nodes
PASS: Verify update as a patch

Regression:

PASS: Verify enable sequence config failure handling
PASS: ... active controller     - recoverable degrade
PASS: ... other nodes           - threshold fail
PASS: ... auto recovery disable - config failure
PASS: Verify mtcAgent process logging
PASS: Verify heartbeat handling and alarming
PASS: Verify Standard system install
PASS: Verify AIO system install

Change-Id: If9957229810435e9faeb08374f2b5fbcb5b0f826
Closes-Bug: 1918195
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-03-29 16:39:52 -04:00
Eric MacDonald 9ab726b0eb Add support for peer controller reset via mtcClient
This update adds the ability for SM to passively
request the mtcClient to BMC reset its peer controller
as a means to recover a severely loaded active controller.

To do this the mtcAgent is modified to keep each controller's
mtcClient updated with the BMC info of its peer.

The mtcClient is modified to audit for the SM signal
and then, when asserted, issue a BMC reset of its peer
controller using an ipmitool system call.

The ability to command the peer mtcClient to 'sync'
prior to the BMC reset is implemented but configured
disabled for now.
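
A minimal sketch of the reset itself as an ipmitool system call ;
the option set and password file handling are illustrative.

    #include <cstdio>
    #include <cstdlib>

    int bmc_reset_peer(const char* bm_ip, const char* bm_user, const char* pw_file)
    {
        char cmd[256];
        snprintf(cmd, sizeof(cmd),
                 "ipmitool -I lanplus -H %s -U %s -f %s chassis power reset",
                 bm_ip, bm_user, pw_file);
        return system(cmd); // 0 on success
    }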

Change-Id: Ibe4c8aaa3a980cbe5f34c3e22f015698a6453c1a
Partial-Bug: #1895350
Co-Authored-By: Bin.Qian@windriver.com
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-01-14 16:44:14 -05:00
Eric MacDonald 2210c71216 Fix Mtce Heartbeat period recovery on MNFA Exit
When Multi-Node Failure Avoidance (MNFA) occurs,
maintenance commands the Heartbeat Agent to slow
down by a factor of 4.

The rate recovery following a MNFA is not occurring.

Update https://review.opendev.org/#/c/701057 made
a condition check change that introduced this issue
by requiring mnfa_timeout to be non-zero before an
attempt is made to recover heartbeat period following
MNFA recovery.

This update switches that condition check to use the more
specific mnfa_backoff state tracker and, because MNFA is a
global maintenance mode feature rather than a node specific
feature, moves the recovery check code from the node level
fsm into a mnfa_recovery_handler called in the main select
loop.
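
A minimal sketch of the relocated check, with illustrative names ;
the handler simply restores the configured heartbeat period once
MNFA is no longer active.

    enum { HBS_PERIOD_RESTORE };
    void send_hbs_command(int cmd); // assumed messaging helper

    static bool mnfa_backoff = false; // set when the period was slowed by 4x

    // Called from the main select loop ; MNFA is a global mode, not per node.
    void mnfa_recovery_handler(bool mnfa_active)
    {
        if (mnfa_backoff && !mnfa_active) {
            send_hbs_command(HBS_PERIOD_RESTORE);
            mnfa_backoff = false;
        }
    }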

Test Plan:

PASS: Verify MNFA handling/recovery with mnfa_timeout!=0
             that expires.
PASS: Verify MNFA handling/recovery when mnfa_timeout!=0
             but before the timeout expires.
PASS: Verify MNFA handling/recovery when mnfa_timeout=0
PASS: Verify MNFA backoff rate recovery over mtcAgent
             process restart.
PASS: Verify MNFA backoff rate is sent to hbsAgent if
             hbsAgent restarts while MNFA is active.

Change-Id: I8da5a000ab503692c7cfa620233ed8aa772c50f8
Closes-Bug: #1893212
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-09-18 01:34:11 +00:00
Eric MacDonald 4267d46786 Force heartbeat period reset on mtcAgent process startup
In the case of a Multi Node Failure Avoidance (MNFA) event,
mtcAgent (Maintenance) sends a 'back-off' request to the
hbsAgent (Heartbeat) while there appears to be a networking
issue that affects a number of hosts.

This 'back-off' request tells the heartbeat service to slow
down by a factor of 4; what was say a 100 ms period would
change to a 400 ms period while in MNFA mode. When the MNFA
condition resolves the mtcAgent sends a heartbeat 'recovery'
command to the heartbeat service telling it to restore the
heartbeat interval back to the configured interval.

However, if the mtcAgent process is 'restarted' while in
MNFA mode, the knowledge that the heartbeat service was
running at a reduced rate is lost and not restored.

This update forces the heartbeat rate to be set back to the
configured rate when the mtcAgent starts up.

Note that MNFA mode is not, and should not be, preserved over an
mtcAgent process restart. If an MNFA level failure still exists
after the restart it will simply be re-detected and re-entered.

Change-Id: I254ef86c453cb2d40cbeda859bd7477ac28942bc
Closes-Bug: 1884556
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-06-23 09:33:22 -04:00
Eric MacDonald 0826882308 Add mtcAgent socket initialization failure retry handling.
The main maintenance process (mtcAgent) exits on a process start-up
socket initialization failure. SM restarts the failed process within
seconds and will swact if the second restart also fails. From startup
to swact can be as quick as 4 seconds. This is too short to handle a
collision with a manifest.

This update adds a number of socket initialization retries to extend
the time the process has to resolve socket initialization failures by
giving the collided manifest time to complete between retries.

The number of retries and the inter-retry wait time are calibrated to
ensure a persistently failing mtcAgent process exits in under 40 seconds.

This is to ensure that SM is able to detect and swact away from a
persistently failing maintenance process while also giving the process
a few tries to resolve on its own.
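
A minimal sketch of the retry loop ; the retry count and wait time
shown are illustrative stand-ins for the calibrated values.

    #include <unistd.h>

    bool socket_init(void); // the existing init path (assumed)

    const int SOCKET_INIT_RETRIES   = 5;
    const int SOCKET_INIT_WAIT_SECS = 5; // total stays under ~40 seconds

    bool socket_init_with_retries(void)
    {
        for (int i = 0; i < SOCKET_INIT_RETRIES; ++i) {
            if (socket_init())
                return true; // success ; a colliding manifest likely finished
            sleep(SOCKET_INIT_WAIT_SECS);
        }
        return false; // still failing ; exit so SM can detect and swact
    }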

Test Plan:

PASS: Verify socket init failure thresholded retry handling
      with no, persistent and recovered failure conditions.
PASS: Verify swact if socket init failure is persistent
PASS: Verify no swact if socket failure recovers after first exit
PASS: Verify no swact if socket failure recovers over init retry
PASS: Verify an hour long soak of continuous socket open/close retry

Change-Id: I3cb085145308f0e920324e22111f40bdeb12b444
Closes-Bug: 1869192
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-04-01 19:24:22 +00:00
Eric MacDonald c4b8171ddd Refactor BMC provisioning in Maintenance
The current mechanism used to preserve the learned bmc protocol in
the filesystem on the active controller is problematic over swact.

This update removes the file storage method in favor of preserving
the learned protocol in the system inventory database as a key/value
pair at the host level in the already existing mtce_info database field.

The specified or learned bmc access protocol is then shared with the
hardware monitor through inter-daemon maintenance messaging.

This update refactors bmc provisioning to accommodate bmc protocol
selection at the host rather than system level. Towards that this
update removes system level bmc_access_method selection in favor of
host level selection through bm_type. A bm_type of 'bmc' specifies
that the bmc access protocol for that host be learned. This has the
effect of making it the same as what is delivered today but without
support for changing it at the system level.

A system inventory update will be delivered shortly that enables bmc
access protocol selection at the host level. That update allows the
customer to specify the bmc access protocol at the host level to be
either dynamic (aka learned) or to only use 'redfish' or 'ipmi'.
That system inventory update delivers that information to maintenance
through bm_type via bmc provisioning. Until that update is delivered
bm_type always comes in as 'bmc', which gets interpreted as 'dynamic'
to maintain the existing configuration.

The following additional issues were also fixed in this update.

1. The nodeTimers module defaults the 'ring' member of timers that are
   not running to false but should be true.

2. Added a pingUtil_restart function to facilitate quicker sensor
   monitoring following provisioning changes and bmc access failures.

3. Enhanced the hardware monitor sensor grouping filter to accommodate
   non-standard Redfish readout labelling so that more sensors fall
   into the existing canned groups ; leads to more monitored sensors.

4. Added a 'http security mode' to hardware monitor messaging. This
   defaults to https as that is all that is supported by the Redfish
   implementation today. This field can be used to specify non-secure
   'http' mode in the future when that gets implemented.

5. Ensure the hardware monitor performs a bmc password re-fetch on every
   provisioning change.

Test Plan:

PASS: Verify bmc access protocol store/fetched from the database (mtce_info)
PASS: Verify inventory push from mtcAgent to hwmond over mtcAgent restart
PASS: Verify inventory push from mtcAgent to hwmond over hwmon restart
PASS: Verify bmc provisioning of ipmi and redfish servers
PASS: Verify learned bmc protocol persists over process restart and swact
PASS: Verify process startup with protocol already learned

Hardware Monitor:

PASS: Verify bmc_type=ipmi handling ; protocol forced to ipmi ; (re)prov
PASS: Verify bmc_type=redfish handling ; protocol forced to redfish ; (re)prov
PASS: Verify bmc_type=dynamic handling ; protocol is learned then persisted
PASS: Verify sensor model delete and relearn over ip address change
PASS: Verify sensor model delete and relearn over bm_type change
PASS: Verify sensor model not relearned over username change
PASS: Verify bm pw is re-fetched over any (re)provisioning change
PASS: Verify bmc re-provisioning soak (test-bmc-reprovisioning.sh 50 loops)
PASS: Verify protocol change handling, file cleanup, model recreation
PASS: Verify End-2-End behavior for bm_type change from redfish to ipmi
PASS: Verify End-2-End behavior for bm_type change from ipmi to redfish
PASS: Verify End-2-End behavior for bm_type change from redfish to dynamic
PASS: Verify End-2-End behavior for bm_type change from ipmi to dynamic
PASS: Verify End-2-End behavior for bm_type change from dynamic to ipmi
PASS: Verify End-2-End behavior for bm_type change from dynamic to redfish
PASS: Verify sensor model creation waits for server power to be on
PASS: Verify sensor relearn by provisioning change during model creation. (soak)

Regression:

PASS: Verify host power off and on.
PASS: Verify BMC access alarm handling (assert and clear)
PASS: Verify mtcAgent and hwmond logs add value
PASS: Verify no core dumps / seg faults.
PASS: Verify no mtcAgent and hwmond memory leak.
PASS: Verify delete of BMC provisioned host
PASS: Verify sensor monitoring, alarming, degrade and then clear cycle
PASS: Verify static analysis report of changed modules.
PASS: Verify host level bm_type=bmc functions as would dynamic selection
PASS: Verify batch provisioning and deprovisioning (7 nodes)
PASS: Verify batch provisioning to different protocol (5 nodes)
PASS: Verify handling of flaky Redfish responses

PEND: Verify System Install

Change-Id: Ic224a9c33e0283a611725b33c90009132cab3382
Closes-Bug: #1853471
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-12-09 09:39:49 -05:00
Eric MacDonald 675f49d556 Add mtcAgent support for sm_node_unhealthy condition
When heartbeat over both networks fail, mtcAgent
provides a 5 second grace period for heartbeat to
recover before failing the node.

However, when heartbeat fails over only one of the
networks (management or cluster) the mtcAgent does
not honour that 5 second grace period ; a bug.

When it comes to peer controller heartbeat failure
handling, SM needs that 5 second grace period to handle
swact before mtcAgent declares the peer controller as
failed, resets the node and updates the database.

This update implements a change that forces a 2 second
wait time between each fast enable and fixes the fast
enable threshold count to be the intended 3 retries.
This ensures that at least 5 seconds, actually 6 in
the case of single network heartbeat loss, passes
before declaring the node as failed.

In addition to that, a special condition is added to
detect and stop work if the active controller is
sm_node_unhealthy. We don't want mtcAgent to make
any database updates while in this failure mode.
This gives SM the time to handle the failure
according to the system's controllers' high
availability handling feature.
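
A minimal sketch of the retry spacing and threshold, with illustrative
names ; the real logic lives in the node FSM's fast enable handling.

    #include <ctime>

    const int FAST_ENABLE_RETRIES   = 3; // the intended threshold
    const int FAST_ENABLE_WAIT_SECS = 2; // forced wait between fast enables

    struct node {
        int    fast_enable_count;
        time_t last_fast_enable;
    };

    // At least 3 retries x 2 secs elapse before a node is declared failed,
    // preserving SM's 5 second grace period.
    bool fast_enable_permitted(node& n)
    {
        if (n.fast_enable_count >= FAST_ENABLE_RETRIES)
            return false; // threshold reached ; declare the node failed
        if ((time(nullptr) - n.last_fast_enable) < FAST_ENABLE_WAIT_SECS)
            return false; // too soon ; wait out the 2 second spacing
        n.fast_enable_count++;
        n.last_fast_enable = time(nullptr);
        return true;
    }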

Test Plan:

PASS: Verify mtcAgent behavior on set and clear of
      SM node unhealthy state.
PASS: Verify SM has at least 5 seconds to shut down
      mtcAgent when heartbeat to peer controller fails
      for one or both networks.
PASS: Test real case scenario with link pull.
PASS: Verify logging in presence of real failure condition.

Change-Id: I8f8d6688040fe899aff6fc40aadda37894c2d5e9
Closes-Bug: 1847657
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-10-15 15:24:34 -04:00
Eric MacDonald df9343b0cc Add redfish power/reset/reinstall bmc support to maintenance
This update delivers redfish support for Power-On/Off, Reset
and Netboot Reinstall handling to maintenance.

Test Plan: (Testing Continues)

PASS: Verify Redfish Power-Off action handling
PASS: Verify Redfish Power-On action handling
PASS: Verify Redfish Reset action handling
PASS: Verify compute Redfish Reinstall action handling from controller-0
PASS: Verify compute Redfish Reinstall action handling from controller-1
PASS: Verify Redfish Power-Off Action failure handling
PASS: Verify Redfish Power-On action failure handling
PASS: Verify Redfish Reset action failure handling
PASS: Verify Redfish Re-Install action failure handling
PASS: Verify Reset progression cycle does not leak memory.
PASS: Verify bmc_handler failure handling does not leak memory.
PASS: Verify Inservice BMC access (ping) failure and recovery handling.
PASS: Verify BMC access failure alarm handling
PASS: Verify BMC provisioning and deprovisioning soak (redfish - wolfpass)
PASS: Verify BMC provisioning and deprovisioning does not leak memory.
PASS: Verify BMC provisioning handling with bad ip and/or bad username
PASS: Verify BMC reprovisioning to same protocol
PASS: Verify BMC reprovisioning from ipmi host to redfish host
PASS: Verify BMC reprovisioning from redfish host to ipmi host
PASS: Verify mixed protocol support in same lab
PASS: Verify mixed server support in same lab
PASS: Verify Large System Install with BMCs provisioned (wp8-12)
PASS: Verify bmc access method (learn,ipmi,redfish) learned from mtc.init
PASS: Verify Swact with BMCs provisioned.
PASS: Verify no segfaults.
PASS: Verify AIO System Install in lab that supports redfish (WC3-6, WP8-12, Dell 720 3-7)
PASS: Verify AIO Simplex Install with Redfish Support (SM1, SM3)
PASS: Verify AIO Duplex Install with Redfish Support (SM 5-6, Dell 720 1-2)

Useability:

PASS: Verify handling of reprovisioning BMC between hosts that support
             different protocols.
PASS: Verify handling of reprovisioning ip address to host that leads to a
             different protocol select.
PASS: Verify manual relearn handling to recover from errors that result from
             the above case.
PASS: Verify host BMC deprovisioning handling and cleanup.
PASS: Verify sensor monitoring.
PASS: Verify fault insertion for both protocols and action handling.
PASS: Verify protocol select handover.
PASS: Verify hwmond sticks with a selected protocol once a sensor model
             has been created using that protocol.
PASS: Verify handling of missing bmc_access_method configuration select.
PASS: Verify inservice bmc_access_method service parameter modification handling.

Regression:

PASS: Verify redfish BMC info query logging.
PASS: Verify sensor monitoring and alarming still works.
PASS: Verify all power/reset/netboot commands for IPMI
PASS: Verify reprovisioning soak of Wolfpass servers
PASS: Verify reprovisioning soak of SM servers

Depends-on: https://review.opendev.org/#/c/679178/
Change-Id: I984057e04d7426e37d675cf4d334a4e35419f2e8
Story: 2005861
Task: 35826
Task: 36606
Task: 36467
Task: 36456
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-09-26 15:59:35 -04:00
Eric MacDonald 804ec52227 Add redfish support detection to maintenance
This update

1. Refactors some of the common maintenance ipmi
   definitions and utilities into a more generic
   'bmcUtil' module to reduce code duplication and improve
   code reuse with the introduction of a second
   bmc communication protocol ; redfish.

2. Creates a new 'redFishUtil' module similar to the existing
   'ipmiUtil' module but in support of common redfish
   utilities and definitions that can be used by both
   maintenance and the hardware monitor.

3. Moves the existing 'mtcIpmiUtil' module to a more common
   'mtcBmcUtil' and renames the 'ipmi_command_send/recv' to
   the more generic 'bmc_command_send/recv' which are enhanced
   to support both ipmi and redfish bmc communication methods.

4. Renames the bmc info collection and connection monitor ;
   'bm_handler' to 'bmc_handler' and adds support necessary
   to learn if a host's bmc supports redfish.

5. Renames the existing 'mtcThread_ipmitool' to a more common
   'mtcThread_bmc' and adds redfishtool support for the now common
   set of bmc thread commands, including the new redfishtool bmc
   query, aka 'redfish root query', used to detect if a host's bmc
   supports redfish.

   Note: This aspect is the primary feature of this update.

         Namely the ability to detect and print a log indicating
         if a host's bmc supports redfish.
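
A minimal sketch of the detection idea : query the Redfish service
root (/redfish/v1) and treat a good response as 'supports redfish'.
The redfishtool invocation and options are illustrative.

    #include <cstdio>
    #include <cstdlib>

    bool bmc_supports_redfish(const char* bm_ip, const char* bm_user,
                              const char* bm_pw)
    {
        char cmd[256];
        snprintf(cmd, sizeof(cmd),
                 "redfishtool -r %s -u %s -p %s root > /dev/null 2>&1",
                 bm_ip, bm_user, bm_pw);
        return (system(cmd) == 0); // zero exit ; the service root answered
    }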

Test Plan:

PASS: Verify sensor monitoring and alarming still works.
PASS: Verify power-off command handling.
PASS: Verify power-on command handling.
PASS: Verify reset command handling.
PASS: Verify reinstall (netboot) command handling.
PASS: Verify logging when redfish is not supported.
PASS: Verify logging when redfish is supported.
PASS: Verify ipmitool is used regardless of redfish support.
PASS: Verify mtce thread error handling for both protocols.

Change-Id: I72e63958f61d10f5c0d4a93a49a7f39bdd53a76f
Story: 2005861
Task: 35825
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-08-19 14:03:37 +00:00
Eric MacDonald 62532a7eac Fix maintenance cluster-host messaging
Maintenance's success path messaging does not depend on cluster
network messaging. However, there are a number of failure mode
cases that do depend on cluster network messaging to properly
diagnose failures and offer higher availability handling for some
failure cases.

For instance, when the management interface goes down, remote hosts
can be isolated without cluster network messaging. Being able to
command-reboot a host over the cluster-host network offers higher
availability.

Maintenance is designed to use the cluster network, if provisioned, as a
backup path for mtcAlive, node locked, reboot and several other commands
and acknowledgements.

Unfortunately, it was recently observed that maintenance is using
the 'nfs-controller' label to resolve cluster network addressing
which resolves to management network IPs. As a result all messages
intended to be going over the cluster-host network are instead just
redundant management network messages.

During debug of this issue several additional cluster network
messaging related issues were observed and fixed.

This update implements the following fixes

1. since there is no floating address for the cluster network the
   mtcClient was modified to send messages to both controllers where
   only the active controller will be listening and acting (see the
   sketch after this list).
2. fixes the port number mtce listens on for cluster-host network messages
3. fixes the port number mtce sends cluster-host network messages to.
4. mtcAlive messages are also sent on provisioned cluster network.
5. locked state notifications and acks sent on provisioned cluster network.
6. reboot request and acks sent on provisioned cluster network.
7. fixed command acknowledgement messaging.
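
A minimal sketch of fix 1, with illustrative names ; since only the
active controller is listening, sending to both is harmless.

    const int CLUSTER_PORT = 2118; // port value illustrative
    void udp_send(const char* host, int port, const void* buf, int len); // assumed

    void send_cluster_msg(const void* buf, int len)
    {
        static const char* controllers[2] =
            { "controller-0-cluster-host", "controller-1-cluster-host" };
        for (int i = 0; i < 2; ++i)
            udp_send(controllers[i], CLUSTER_PORT, buf, len);
    }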

This update also

1. envelopes the mtcAlive gate control to allow debug tracing of all gate
   state changes.
2. moves the heartbeat failure state clear in graceful recovery to the
   end of the recovery handler, just before heartbeat start.
3. adds sm unhealthy support to fail and automatically recover the
   inactive controller from an SM UNHEALTHY state.

----------
Test Plan:
----------

Functional:

PASS: Verify management network messaging
PASS: Verify cluster-host network messaging
PASS: Verify cluster-host messages with tcpdump
PASS: Verify cluster-host network mtcAlive messaging
PASS: Verify reboot request and ack reply over management network
PASS: Verify reboot request and ack reply over cluster-host network
PASS: Verify lock state notification and ack reply over management network
PASS: Verify lock state notification and ack reply over cluster-host network
PASS: Verify acknowledgement messaging
PASS: Verify maintenance daemon logging
PASS: Verify maintenance socket initialization

System:

PASS: Verify compute system install
PASS: Verify AIO system install

Feature:

PASS: Verify sm node unhealth handling (active:ignore, inactive:recover)

Change-Id: I092596d3e22438dd8a613a073614c188f6f5721d
Closes-Bug: #835268
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-07-18 14:54:45 -04:00
Eric MacDonald a4238c2a35 Add 50 byte hostname support to maintenance
Hosts with hostnames longer than 31 characters do not
go online (locked-disabled-online) after installation.

This update enhances maintenance messaging to support
up to 50 byte/character hostnames.

System Install:
---------------
PASS: Verify system install
PASS: Verify AIO system install (regression)
PASS: Verify system install with long hostnames, deployment-config.yaml
PASS: Verify mtcAgent process startup/restart logs
PASS: Verify hbsAgent process startup/restart logs (active controller)
PASS: Verify hbsAgent process startup/restart logs (standby controller)
PASS: Verify hwmond process startup/restart logs
PASS: Verify guestAgent process startup/restart logs
PASS: Verify all common maintenance daemons startup/restart logs
PASS: Verify patch applies and removes cleanly

PASS: Verify long hostname Add ; inventory distribution
PASS: Verify short hostname Add ; inventory distribution

Long Hostname Handling:
-----------------------
PASS: Verify host name support for up to 50 and 51 byte hostnames

Heartbeat Monitoring:
---------------------
PASS: Verify cluster-host interface link down handling.
PASS: Verify graceful recovery from host reboot.
PASS: Verify pmond process failure and recovery cycle.

Maintenance Actions:
--------------------
PASS: Verify host install with 50 byte hostname
PASS: Verify host lock
PASS: Verify host unlock
PASS: Verify host reboot
PASS: Verify host reinstall
PASS: Verify host delete (no core dump / all daemon logs)
PASS: Verify host power-off
PASS: Verify host power-on
PASS: Verify BMC State Info
PASS: Verify lock and unlock storage node
PASS: Controller Swact over and Back
PASS: Verify thresholded heartbeat failure handling
PASS: Verify node locked flag file
PASS: Verify no core dumps during testing

Hardware Monitor:
-----------------
PASS: Verify BMC Provisioning/Reprovisioning/Deprovisioning
PASS: Verify Inventory Add/Delete/Modify
PASS: Verify Sensor Model and Monitoring
PASS: Verify Sensor Model Relearn
PASS: Verify Alarming and Logs
PASS: Verify Sensor Action, Interval modification
PASS: Verify Critical Sensor Action handling (ignore, log, alarm, reset, power cycle)

Guest Agent:
------------
PASS: Verify inventory add and delete

Process Monitor:
----------------
PASS: Verify process monitor logs
PASS: Verify process monitor events into mtcAgent
PASS: Verify process monitor failure alarming and recovery clear.
PASS: Verify process monitor regression script (test-pmon.sh -c restart)
PASS: Verify process monitor regression script (test-pmon.sh -c kill)
PASS: Verify process monitor regression script (test-pmon-action.sh)
PASS: Verify critical process failure handling
PASS: Verify major process failure handling

Collectd Monitoring:
-----------------
PASS: Verify collectd monitoring for long hostname hosts

Regression:
-----------
PASS: Verify mtce daemon signal handling (test-signals.sh)

Change-Id: If22ab081397ec1e8b24f20aad8c99f8079cb98a5
Closes-Bug: 1824429
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-07-12 12:20:08 +00:00
Eric MacDonald 1011fd8a1a Add network boot support to mtce reinstall handling
The current maintenance 'reinstall' handling requires
a host to be booted and online in order to perform
a reinstall by asking the mtcClient to wipe the
disks and self reboot thereby forcing a network boot
and reinstall.

This re-install process is problematic for hosts that
don't install properly and never come online or on new
system installs where the existing boot image on disk
is still valid ; local disk as the first boot device.

Getting around these issues prior to this update
requires manual BIOS intervention to force-select
a network boot.

This update continues to support the online-wipedisk
method for hosts that are not BMC provisioned and
adds offline reinstall support through IPMI commands
for hosts that are BMC provisioned.

For hosts that have the BMC provisioned, the re-install
handler will wait up to 10 minutes for maintenance to
establish connectivity to the BMC if it has not already.
Then it will issue a 'bootdev pxe' IPMI command to tell the
BMC to boot from the network on the 'next' reset and then
maintenance proceeds to reset that host with a second IPMI
command. This way the host will boot from the network and
perform a local install even if the current image on disk
is valid. No manual BIOS actions required.
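
A minimal sketch of the two-step IPMI sequence ; the wrapper and its
lan argument handling are illustrative.

    #include <cstdio>
    #include <cstdlib>

    int ipmi_netboot_reinstall(const char* lan_args) // e.g. "-I lanplus -H <ip> ..."
    {
        char cmd[256];

        // step 1 : pxe boot on the 'next' reset only
        snprintf(cmd, sizeof(cmd), "ipmitool %s chassis bootdev pxe", lan_args);
        if (system(cmd) != 0)
            return -1; // netboot request failed ; handler retries to a max

        // step 2 : reset the host so it network boots and reinstalls
        snprintf(cmd, sizeof(cmd), "ipmitool %s chassis power reset", lan_args);
        return system(cmd); // reset failure is not retried
    }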

This update requires a small system inventory update to
relax the online requirement for BMC provisioned hosts so
that the reinstall can proceed. That update depends on this one.

This update also does some minor cleanup in the unused
mtcAgent test head to fix a static analysis error.

Test Plan:
With BMC Test Cases: Success
----------------------------
PASS: Verify install requiring power on with valid image on disk ; pass case
PASS: Verify install while powered on but offline with invalid image on disk ; pass case
PASS: Verify install while powered on but offline with valid image on disk ; pass case
PASS: Verify install with UEFI boot
PASS: Verify BMC Reinstall on Dell (720)
PASS: Verify BMC Reinstall on WC
PASS: Verify BMC Reinstall on HP (hp380)
PEND: Verify BMC Reinstall on SM
PEND: Verify BMC Reinstall on WP
COND: Verify install Secure boot - 430 1-2 fails

With BMC Test Cases: Failure
---------------------------
PASS: Verify reinstall handling during install during online wait ; restarts the install
PASS: Verify reinstall handling during install before online wait ; no install interruption
PASS: Verify BMC not accessible at ReInstall start ; recovery
PASS: Verify BMC not accessible at ReInstall start ; timeout
PASS: Verify BMC accessibility loss over Install process
PASS: Verify netboot request failure handling ; no/bad response ; max retry
PASS: Verify reset request failure handling ; no retries
PASS: Verify BMC de-provisioning over install ; failure handling
PASS: Verify BMC re-provisioning over install ; BMC initially not accessible
PASS: Verify BMC re-provisioning over install ; BMC initially accessible
PASS: Verify install requiring power on but gets power-on receive failure
PASS: Verify install requiring power on but gets power-on request failure

No BMC Test Cases: Success
--------------------------
PASS: Verify install when host is powered on and online

No BMC Test Cases: Failure
--------------------------
PASS: Verify reinstall action handling during reinstall ; no install interruption
PASS: Verify install when host is powered off ; install fails
PASS: Verify install when host is powered on and offline ; install fails

Regression:
-----------
PASS: Verify host reset
PASS: Verify host power-off
PASS: Verify host power-on
PASS: Verify host sensor model and monitoring

Change-Id: Ic8c8232167c570e4f75c0bbe1604697966157184
Story: 2005650
Task: 30935
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-05-23 18:30:04 -04:00
Teresa Ho 8e51a1660a Refactor infrastructure network in mtce code
Updated mtce to read the host's cluster-host parameter from the
/etc/hosts file.
Replaced references to the infra network with the cluster-host network.

Story: 2004273
Task: 29473

Change-Id: I199fb82e5f6b459b181196d0802f1a74220b796e
Signed-off-by: Teresa Ho <teresa.ho@windriver.com>
2019-04-18 09:32:41 -04:00
wanghao 9427e74cf5 Fix the duplicated check of float_ip
In the _self_provision function we should check both the local_ip and
the float_ip, but it checked the float_ip twice ; this update changes
one of the checks to local_ip.

Signed-off-by: wanghao <sxmatch1986@gmail.com>
Closes-Bug: #1823012

Change-Id: I9bbee989ea48d0bef6fc663e7e168216321c7779
2019-04-03 21:44:27 +08:00
Alex Kozyrev 506ef3fd7f MTCE: reading BMC passwords from Barbican secret storage.
Use Openstack Barbican API to retrieve BMC passwords stored by SysInv.
See SysInv commit for details on how to write password to Barbican.
MTCE finds the corresponding secret by host uuid and retrieves the
secret payload associated with it. mtcSecretApi_get is used to find a
secret reference, based on a hostname. mtcSecretApi_read is used to
read a password using the reference found in the previous step.
Also, did a little cleanup and removed old unused token handling code.
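
A minimal sketch of the two-step read, reusing the commit's helper
names but with assumed signatures.

    #include <string>

    std::string mtcSecretApi_get(const std::string& hostname);   // returns secret reference
    std::string mtcSecretApi_read(const std::string& reference); // returns secret payload

    std::string get_bm_password(const std::string& hostname)
    {
        std::string ref = mtcSecretApi_get(hostname);
        if (ref.empty())
            return ""; // no BMC password stored for this host
        return mtcSecretApi_read(ref); // the password payload itself
    }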

Depends-On: I7102a9662f3757c062ab310737f4ba08379d0100
Change-Id: I66011dc95bb69ff536bd5888c08e3987bd666082
Story: 2003108
Task: 27700
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>
2019-02-14 09:04:46 -05:00
Eric MacDonald 3a5c578355 Mtce: Add Thresholded Maintenance Enable Recovery support
This update stops trying to recover hosts that have failed the
Enable sequence after a thresholded number of back-to-back tries.

When a host reaches a particular failure mode's max failure
threshold, maintenance puts it into an 'unlocked-disabled-failed'
state and leaves it that way, with no further recovery action, until
it is manually locked and unlocked.

The thresholded Enable failure causes are

 Configuration Failure ....... threshold:2 retry interval:30 secs
 In-Test GoEnabled Failure ... threshold:2 retry interval:30 secs
 Start Host Services Failure . threshold:2 retry interval:30 secs
 Heartbeat Soak Failure ...... threshold:2 retry interval:10 mins

This update refactors the old auto recovery for AIO SX into this
more generic framework.
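
A minimal sketch of the thresholding, with illustrative names ; each
failure cause carries its own threshold and retry interval.

    void set_host_state(const char* state); // assumed state change helper

    struct enable_fail {
        int count;     // back-to-back failures of this cause
        int threshold; // e.g. 2
    };

    // Returns true if another recovery attempt should be scheduled.
    bool enable_failure(enable_fail& f)
    {
        if (++f.count >= f.threshold) {
            // stop auto recovery ; a manual lock/unlock clears the count
            set_host_state("unlocked-disabled-failed");
            return false;
        }
        return true; // retry again after this cause's retry interval
    }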

Story: 2003576
Task: 24905

Test Plan:

PASS: Verify AIO DX System Install
PASS: Verify AIO SX DOR
PASS: Verify Auto recovery disabled state is maintained over AIO SX DOR
PASS: Verify Lock/Unlock recovers host from Auto recovery disabled state
PASS: Verify AIO SX Main Config Failure handling
PASS: Verify AIO SX Main Config Timeout handling
PASS: Verify AIO SX Main GoEnabled Failure Handling
PASS: Verify AIO SX Main Host Services Failure handling
PASS: Verify AIO SX Main Host Services Timeout handling
PASS: Verify AIO SX Subf Config Failure handling
PASS: Verify AIO SX Subf Config Timeout handling
PASS: Verify AIO SX Subf GoEnabled Failure Handling
PASS: Verify AIO SX Subf Host Services Failure handling

PASS: Verify AIO DX System Install
PASS: Verify AIO DX DOR
PASS: Verify AIO DX DOR ; one time active controller GoEnabled failure ; swact requested
PASS: Verify AIO DX Main First Unlock Failure handling
PASS: Verify AIO DX Main Config Failure handling (inactive ctrl)
PASS: Verify AIO DX Main one time Config Failure handling
PASS: Verify AIO DX Main one time GoEnabled Failure handling.
PASS: Verify AIO DX SUBF Inactive Controller 1 GoEnable Failure handling.
PASS: Verify AIO DX Inactive Controller 1 GoEnable Failure with recovery on retry.
PASS: Verify AIO DX Active controller Enable failure with no or locked peer controller.
PASS: Verify AIO DX Reboot Active controller with peer in auto recovery disabled state.
PASS: Verify AIO DX Active controller failure with peer in auto recovery disabled state. (vswitch process)
PASS: Verify AIO DX Active controller failure then recovery after reboot with peer in auto recovery disabled state. (goenabled)
PASS: Verify AIO DX Inactive Controller Enable Heartbeat Soak Failure handling.
PASS: Verify AIO DX Active controller unhealthy detection and handling. (degrade)
PASS: Verify AIO DX Inactive controller unhealthy detection and handling. (fail)

PASS: Verify Normal System Install
PASS: Verify Compute Enable Configuration Failure handling (wc71-75)
PASS: Verify Compute Enable GoEnabled Failure handling (recover after 1)
PASS: Verify Compute Enable Start Host Services Failure handling
PASS: Verify Compute Enable Heartbeat Soak Failure handling
PASS: Verify Inactive Controller Enable Heartbeat Soak Failure handling
PASS: Verify Inactive Controller Configuration Failure handling
PASS: Verify Inactive Controller GoEnabled Failure handling
PASS: Verify Inactive Controller Host Services Failure handling
PASS: Verify goEnabled failure after active controller reboot with no peer controller (C0 rebooted with C1 locked) - no SM startup
PASS: Verify auto recovery threshold number is configurable
PASS: Verify auto recovery retry interval is configurable
PASS: Verify auto recovery host state and status message

Regression:

PASS: Verify Swact behavior, over and back
PASS: Verify 5 node DOR
PASS: Verify 3 host MNFA behavior
PASS: verify in-service heartbeat failure handling
PASS: verify no segfaults during UT

Corner Cases:

PASS: Verify mtcAlive boot failure behavior. reset progression. retry forever. - sleep in config script
PASS: Verify AIO SX mtcAgent process restart while in autorecovery disabled state
PASS: Verify autorecovery disabled state is preserved over mtcAgent process restart.

Change-Id: I7098f16243caef27c5295971ef3c9de5be975755
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2018-12-12 08:11:36 -05:00
Eric MacDonald 0b922227ac Implement Active-Active Heartbeat as HA Improvement
This update introduces mtce changes to support Active-Active Heartbeating.

The purpose of Active-Active Heartbeating is to help avoid Split-Brain.

Active-Active heartbeating has each controller maintain a 5 second
heartbeat response history cache of each network for all monitored
hosts as well as the on-going health of storage-0 if provisioned and
enabled.

This is referred to as the 'heartbeat cluster history'

Each controller then includes its cluster history in each heartbeat
pulse request message.

The hbsClient, now modified to handle heartbeat from both controllers,
saves each controller's heartbeat cluster history in a local cache and
criss-crosses the data in its pulse responses.

So when the hbsClient receives a pulse request from controller-0 it
saves its reported history and then replaces that history information
in its response to controller-0 with what it saved from controller-1's
last pulse request ; i.e. its view of the system.

Controller-0, receiving a host's pulse response, saves its peer's
heartbeat cluster history so that it has a summary of heartbeat
cluster history for the last 5 seconds for each monitored network
of every monitored host in the system from both controllers'
perspectives. Same for controller-1 with controller-0's history.

The hbsAgent is then further enhanced to support a query request
for this information.

So now SM, when it needs to make a decision to avoid Split-Brain
or otherwise, can query either controller for its heartbeat cluster
history and get the last 5 second summary view of heartbeat (network)
responsiveness from both controllers' perspectives to help decide which
controller to make active.

This involved removing the hbsAgent process from SM control and
monitoring and adding a new hbsAgent LSB init script for process
launch, a service file to run the init script, and a pmon config
file for hbsAgent process monitoring.

With hbsAgent now running on both controllers, changes to maintenance
were required to send inventory to hbsAgent on both controllers,
listen for hbsAgent event messages over the management interface
and inform both hbsAgents which controller is active.

The hbsAgent running on the inactive controller
 - does not send heartbeat events to maintenance
 - does not raise or clear alarms or produce customer logs
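
A minimal sketch of the hbsClient criss-cross, with illustrative
types ; each pulse response carries the other controller's last
reported view of the system.

    struct cluster_history { /* ~5 secs of per-network pulse history */ };

    static cluster_history saved[2]; // indexed by reporting controller

    void handle_pulse_request(int from_controller,
                              const cluster_history& reported,
                              cluster_history& response_out)
    {
        saved[from_controller] = reported;         // cache this controller's view
        response_out = saved[1 - from_controller]; // reply with the peer's view
    }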

Test Plan:

Feature:
PASS: Verify hbsAgent runs on both controllers
PASS: Verify hbsAgent as pmon monitored process (not SM)
PASS: Verify system install and cluster collection in all system types (10+)
PASS: Verify active controller hbsAgent detects and handles heartbeat loss
PASS: Verify inactive controller hbsAgent detects and logs heartbeat loss
PASS: Verify heartbeat cluster history collection functions properly.
PASS: Verify storage-0 state tracking in cluster info.
PASS: Verify storage-0 not responding handling
PASS: Verify heartbeat response is sent back to only the requesting controller.
PASS: Verify heartbeat history is correct from each controller
PASS: Verify MNFA from active controller after install to controller-0
PASS: Verify MNFA from active controller after swact to controller-1
PASS: Verify MNFA for 80%+ of the hosts in the storage system
PASS: Verify SM cluster query operation and content from both controllers
PASS: Verify restart of inactive hbsAgent doesn't clear existing heartbeat alarms

Logging:
PASS: Verify cluster info logs.
PASS: Verify feature design logging.
PASS: Verify hbsAgent and hbsClient design logs on all hosts add value
PASS: Verify design logging from both controllers in heartbeat loss case
PASS: Verify design logging from both controllers in MNFA case
PASS: Verify clog  logs cluster info vault status and updates for controllers
PASS: Verify clog1 logs full cluster state change for all hosts
PASS: Verify clog2 logs cluster info save/append logs for controllers
PASS: Verify clog3 memory dumps a cluster history
PASS: Verify USR2 forces heartbeat and cluster info log dump
PASS: Verify hourly heartbeat and cluster info log dump
PASS: Verify loss events force heartbeat and cluster info log dump

Regression:
PASS: Verify Large System DOR
PASS: Verify pmond regression test that now includes hbsAgent
PASS: Verify Lock/Unlock of inactive controller (x3)
PASS: Verify Swact behavior (x10)
PASS: Verify compute Lock/Unlock
PASS: Verify storage-0 Lock/Unlock
PASS: Verify compute Host Failure and Graceful Recovery
PASS: Verify Graceful Recovery Retry to Max:3 then Full Enable
PASS: Verify Delete Host
PASS: Verify Patching hbsAgent and hbsClient
PASS: Verify event driven cluster push

Story: 2003576
Task: 24907

Change-Id: I5baf5bcca23601a99473d039356d58250ffb01b5
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2018-11-20 19:57:18 +00:00
haochuan 18624ceeb1 Fix resource leak issue, memory not freed
Partial-Bug: 1794903

Change-Id: I5d12f9bec2f089674fc601e4cd297f72daefa6f8
Signed-off-by: Martin Chen <haochuan.z.chen@intel.com>
2018-10-19 04:37:54 +08:00
Jim Gauld 6a5e10492c Decouple Guest-server/agent from stx-metal
This decouples the build and packaging of guest-server, guest-agent from
mtce, by splitting guest component into stx-nfv repo.

This leaves existing C++ code, scripts, and resource files untouched,
so there is no functional change. Code refactoring is beyond the scope
of this update.

Makefiles were modified to include devel headers directories
/usr/include/mtce-common and /usr/include/mtce-daemon.
This ensures there is no contamination with other system headers.

The cgts-mtce-common package is renamed and split into:
- repo stx-metal: mtce-common, mtce-common-dev
- repo stx-metal: mtce
- repo stx-nfv: mtce-guest
- repo stx-ha: updates package dependencies to mtce-pmon for
  service-mgmt, sm, and sm-api

mtce-common:
- contains common and daemon shared source utility code

mtce-common-dev:
- based on mtce-common, contains devel package required to build
  mtce-guest and mtce
- contains common library archives and headers

mtce:
- contains components: alarm, fsmon, fsync, heartbeat, hostw, hwmon,
  maintenance, mtclog, pmon, public, rmon

mtce-guest:
- contains guest component guest-server, guest-agent

Story: 2002829
Task: 22748

Change-Id: I9c7a9b846fd69fd566b31aa3f12a043c08f19f1f
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
2018-09-18 17:15:08 -04:00