22 Commits

Author SHA1 Message Date
Andy Ning 1f507e0e62 Add ipsec auth server pmon configuration
This update adds the ipsec auth server pmon configuration file to the
mtce-control package. The pmon configuration file is only needed on
controller nodes, as ipsec-server runs on controllers only.

Test Plan:
PASS: In a deployed system, verify ipsec-server is running
PASS: kill the ipsec-server process, verify that it is started
      by pmon.

Story: 2010940
Task: 49484

Co-Authored-By: Andy Ning <andy.ning@windriver.com>

Change-Id: Iadb9ca6f086640d008880a21cfd97256b00ab7ab
Signed-off-by: Leonardo Mendes <Leonardo.MendesSantana@windriver.com>
2024-02-09 16:05:18 -03:00
Davi Frossard dd0eb34208 Remove qemu dependency from mtce-compute and mtce-control
Dependency is necessary only on centos packaging.

Test Plan:
PASS - Build packages.
PASS - Build/install image on AIO-SX.

Depends-On: https://review.opendev.org/c/starlingx/virt/+/885342

Story: 2010781
Task: 48183

Change-Id: I5a6e4a7ba12c83372dd3171e054bf612c1484f7e
Signed-off-by: Davi Frossard <dbarrosf@windriver.com>
2023-12-04 14:19:28 +00:00
Al Bailey 37c5910a62 Update mtce debian package ver based on git
Update debian package versions to use git commits for:
 - mtce         (old 9, new 30)
 - mtce-common  (old 1, new 9)
 - mtce-compute (old 3, new 4)
 - mtce-control (old 7, new 10)
 - mtce-storage (old 3, new 4)

The Debian packaging has been changed to reflect all the
git commits under the directory, and not just the commits
to the metadata folder.

This ensures that any new code submissions under those
directories will increment the versions.
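
A minimal sketch of the idea, assuming the version is taken from the number of commits that touch a package directory. The helper names are hypothetical and this is not the actual build tooling; only the `git rev-list --count` invocation is standard git.

```python
import subprocess
from typing import List


def rev_count_cmd(path: str) -> List[str]:
    """Build the git command that counts commits touching `path`."""
    return ["git", "rev-list", "--count", "HEAD", "--", path]


def rev_count(path: str, repo: str = ".") -> int:
    """Run the command inside `repo` and return the commit count."""
    result = subprocess.run(
        rev_count_cmd(path),
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


# Usage (needs a git checkout, so it is left as a comment):
#   rev_count("mtce-control")          # every commit under the directory
#   rev_count("mtce-control/debian")   # a narrower path yields a smaller or equal count
```

Counting over the whole directory instead of a metadata subfolder is what makes any code change under that directory increment the count.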

Test Plan:
  PASS: build-pkgs -p mtce
  PASS: build-pkgs -p mtce-common
  PASS: build-pkgs -p mtce-compute
  PASS: build-pkgs -p mtce-control
  PASS: build-pkgs -p mtce-storage

Story: 2010550
Task: 47401
Task: 47402
Task: 47403
Task: 47404
Task: 47405

Signed-off-by: Al Bailey <al.bailey@windriver.com>
Change-Id: I4846804320b0ad3ec10799a468a9ee3bf7973587
2023-03-02 14:50:35 +00:00
Zuul 6bcd8333b2 Merge "Debian: Remove conf files from etc-pmon.d" 2022-09-30 19:41:16 +00:00
Charles Short ab7d062a01 debian: Remove package preset install for metal
Remove the installation of per-package preset installs
since they are centrally managed now by the ISO install
for the following packages:

- mtce-compute
- mtce-control
- mtce-storage

Story: 2009968
Task: 46406

Test Plan

PASS Build package
PASS Build ISO
PASS Check for non-existent preset file in /etc/systemd/system-preset

Depends-On: https://review.opendev.org/c/starlingx/integ/+/853653

Signed-off-by: Charles Short <charles.short@windriver.com>
Change-Id: Ica1a99efe2336fdb6096086f46189dfd25efc6e1
2022-09-27 08:23:09 +00:00
Leonardo Fagundes Luz Serrano d1c0d04719 Debian: Remove conf files from etc-pmon.d
Removed conf files from /etc/pmon.d/
as they are being moved to another location.

This is part of an effort to allow pmon conf files
to be selected at runtime by kickstarts.

The change is debian-only, since centos support
will be dropped soon.
Centos' pmon conf files remain in /etc/pmon.d/

Test Plan:
PASS - deb doesn't install anything to /etc/pmon.d/
PASS - rpm files unchanged
PASS - AIOSX unlocked-enabled-available
PASS - Standard 2+2 unlocked-enabled-available

Story: 2010211
Task: 46306

Depends-On: https://review.opendev.org/c/starlingx/metal/+/855095

Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com>
Change-Id: I086db0750df5626d2a8ba1010153ce4f45535ca5
2022-09-26 13:41:40 +00:00
Leonardo Fagundes Luz Serrano a5e7a108f5 Duplicate pmon.d conf files to another location
Created a duplicate install of /etc/pmon.d/*.conf files
to /usr/share/starlingx/pmon.d/

This is part of an effort to allow pmon conf files
to be selected at runtime by kickstarter.

Test Plan:
PASS: duplicate conf on deb

Story: 2010211
Task: 46112

Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com>
Change-Id: Ie07c1bfa370da5b2ec71fe3fce948d59be1dd098
2022-08-26 16:21:18 -03:00
Chuck Short 3cdebf7c62 debian: Simplify mtce-control packaging
- Ensure that the service is started when the package
  is installed.
- Ensure that the service dependencies are started
  when the package is installed.
- Simplify debian/rules to use the Makefile in order
  to install the files that are needed.

Test Plan
PASS Build package and ISO
PASS Boot and check for goenabled-control.service

Story: 2009101
Task: 43023

Signed-off-by: Chuck Short <charles.short@windriver.com>
Change-Id: I3863042357257ffbcfaf8084da2f44853e0b6264
2022-03-10 19:02:35 +00:00
Matheus Machado Guilhermino 4c8abe18d3 Fix failing mtce services on Debian
Modified mtce and mtce-control to address the following
failing services on Debian:
hbsAgent.service
hbsClient.service
hwmon.service
lmon.service
mtcalarm.service
mtclog.service
runservices.service

Applied fix:
- Included modified .service files for debian
directly into the deb_folder.
- Changed the init files to account for the different
locations of the init-functions and service daemons
on Debian and CentOS
- Included "override_dh_installsystemd" section
to rules in order to start services at boot.

Test Plan:

PASS: Package installed and ISO built successfully
PASS: Ran "systemctl list-units --failed" and verified that the
services are not failing
PASS: Ran "systemctl status <service_name>" for
each service and verified that they are active

Story: 2009101
Task: 44192

Signed-off-by: Matheus Machado Guilhermino <Matheus.MachadoGuilhermino@windriver.com>
Change-Id: I50915c17d6f50f5e20e6448d3e75bfe54a75acc0
2022-01-14 10:50:09 -03:00
Tracey Bogue 0551c665cb Add Debian packaging for mtce packages
Some of the code used TRUE instead of true, which did not compile
for Debian. These instances were changed to true.
Some #define constants generated narrowing errors because their
values are negative in a 32 bit integer. These values were
explicitly cast to int in the case statements causing the errors.
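
A small illustration of why such constants are negative as 32 bit integers, assuming a hypothetical constant whose top bit is set. The value below is made up, not taken from the mtce sources, and Python only stands in for the C++ code here.

```python
def as_int32(value: int) -> int:
    """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# Hypothetical constant for illustration only.
EXAMPLE_CODE = 0xFFFFFFFE

print(as_int32(EXAMPLE_CODE))  # -2
print(as_int32(0x7FFFFFFF))    # 2147483647
```

A literal such as 0xFFFFFFFE does not fit in a signed int, so using it as a case label for an int needs a conversion that changes its value; the explicit cast states that intent.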

Story: 2009101
Task: 43426

Signed-off-by: Tracey Bogue <tracey.bogue@windriver.com>
Change-Id: Iffc4305660779010969e0c506d4ef46e1ebc2c71
2021-10-29 09:17:00 -05:00
Eric MacDonald 5ab03b5222 Mtce heartbeat cluster state change notification improvement
The current heartbeat cluster state change notification
needs to be sent when heartbeat pulses begin to be missed
rather than only after the host has reached the Heartbeat
Loss threshold. This buys SM more time, almost a full
second, and in doing so provides more accurate data for
it to make its SM heartbeat failure handling decisions.
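
A minimal sketch of the timing change, assuming a per-host miss counter and a hypothetical loss threshold. The class, event strings and threshold value are illustrative only and are not the hbsAgent implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HostPulseState:
    loss_threshold: int = 10          # hypothetical threshold, in pulse periods
    misses: int = 0
    events: List[str] = field(default_factory=list)

    def pulse_period(self, responded: bool) -> None:
        """Record the outcome of one pulse period for this host."""
        if responded:
            if self.misses:
                # Host answers again after a run of misses.
                self.events.append("state-change: responding")
            self.misses = 0
            return
        self.misses += 1
        if self.misses == 1:
            # Notify on the first miss, not only at the loss threshold.
            self.events.append("state-change: missing")
        if self.misses == self.loss_threshold:
            self.events.append("heartbeat-loss")


host = HostPulseState(loss_threshold=3)
for ok in [True, False, False, False, True]:
    host.pulse_period(ok)
print(host.events)
```

In this sketch the state change is reported two pulse periods before the loss event, which is the extra lead time the commit describes giving to SM.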

This update also begins sending maintenance heartbeat
cluster state change notifications just before the next
multicast pulse request but after the cluster vault is
updated from the last pulse period. This ensures that
SM gets the most up-to-date cluster information.

This update also changes the hbsAgent's service file
to depend on the local hbsClient. By doing so, the
hbsAgent shuts down earlier over a graceful reboot
thereby preventing the hbsAgent from continuing to
report healthy response to the inactive controller
during active controller shutdown.

This way the inactive SM sees the failed active
controller when it queries the cluster in its
fail-pending state resulting in an inactive SM
take-over rather than stand-down.

Additional hbsAgent service file changes were made to
prevent systemd from auto recovering a failed hbsAgent
process, as it is monitored and managed by pmond, and
to fix the ExecStop command line.

Test Plan:

PASS: Verify active controller graceful reboot.
      Standby controller takes over rather than shutdown
      - 30 of 30 iterations
PASS: Verify active controller forced reboot
PASS: Verify enabled standby controller graceful reboot
PASS: Verify Standard System install
PASS: Verify AIO DX system install

Regression:

PASS: Verify SM Uncontrolled Swact if active
      controller Mgmnt link drops.
PASS: Verify handling of downed cluster interface in
      - AIO DX (fail) and Standard (degrade) system
PASS: Verify no coredumps
PASS: Verify update as a patch

Change-Id: I6869631e091eb28a3cbb6f15d9a8ccd939c54410
Closes-Bug: 1906556
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-01-08 09:59:24 -05:00
Eric MacDonald 55d5f43edb Fix heartbeat messaging when interface is set to 'lo'
Maintenance heartbeat service should not be multicast
messaging over an 'lo' interface which in IPv6 leads
to socket failures, log flooding and the inability to
detect and report pmond process failure.

To fix that, this update
 - configures pulse messaging to unicast for monitored
   networks configured as 'lo'.
 - prevents heartbeating over the cluster network if it
   and the management network are both configured on
   the 'lo' interface.
 - improves logging to avoid flooding in the presence of
   socket setup or access errors.
 - stops logging netlink events (interface state changes)
   on unmonitored network interfaces.
 - maintains heartbeat disabled state until the management
   network is up.
 - modifies hbsAgent socket failure handling and its pmon
   conf file so that a persistent socket failure during
   startup is alarmed as an hbsAgent process failure.
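
The first two rules in the list above can be sketched as follows. The function name and the returned strings are hypothetical; the real logic lives in the heartbeat service and is not shown in this log.

```python
from typing import Dict


def plan_heartbeat(mgmt_iface: str, cluster_iface: str) -> Dict[str, str]:
    """Pick a pulse messaging mode per monitored network."""

    def mode(iface: str) -> str:
        # Multicast over 'lo' is what led to the socket failures.
        return "unicast" if iface == "lo" else "multicast"

    plan = {"mgmt": mode(mgmt_iface)}
    if mgmt_iface == "lo" and cluster_iface == "lo":
        # Both networks on 'lo': do not heartbeat over the cluster network.
        plan["cluster"] = "disabled"
    else:
        plan["cluster"] = mode(cluster_iface)
    return plan


print(plan_heartbeat("lo", "lo"))
print(plan_heartbeat("eth0", "lo"))
```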

Test Plan:

PASS: Verify logging over system install and socket errors
PASS: Verify unicast messaging when cluster is set to 'lo'
PASS: Verify no cluster network heartbeat when it and mgmnt
      are set to 'lo'.

Regression:

PASS: Verify heartbeat messaging and cluster info
PASS: Verify pmond process failure alarm management
PASS: Verify heartbeat failure detection and graceful recovery
PASS: Verify AIO SX IPv6 system install and run
PASS: Verify AIO DX IPv6 system install and run
PASS: Verify Standard IPv6 system install and run
PASS: Verify Storage system IPv6 install and run
PASS: Verify Storage system IPv4 install and run
PASS: Verify MNFA handling in IPv6 storage system

Change-Id: I5a2a0b2dee0c690617c4e0b0e2ab8b1172b2dc49
Closes-Bug: 1884585
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-06-26 14:16:41 +00:00
Eric MacDonald 7d8be4bc1f Add auto-versioning to starlingx/metal mtce packages
This update makes use of the PKG_GITREVCOUNT variable
to auto-version the mtce packages in this repo.

Change-Id: Ifb4da4570e0261bbdcf0d7af79b8add7cfc133ac
Story: 2006166
Task: 39822
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-05-21 15:18:43 -04:00
Sharath Kumar K b725a0974b De-branding in starlingx/metal: Titanium Cloud -> StarlingX
1. Rename Titanium Cloud to StarlingX for .spec files
2. Rename Titanium Cloud to StarlingX for .service file

Test:
After the de-brand change, bootimage.iso was built in the flock layer
and installed on the dev machine to validate the changes.

Please note: the de-brand changes are being done in batches; this is batch 1.

Story: 2006387
Task: 36207

Change-Id: Ifa4dc5c7aa3189815e00b796fc833852e88c8fe3
Signed-off-by: Sharath Kumar K <sharath.kumar@intel.com>
2020-04-03 07:58:25 +02:00
Marcela Rosales b5f12793a1 Update openSUSE OBS artifacts to build MTCE packages
The openSUSE spec files need to have the path of the source code in
the setup to have the package generation automated through the _service
file in OBS.

Change-Id: I2b7c08d5772025c02821dfb9fc944fff0f5b6f90
Story: 2006508
Task: 36812
Signed-off-by: Marcela Rosales <marcela.a.rosales.jimenez@intel.com>
2019-10-01 11:07:10 -05:00
Marcela Rosales a0a3693bc4 Add openSUSE OBS Artifacts for Maintenance services
StarlingX Open Build Service [0] builds MTCE packages using base
artifacts:
- Spec file
- Changelog

[0] https://build.opensuse.org/project/show/Cloud:StarlingX:2.0

Story: 2006508
Task: 36556
Task: 36557
Task: 36558
Task: 36559
Task: 36560
Task: 36561

Change-Id: I9bf59ab4b890ebe33a9304d3f886951c860412a6
Signed-off-by: Marcela Rosales <marcela.a.rosales.jimenez@intel.com>
2019-09-20 09:18:54 -05:00
Hayde Martinez 8f5f67334f SUSE Specfile for mtce-control Init Script LSB
The goenabled and hbsAgent script headers must be LSB compliant
in order to build on OBS infrastructure.

Story: 2005684
Task: 33442

Change-Id: Ic1ad5722b725c04d91f1650065faca3dc7b5c2c9
Signed-off-by: Hayde Martinez <hayde.martinez.landa@intel.com>
2019-05-28 15:17:24 +00:00
Eric MacDonald 0b922227ac Implement Active-Active Heartbeat as HA Improvement
This update introduces mtce changes to support Active-Active Heartbeating.

The purpose of Active-Active Heartbeating is to help avoid Split-Brain.

Active-Active heartbeating has each controller maintain a 5 second
heartbeat response history cache of each network for all monitored
hosts as well as the on-going health of storage-0 if provisioned and
enabled.

This is referred to as the 'heartbeat cluster history'

Each controller then includes its cluster history in each heartbeat
pulse request message.

The hbsClient, now modified to handle heartbeat from both controllers,
saves each controller's heartbeat cluster history in a local cache and
criss-crosses the data in its pulse responses.

So when the hbsClient receives a pulse request from controller-0 it
saves its reported history and then replaces that history information
in its response to controller-0 with what it saved from controller-1's
last pulse request; i.e., controller-1's view of the system.

Controller-0, receiving a host's pulse response, saves its peer's
heartbeat cluster history so that it has a summary of heartbeat
cluster history for the last 5 seconds for each monitored network
of every monitored host in the system from both controllers'
perspectives. Same for controller-1 with controller-0's history.
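
A minimal sketch of the criss-cross exchange described above, assuming two controllers and opaque history values. The class, method and field names are hypothetical and do not reflect the hbsClient message format.

```python
from typing import Any, Dict, Optional


class HbsClientSketch:
    """Caches each controller's reported history and returns the peer's."""

    PEER = {"controller-0": "controller-1", "controller-1": "controller-0"}

    def __init__(self) -> None:
        self.saved: Dict[str, Any] = {}  # controller name -> last history it sent

    def on_pulse_request(self, controller: str, history: Any) -> Dict[str, Optional[Any]]:
        # Save what the requesting controller reported.
        self.saved[controller] = history
        # Respond with what the other controller reported last, if anything.
        peer = self.PEER[controller]
        return {"to": controller, "peer_history": self.saved.get(peer)}


client = HbsClientSketch()
client.on_pulse_request("controller-0", [1, 1, 1, 0, 1])
reply = client.on_pulse_request("controller-1", [1, 1, 1, 1, 1])
print(reply["peer_history"])  # [1, 1, 1, 0, 1]
```

The response goes only to the requesting controller, yet carries the other controller's view, so each controller ends up holding both perspectives.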

The hbsAgent is then further enhanced to support a query request
for this information.

So now SM, when it needs to make a decision to avoid Split-Brain
or otherwise, can query either controller for its heartbeat cluster
history and get the last 5 second summary view of heartbeat (network)
responsiveness from both controllers' perspectives to help decide which
controller to make active.

This involved removing the hbsAgent process from SM control and monitor
and adding a new hbsAgent LSB init script for process launch, a service
file to run the init script and a pmon config file for hbsAgent process
monitoring.

With hbsAgent now running on both controllers, changes to maintenance
were required to send inventory to hbsAgent on both controllers,
listen for hbsAgent event messages over the management interface
and inform both hbsAgents which controller is active.

The hbsAgent running on the inactive controller
 - does not send heartbeat events to maintenance
 - does not raise or clear alarms or produce customer logs

Test Plan:

Feature:
PASS: Verify hbsAgent runs on both controllers
PASS: Verify hbsAgent as pmon monitored process (not SM)
PASS: Verify system install and cluster collection in all system types (10+)
PASS: Verify active controller hbsAgent detects and handles heartbeat loss
PASS: Verify inactive controller hbsAgent detects and logs heartbeat loss
PASS: Verify heartbeat cluster history collection functions properly.
PASS: Verify storage-0 state tracking in cluster info.
PASS: Verify storage-0 not responding handling
PASS: Verify heartbeat response is sent back to only the requesting controller.
PASS: Verify heartbeat history is correct from each controller
PASS: Verify MNFA from active controller after install to controller-0
PASS: Verify MNFA from active controller after swact to controller-1
PASS: Verify MNFA for 80%+ of the hosts in the storage system
PASS: Verify SM cluster query operation and content from both controllers
PASS: Verify restart of inactive hbsAgent doesn't clear existing heartbeat alarms

Logging:
PASS: Verify cluster info logs.
PASS: Verify feature design logging.
PASS: Verify hbsAgent and hbsClient design logs on all hosts add value
PASS: Verify design logging from both controllers in heartbeat loss case
PASS: Verify design logging from both controllers in MNFA case
PASS: Verify clog  logs cluster info vault status and updates for controllers
PASS: Verify clog1 logs full cluster state change for all hosts
PASS: Verify clog2 logs cluster info save/append logs for controllers
PASS: Verify clog3 memory dumps a cluster history
PASS: Verify USR2 forces heartbeat and cluster info log dump
PASS: Verify hourly heartbeat and cluster info log dump
PASS: Verify loss events force heartbeat and cluster info log dump

Regression:
PASS: Verify Large System DOR
PASS: Verify pmond regression test that now includes hbsAgent
PASS: Verify Lock/Unlock of inactive controller (x3)
PASS: Verify Swact behavior (x10)
PASS: Verify compute Lock/Unlock
PASS: Verify storage-0 Lock/Unlock
PASS: Verify compute Host Failure and Graceful Recovery
PASS: Verify Graceful Recovery Retry to Max:3 then Full Enable
PASS: Verify Delete Host
PASS: Verify Patching hbsAgent and hbsClient
PASS: Verify event driven cluster push

Story: 2003576
Task: 24907

Change-Id: I5baf5bcca23601a99473d039356d58250ffb01b5
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2018-11-20 19:57:18 +00:00
Yong Hu 2402fb16ae get rid of duplicate LICENSE files in 3 packages
There are 2 duplicated LICENSE files in mtce-control, mtce-compute,
and mtce-storage. Additionally, LICENSE was not placed in the root
directory of src RPM, so this patch is made as an enhancement or fix.
After this change, license file location and code structure in all 4
modules (mtce-common, mtce-compute, mtce-storage and mtce-control)
will be the same.

Test method: make a clean build and check src RPM and binary RPM
to assure there is only one LICENSE in correct place.

Story: 2004186
Task: 27676

Change-Id: Id71a7450e8b45438c5d15976ae8e853b9ba8f4f5
Signed-off-by: Yong Hu <yong.hu@intel.com>
2018-10-30 02:55:34 +00:00
Yong Hu 718efbcf0d remove cgts- prefix to align with other sub-projects (packages)
Rename files and folders in mtce-compute, mtce-control, and
mtce-storage. Also update the package names in the
bsp-files/filter_out_* scripts accordingly.

Story: 2004079
Task: 27485

Change-Id: Ic1e9bd4bb8d72f30ddcc2a2bfc602a1a34e583da
Signed-off-by: Yong Hu <yong.hu@intel.com>
2018-10-19 06:07:31 +00:00
Scott Little 89dd36625e Rename mwa-* subdirectories to match the git repo name
mwa-delphia -> stx-clients
mwa-pitta   -> stx-config
mwa-cleo    -> stx-fault
mwa-gplv2   -> stx-gplv2
mwa-gplv3   -> stx-gplv3
mwa-solon   -> stx-ha
mwa-sparta  -> stx-integ
mwa-beas    -> stx-metal
mwa-thales  -> stx-nfv
mwa-chilon  -> stx-update
mwa-perian  -> stx-upstream

Depends-On: https://review.openstack.org/579954
Depends-On: https://review.openstack.org/579957
Change-Id: I269a4e79425a41709381f8894456d21233463e9f
Signed-off-by: Scott Little <scott.little@windriver.com>
2018-07-03 16:29:24 -04:00
Dean Troyer 18922761a6 StarlingX open source release updates
Signed-off-by: Dean Troyer <dtroyer@gmail.com>
2018-05-31 07:36:43 -07:00