metal

Commit Graph

Author	SHA1	Message	Date
Andy Ning	1f507e0e62	Add ipsec auth server pmon configuration This update added ipsec auth server pmon configuration file in mtce-control package. The pmon configuration file is only needed on controller node, as ipsec-server is running on controllers only. Test Plan: PASS: In a deployed system, verify ipsec-server is running PASS: kill the ipsec-server process, verify that it is started by pmon. Story: 2010940 Task: 49484 Co-Authored-By: Andy Ning <andy.ning@windriver.com> Change-Id: Iadb9ca6f086640d008880a21cfd97256b00ab7ab Signed-off-by: Leonardo Mendes <Leonardo.MendesSantana@windriver.com>	2024-02-09 16:05:18 -03:00
Davi Frossard	dd0eb34208	Remove qemu dependency from mtce-compute and mtce-control Dependency is necessary only on centos packaging. Test Plan: PASS - Build packages. PASS - Build/install image on AIO-SX. Depends-On: https://review.opendev.org/c/starlingx/virt/+/885342 Story: 2010781 Task: 48183 Change-Id: I5a6e4a7ba12c83372dd3171e054bf612c1484f7e Signed-off-by: Davi Frossard <dbarrosf@windriver.com>	2023-12-04 14:19:28 +00:00
Al Bailey	37c5910a62	Update mtce debian package ver based on git Update debian package versions to use git commits for: - mtce (old 9, new 30) - mtce-common (old 1, new 9) - mtce-compute (old 3, new 4) - mtce-control (old 7, new 10) - mtce-storage (old 3, new 4) The Debian packaging has been changed to reflect all the git commits under the directory, and not just the commits to the metadata folder. This ensures that any new code submissions under those directories will increment the versions. Test Plan: PASS: build-pkgs -p mtce PASS: build-pkgs -p mtce-common PASS: build-pkgs -p mtce-compute PASS: build-pkgs -p mtce-control PASS: build-pkgs -p mtce-storage Story: 2010550 Task: 47401 Task: 47402 Task: 47403 Task: 47404 Task: 47405 Signed-off-by: Al Bailey <al.bailey@windriver.com> Change-Id: I4846804320b0ad3ec10799a468a9ee3bf7973587	2023-03-02 14:50:35 +00:00
Zuul	6bcd8333b2	Merge "Debian: Remove conf files from etc-pmon.d"	2022-09-30 19:41:16 +00:00
Charles Short	ab7d062a01	debian: Remove package preset install for metal Remove the installation of per-package preset installs since they are centrally managed now by the ISO install for the following packages: - mtce-compute - mtce-control - mtce-storage Story: 2009968 Task: 46406 Test Plan PASS Build package PASS Build ISO PASS Check for non-existent preset file in /etc/systemd/system-preset Depends-On: https://review.opendev.org/c/starlingx/integ/+/853653 Signed-off-by: Charles Short <charles.short@windriver.com> Change-Id: Ica1a99efe2336fdb6096086f46189dfd25efc6e1	2022-09-27 08:23:09 +00:00
Leonardo Fagundes Luz Serrano	d1c0d04719	Debian: Remove conf files from etc-pmon.d Removed conf files from /etc/pmon.d/ as they are being moved to another location. This is part of an effort to allow pmon conf files to be selected at runtime by kickstarts. The change is debian-only, since centos support will be dropped soon. Centos' pmon conf files remain in /etc/pmon.d/ Test Plan: PASS - deb doesn't install anything to /etc/pmon.d/ PASS - rpm files unchanged PASS - AIOSX unlocked-enabled-available PASS - Standard 2+2 unlocked-enabled-available Story: 2010211 Task: 46306 Depends-On: https://review.opendev.org/c/starlingx/metal/+/855095 Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com> Change-Id: I086db0750df5626d2a8ba1010153ce4f45535ca5	2022-09-26 13:41:40 +00:00
Leonardo Fagundes Luz Serrano	a5e7a108f5	Duplicate pmon.d conf files to another location Created a duplicate install of /etc/pmon.d/*.conf files to /usr/share/starlingx/pmon.d/ This is part of an effort to allow pmon conf files to be selected at runtime by kickstarter. Test Plan: PASS: duplicate conf on deb Story: 2010211 Task: 46112 Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com> Change-Id: Ie07c1bfa370da5b2ec71fe3fce948d59be1dd098	2022-08-26 16:21:18 -03:00
Chuck Short	3cdebf7c62	debian: Simplify mtce-control packaging - Ensure that the service is started when the package is installed. - Ensure that the service dependencies are started when the package is installed. - Simplify debian/rules to use the Makefile in order to install the files that are needed. Test Plan PASS Build package and ISO PASS Boot and check for goenabled-control.service Story: 2009101 Task: 43023 Signed-off-by: Chuck Short <charles.short@windriver.com> Change-Id: I3863042357257ffbcfaf8084da2f44853e0b6264	2022-03-10 19:02:35 +00:00
Matheus Machado Guilhermino	4c8abe18d3	Fix failing mtce services on Debian Modified mtce and mtce-control to address the following failing services on Debian: hbsAgent.service hbsClient.service hwmon.service lmon.service mtcalarm.service mtclog.service runservices.service Applied fix: - Included modified .service files for debian directly into the deb_folder. - Changed the init files to account for the different locations of the init-functions and service daemons on Debian and CentOS - Included "override_dh_installsystemd" section to rules in order to start services at boot. Test Plan: PASS: Package installed and ISO built successfully PASS: Ran "systemctl list-units --failed" and verified that the services are not failing PASS: Ran "systemctl status <service_name>" for each service and verified that they are active Story: 2009101 Task: 44192 Signed-off-by: Matheus Machado Guilhermino <Matheus.MachadoGuilhermino@windriver.com> Change-Id: I50915c17d6f50f5e20e6448d3e75bfe54a75acc0	2022-01-14 10:50:09 -03:00
Tracey Bogue	0551c665cb	Add Debian packaging for mtce packages Some of the code used TRUE instead of true which did not compile for Debian. These instances were changed to true. Some #define constants generated narrowing errors because their values are negative in a 32 bit integer. These values were explicitly cast to int in the case statements causing the errors. Story: 2009101 Task: 43426 Signed-off-by: Tracey Bogue <tracey.bogue@windriver.com> Change-Id: Iffc4305660779010969e0c506d4ef46e1ebc2c71	2021-10-29 09:17:00 -05:00
Eric MacDonald	5ab03b5222	Mtce heartbeat cluster state change notification improvement The current heartbeat cluster state change notification needs to be sent when heartbeat pulses begin to be missed rather than only after the host has reached the Heartbeat Loss threshold. This buys SM more time, almost a full second, and in doing so provides more accurate data for it to make its SM heartbeat failure handling decisions. This update also begins sending maintenance heartbeat cluster state change notifications just before the next multicast pulse request but after the cluster vault is updated from the last pulse period. This ensures that SM gets the most up-to-date cluster information. This update also changes the hbsAgent's service file to depend on the local hbsClient. By doing so, the hbsAgent shuts down earlier over a graceful reboot thereby preventing the hbsAgent from continuing to report healthy response to the inactive controller during active controller shutdown. This way the inactive SM sees the failed active controller when it queries the cluster in its fail-pending state resulting in an inactive SM take-over rather than stand-down. Additional hbsAgent service file changes were made to prevent systemd from auto recovering a failed hbsAgent process, as it is monitored and managed by pmond, and fixed the ExecStop command line. Test Plan: PASS: Verify active controller graceful reboot. Standby controller takes over rather than shutdown - 30 of 30 iterations PASS: Verify active controller forced reboot PASS: Verify enabled standby controller graceful reboot PASS: Verify Standard System install PASS: Verify AIO DX system install Regression: PASS: Verify SM Uncontrolled Swact if active controller Mgmnt link drops. PASS: Verify handling of downed cluster interface in - AIO DX (fail) and Standard (degrade) system PASS: Verify no coredumps PASS: Verify update as a patch Change-Id: I6869631e091eb28a3cbb6f15d9a8ccd939c54410 Closes-Bug: 1906556 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2021-01-08 09:59:24 -05:00
Eric MacDonald	55d5f43edb	Fix heartbeat messaging when interface is set to 'lo' Maintenance heartbeat service should not be multicast messaging over an 'lo' interface which in IPv6 leads to socket failures, log flooding and the inability to detect and report pmond process failure. To fix that this update - configures pulse messaging to unicast for monitored networks configured as 'lo'. - prevents heartbeating over the cluster network if both it and the management network are both configured on the 'lo' interface. - improves logging to avoid flooding in the presence of socket setup or access errors. - stops logging netlink events (interface state changes) on unmonitored network interfaces. - maintains heartbeat disabled state until the management network is up. - modifies hbsAgent socket failure handling and its pmon conf file so that a persistent socket failure during startup is alarmed as an hbsAgent process failure. Test Plan: PASS: Verify logging over system install and socket errors PASS: Verify unicast messaging when cluster is set to 'lo' PASS: Verify no cluster network heartbeat when it and mgmnt are set to 'lo'. Regression: PASS: Verify heartbeat messaging and cluster info PASS: Verify pmond process failure alarm management PASS: Verify heartbeat failure detection and graceful recovery PASS: Verify AIO SX IPv6 system install and run PASS: Verify AIO DX IPv6 system install and run PASS: Verify Standard IPv6 system install and run PASS: Verify Storage system IPv6 install and run PASS: Verify Storage system IPv4 install and run PASS: Verify MNFA handling in IPv6 storage system Change-Id: I5a2a0b2dee0c690617c4e0b0e2ab8b1172b2dc49 Closes-Bug: 1884585 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2020-06-26 14:16:41 +00:00
Eric MacDonald	7d8be4bc1f	Add auto-versioning to starlingx/metal mtce packages This update makes use of the PKG_GITREVCOUNT variable to auto-version the mtce packages in this repo. Change-Id: Ifb4da4570e0261bbdcf0d7af79b8add7cfc133ac Story: 2006166 Task: 39822 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2020-05-21 15:18:43 -04:00
Sharath Kumar K	b725a0974b	De-branding in starlingx/metal: Titanium Cloud -> StarlingX 1. Rename Titanium Cloud to StarlingX for .spec files 2. Rename Titanium Cloud to StarlingX for .service file Test: After the de-brand change, bootimage.iso has built in the flock layer and installed on the dev machine to validate the changes. Please note, doing de-brand changes in batches, this is batch1 changes. Story: 2006387 Task: 36207 Change-Id: Ifa4dc5c7aa3189815e00b796fc833852e88c8fe3 Signed-off-by: Sharath Kumar K <sharath.kumar@intel.com>	2020-04-03 07:58:25 +02:00
Marcela Rosales	b5f12793a1	Update openSUSE OBS artifacts to build MTCE packages The openSUSE spec files need to have the path of the source code in the setup to have the package generation automated through _service file in OBS. Change-Id: I2b7c08d5772025c02821dfb9fc944fff0f5b6f90 Story: 2006508 Task: 36812 Signed-off-by: Marcela Rosales <marcela.a.rosales.jimenez@intel.com>	2019-10-01 11:07:10 -05:00
Marcela Rosales	a0a3693bc4	Add openSUSE OBS Artifacts for Maintenance services StarlingX Open Build Service [0] builds MTCE packages using base artifacts: - Spec file - Changelog [0] https://build.opensuse.org/project/show/Cloud:StarlingX:2.0 Story: 2006508 Task: 36556 Task: 36557 Task: 36558 Task: 36559 Task: 36560 Task: 36561 Change-Id: I9bf59ab4b890ebe33a9304d3f886951c860412a6 Signed-off-by: Marcela Rosales <marcela.a.rosales.jimenez@intel.com>	2019-09-20 09:18:54 -05:00
Hayde Martinez	8f5f67334f	SUSE Specfile for mtce-control Init Script LSB It is required for the goenabled and hbsAgent scripts headers to be compliant with LSB in order to build on OBS infrastructure. Story: 2005684 Task: 33442 Change-Id: Ic1ad5722b725c04d91f1650065faca3dc7b5c2c9 Signed-off-by: Hayde Martinez <hayde.martinez.landa@intel.com>	2019-05-28 15:17:24 +00:00
Eric MacDonald	0b922227ac	Implement Active-Active Heartbeat as HA Improvement This update introduces mtce changes to support Active-Active Heartbeating. The purpose of Active-Active Heartbeating is to help avoid Split-Brain. Active-Active heartbeating has each controller maintain a 5 second heartbeat response history cache of each network for all monitored hosts as well as the on-going health of storage-0 if provisioned and enabled. This is referred to as the 'heartbeat cluster history'. Each controller then includes its cluster history in each heartbeat pulse request message. The hbsClient, now modified to handle heartbeat from both controllers, saves each controller's heartbeat cluster history in a local cache and criss-crosses the data in its pulse responses. So when the hbsClient receives a pulse request from controller-0 it saves its reported history and then replaces that history information in its response to controller-0 with what it saved from controller-1's last pulse request; i.e. its view of the system. Controller-0, receiving a host's pulse response, saves its peer's heartbeat cluster history so that it has a summary of heartbeat cluster history for the last 5 seconds for each monitored network of every monitored host in the system from both controllers' perspectives. Same for controller-1 with controller-0's history. The hbsAgent is then further enhanced to support a query request for this information. So now SM, when it needs to make a decision to avoid Split-Brain or otherwise, can query either controller for its heartbeat cluster history and get the last 5 second summary view of heartbeat (network) responsiveness from both controllers' perspectives to help decide which controller to make active. This involved removing the hbsAgent process from SM control and monitor and adding a new hbsAgent LSB init script for process launch, service file to run the init script and pmon config file for hbsAgent process monitoring. With hbsAgent now running on both controllers, changes to maintenance were required to send inventory to hbsAgent on both controllers, listen for hbsAgent event messages over the management interface and inform both hbsAgents which controller is active. The hbsAgent running on the inactive controller - does not send heartbeat events to maintenance - does not raise or clear alarms or produce customer logs Test Plan: Feature: PASS: Verify hbsAgent runs on both controllers PASS: Verify hbsAgent as pmon monitored process (not SM) PASS: Verify system install and cluster collection in all system types (10+) PASS: Verify active controller hbsAgent detects and handles heartbeat loss PASS: Verify inactive controller hbsAgent detects and logs heartbeat loss PASS: Verify heartbeat cluster history collection functions properly. PASS: Verify storage-0 state tracking in cluster info. PASS: Verify storage-0 not responding handling PASS: Verify heartbeat response is sent back to only the requesting controller. PASS: Verify heartbeat history is correct from each controller PASS: Verify MNFA from active controller after install to controller-0 PASS: Verify MNFA from active controller after swact to controller-1 PASS: Verify MNFA for 80%+ of the hosts in the storage system PASS: Verify SM cluster query operation and content from both controllers PASS: Verify restart of inactive hbsAgent doesn't clear existing heartbeat alarms Logging: PASS: Verify cluster info logs. PASS: Verify feature design logging. PASS: Verify hbsAgent and hbsClient design logs on all hosts add value PASS: Verify design logging from both controllers in heartbeat loss case PASS: Verify design logging from both controllers in MNFA case PASS: Verify clog logs cluster info vault status and updates for controllers PASS: Verify clog1 logs full cluster state change for all hosts PASS: Verify clog2 logs cluster info save/append logs for controllers PASS: Verify clog3 memory dumps a cluster history PASS: Verify USR2 forces heartbeat and cluster info log dump PASS: Verify hourly heartbeat and cluster info log dump PASS: Verify loss events force heartbeat and cluster info log dump Regression: PASS: Verify Large System DOR PASS: Verify pmond regression test that now includes hbsAgent PASS: Verify Lock/Unlock of inactive controller (x3) PASS: Verify Swact behavior (x10) PASS: Verify compute Lock/Unlock PASS: Verify storage-0 Lock/Unlock PASS: Verify compute Host Failure and Graceful Recovery PASS: Verify Graceful Recovery Retry to Max:3 then Full Enable PASS: Verify Delete Host PASS: Verify Patching hbsAgent and hbsClient PASS: Verify event driven cluster push Story: 2003576 Task: 24907 Change-Id: I5baf5bcca23601a99473d039356d58250ffb01b5 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2018-11-20 19:57:18 +00:00
Yong Hu	2402fb16ae	get rid of duplicate LICENSE files in 3 packages There are 2 duplicated LICENSE files in mtce-control, mtce-compute, and mtce-storage. Additionally, LICENSE was not placed in the root directory of src RPM, so this patch is made as an enhancement or fix. After this change, license file location and code structure in all 4 modules (mtce-common, mtce-compute, mtce-storage and mtce-control) will be the same. Test method: make a clean build and check src RPM and binary RPM to assure there is only one LICENSE in correct place. Story: 2004186 Task: 27676 Change-Id: Id71a7450e8b45438c5d15976ae8e853b9ba8f4f5 Signed-off-by: Yong Hu <yong.hu@intel.com>	2018-10-30 02:55:34 +00:00
Yong Hu	718efbcf0d	remove cgts- prefix to align with other sub-projects (packages) Rename files and folders in mtce-compute, mtce-control, and mtce-storage. As well update packages' names in bsp-files/ filter_out_* scripts accordingly. Story: 2004079 Task: 27485 Change-Id: Ic1e9bd4bb8d72f30ddcc2a2bfc602a1a34e583da Signed-off-by: Yong Hu <yong.hu@intel.com>	2018-10-19 06:07:31 +00:00
Scott Little	89dd36625e	Rename mwa-* subdirectories to match the git repo name mwa-delphia -> stx-clients mwa-pitta -> stx-config mwa-cleo -> stx-fault mwa-gplv2 -> stx-gplv2 mwa-gplv3 -> stx-gplv3 mwa-solon -> stx-ha mwa-sparta -> stx-integ mwa-beas -> stx-metal mwa-thales -> stx-nfv mwa-chilon -> stx-update mwa-perian -> stx-upstream Depends-On: https://review.openstack.org/579954 Depends-On: https://review.openstack.org/579957 Change-Id: I269a4e79425a41709381f8894456d21233463e9f Signed-off-by: Scott Little <scott.little@windriver.com>	2018-07-03 16:29:24 -04:00
Dean Troyer	18922761a6	StarlingX open source release updates Signed-off-by: Dean Troyer <dtroyer@gmail.com>	2018-05-31 07:36:43 -07:00

22 Commits