config

Commit Graph

Author	SHA1	Message	Date
Scott Little	3077d0c656	Relocated some packages to repo 'stx-puppet' List of relocated subdirectories: puppet-manifests puppet-modules-wrs/puppet-dcdbsync puppet-modules-wrs/puppet-dcmanager puppet-modules-wrs/puppet-dcorch puppet-modules-wrs/puppet-fm puppet-modules-wrs/puppet-mtce puppet-modules-wrs/puppet-nfv puppet-modules-wrs/puppet-patching puppet-modules-wrs/puppet-smapi puppet-modules-wrs/puppet-sshd puppet-modules-wrs/puppet-sysinv Story: 2006166 Task: 35687 Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3 Change-Id: Ibc468b9d97d6dbc7ac09652dcd979c0e68a85672 Signed-off-by: Scott Little <scott.little@windriver.com> Depends-On: I00f54876e7872cf0d3e4f5e8f986cb7e3b23c86f Signed-off-by: Scott Little <scott.little@windriver.com>	2019-09-05 16:18:03 -04:00
Robert Church	38abbef079	Rebase Armada to latest master Rebasing Armada to use the latest docker image tag 8a1638098f88d92bf799ef4934abe569789b885e-ubuntu_bionic. Change-Id: Ic48a2e053d0de7dacfd6a07d817947e11dc8d596 Story: 2006347 Task: 36105 Signed-off-by: Robert Church <robert.church@windriver.com>	2019-08-15 16:54:51 -04:00
Zuul	2c04825fc9	Merge "ANSIBLE Bootstrap changes for System Controller"	2019-07-11 17:29:48 +00:00
Tao Liu	bbac058ade	ANSIBLE Bootstrap changes for System Controller This update contains the following misc. config changes to support ansible bootstrap for system controller. Creates deps model for dcmanager and dcorch puppet modules. Creates a system controller postgres run time manifest which is applied upon the creation of initial controller host or replay after the distributed cloud role has been changed. The patch_vault file system is created during the first controller unlocked. And allows the dc role to be modified during bootstrap. Change-Id: Id7b416274b2a854c469bfdca7448bf1ddea639d7 Story: 2004766 Task: 35650 Signed-off-by: Tao Liu <tao.liu@windriver.com>	2019-07-11 12:08:06 -04:00
Zuul	d86899bd7a	Merge "Change the default value of sysinv-api bind host"	2019-07-09 12:44:58 +00:00
Yi Wang	89728b15ec	Change the default value of sysinv-api bind host The default sysinv-api bind host value was changed from "0.0.0.0" to "::" to support both IPV4 and IPV6. Change-Id: I072cda6df02f49a94d94871a9f19800f106f49dc Closes-Bug: 1833459 Signed-off-by: Yi Wang <yi.c.wang@intel.com>	2019-06-28 15:03:18 +08:00
Al Bailey	609d84d846	Remove magnum from baremetal. Magnum is no longer packaged on bare metal. The sysinv and upgrades code related to magnum has been removed. The helm configuration for magnum remains, although it is not currently supported in containers either. The magnum-ui is not installed in platform or containerized horizon so the code to enable it is removed. Some upgrade code remains, due to the fact that that utility is in the process of being re-written. Story: 2004764 Task: 34333 Change-Id: I56873b4e04aac2e7d0cd57909beea00ecc2c1b9a Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-06-27 11:57:09 -05:00
Jerry Sun	4809c9f489	Upversion armada image Upversion armada image from existing af8a9ffd0873c2fbc915794e235dbd357f2adab1 to dd2e56c473549fd16f94212b553ed58c48d1f51b-ubuntu_bionic The specific image was chosen because it contained upstream armada commit df68a90e057c2e1e3427d6b8497b437c8a4c3b7e, which is a fix for keystone kubernetes auth. The ubuntu bionic image was chosen because the old image was an ubuntu bionic based image. Testing done by applying stx-openstack on standard, simplex, and duplex systems. Story: 2005860 Task: 33693 Change-Id: Ifd8a66d46e2dfd47ca7c5ab9807076ef43e67027 Signed-off-by: Jerry Sun <jerry.sun@windriver.com>	2019-06-21 09:47:40 -04:00
Al Bailey	99077bad0e	Cleanup ceilometer from bare metal code Ceilometer is being setup through helm charts in containers so the references to ceilometer in bare metal can be cleaned up. - Removing the sysinv puppet code for ceilometer - Removing the bare metal ceilometer pipeline upgrade script - Cleaning up unused variables from templates Story: 2004764 Task: 33690 Change-Id: I2efe7aed7a4570121c1376c132e157c6f47e9f29	2019-06-13 10:29:18 -05:00
Al Bailey	6c3afad3e7	Remove references to pacemaker from sysinv Sysinv use of pacemaker was replaced by SM a long time ago and the code that referenced it is being removed. Change-Id: Ic2a55698f64757bffeb9b53f4a105ea6ccb3dd2f Story: 2004764 Task: 30665 Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-06-06 11:56:45 -05:00
Sun Austin	384854568c	Disable raise/get/clear NFV alarm to container fm-rest-api add puppet parameters and write to nfv config file to disable raise/get/clear NFV alarm to container fm-rest-api service Story: 2004008 Task: 33573 Depends-On: https://review.opendev.org/#/c/658972/ Change-Id: I3ab37fe476ad083b5c8acca2684973eec30b8005 Signed-off-by: Sun Austin <austin.sun@intel.com>	2019-06-05 09:14:58 +08:00
SidneyAn	4f406285e4	update nfv-vim puppet runtime manifests and config files nfvi would raise openstack alarms/logs to the fm in pods, when it is available. following configure changes are required: 1. add "fault_mgmt_plugin_disabled" para in vim config file. Set it "True" when openstack application is not implemented, and "False" when it is. 2. add "fault_mgmt_endpoint_disabled" para in alarm and event_log config file. rules are the same with "fault_mgmt_plugin_disabled" 3. add "openstack" and "fm" info to alarms and event_logs config file Story: 2004008 Task: 30954 Depends-On: https://review.opendev.org/#/c/661548/ Change-Id: Iee2a4515336f4ce9b6373d56d4f7a5779664233d Signed-off-by: SidneyAn <ran1.an@intel.com>	2019-06-04 09:02:50 +08:00
Zuul	87339ef708	Merge "Keystone DB sync - add service puppet module"	2019-05-07 20:25:11 +00:00
Zuul	2e966860bc	Merge "Provide env settings to allow zuul and developers to both run tox"	2019-05-02 15:52:49 +00:00
Andy Ning	aa61bc58ea	Keystone DB sync - add service puppet module This update adds the puppet package for keystone DB synchronization service. This puppet package will be used by controller puppet manifest to deploy and configure the synchronization service. Story: 2002842 Task: 22787 Signed-off-by: Andy Ning <andy.ning@windriver.com> (cherry picked from commit `51b20e03ea`) Conflicts: centos_pkg_dirs Depends-On: https://review.opendev.org/#/c/655727 Change-Id: I7059800daa053eaf975ad7f02200247d77653926	2019-04-30 14:20:37 -04:00
Al Bailey	65eaf645f4	Provide env settings to allow zuul and developers to both run tox Zuul checks out the dependant projects by their repo names. Repo checks out the project directory structure based on the labels in the manifest. Currently these directories have different names and so tox passes when run by zuul, but fails when run in a developer env. This submission uses an env variable: "STX_PREFIX" to make both envs able to run tox. Story: 2004515 Task: 30664 Change-Id: I06cefab7422f53ccc0b8af30ca06945311cec70e Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-04-30 09:18:46 -05:00
Al Bailey	0704edb6cc	Remove AODH and Gnocchi service parameters Removes the aodh service parameter alarm_history_time_to_live Removes references to aodh and gnocchi from puppet and upgrade code. Removes old gnocchi references from remote logging. Story: 2004764 Task: 30537 Change-Id: I3a03dd4a2afd47f1cc3f677f02d348eabf11a653 Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-04-30 08:00:19 -05:00
Tyler Smith	43381a8748	Renaming deprecated options and updating spec requirements - Renaming idle_timeout to connection_recycle_time since it was deprecated in Stein - Explicitly including required packages in the sysinv spec file to fix DC Change-Id: Ief055d26f3a1eb43b8cf144952a49e7e0f3ff939 Story: 2004765 Task: 28883 Depends-On: https://review.openstack.org/#/c/653086 Signed-off-by: Tyler Smith <tyler.smith@windriver.com>	2019-04-16 20:21:36 +00:00
Al Bailey	b899cf351e	Upversion Armada SHA to be a newer image Using SHA: af8a9ffd0873c2fbc915794e235dbd357f2adab1 which was built and tagged on April 9, 2019. The previous Armada SHA was from Sept 2018. The manifest.xml is updated to not generate armada warnings for libvirt, openvswitch, nova and neutron. The warning was: "label_selector" not specified, waiting with no labels may cause unintended consequences. Story: 2005198 Task: 30436 Change-Id: I97b633d9e6e1e4574e25dc8b69500faae4b4a809 Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-04-11 15:13:41 -05:00
Erich Cordoba	05a26e9061	Add notices on Intel authored files. Story: 2005265 Task: 30083 Change-Id: Ibcae6539747beb9d641e7d5eef4c4ff7574a8b13 Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-03-20 10:03:44 -06:00
Al Bailey	37b041a04c	Remove unused puppet modules * Remove the nova api proxy puppet module. * Remove openstack::swift puppet manifest. * Refactor openstack::nova::storage as platform::worker::storage. This requires the nova puppet code in sysinv to write to a different hiera target, and creation of /var/lib/nova. * Remove puppet modules from spec file for modules that are no longer being used. Story: 2004764 Task: 29840 Change-Id: Ifa0171b06e23fd77d373983d644df3f56ae4e2de Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-03-20 08:03:07 -05:00
Tao Liu	2c3e5963f3	Enable Distributed Cloud configuration The following changes are required to enable system controller and sub cloud configuration in a distributed cloud environment: * Remove references to os-keystone-region-name as the openstack patches that support it, have been removed. * Change the iptables rule for the NAT entry, to only apply, if the selected outgoing interface is the OAM interface. * Configure keystone endpoints, before configuring openrc on subclouds * Remove all openstack services, and users from the region config and update the tox * Disable nova, cinder and neutron api proxy Only tested distributed cloud configuration as multi-region configuration is not supported in the current release. Story: 2004766 Task: 30017 Change-Id: I5c43e2112f34225aa9e23ff777c5333ae77efcdc Signed-off-by: Tao Liu <tao.liu@windriver.com>	2019-03-14 17:48:44 -04:00
Mingyuan Qi	611a68a96a	Allow user specified registries for config_controller Currently docker images were pulled from public registries during config_controller. For some users, the connection to the public docker registry may be slow such that installing the containerized services images may timeout or the system simply does not have access to the public internet. This change allows users to specify alternative public/private registries to replace k8s.gcr.io, gcr.io, quay.io and docker.io. Insecure registry is supported if all default registries were replaced by one unified registry. It lowers the complexity for those who build his own registry without internet access. Docker doesn't support ipv6 addr as registry name, instead hostname or domain name in ipv6 network is allowed. Test: AIO-SX/AIO-DX/Standard(2+2): Alternative public registry (ipv4/domain) with proxy - config_controller pass Private registry (ipv4/ipv6/domain) without internet - config_controller pass Default registry with/without proxy - config_controller pass Story: 2004711 Task: 28742 Change-Id: I4fee3f4e0637863b9b5ef4ef556082ac75f62a1d Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>	2019-02-23 10:10:07 +08:00
Erich Cordoba	ea0b33e950	Standardize makefiles for puppet-modules-wrs The puppet-modules-wrs is formed by several subcomponents, in all of them the same changes were applied: - Create a makefile with a install target. - Remove license file from build_srpm.data as is not needed. - Update target in specfile - Change autosetup to setup in specfile, this was bug in the spec files. Testing: - Verification on correct install paths. - config_controller complete on simplex configuration. Change-Id: I1512eb0c3034ffa2d57d098dab9800bdaba5b48d Story: 2004043 Task: 27552 Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-02-11 13:30:42 -06:00
Alex Kozyrev	f44717154a	Add Barbican bootstrap and runtime manifests Barbican service is needed during bootstrap phase for StarlingX. Implement bootstrap and runtime manifests to achieve that. Change-Id: I6c22ebddacf8aec3a731f7f6d7a762f79f511c78 Story: 2003108 Task: 27700 Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>	2019-01-11 13:33:00 -05:00
Don Penney	6fee40bd23	Update puppet module tox.ini files for puppet-lint When running "tox -e puppetlint" manually, the tox.ini will install puppet-lint via gem, but does not automatically install the json module upon which puppet-lint depends. This commit adds json to the gem install command. Change-Id: Ib8b6133395bf76748a8bcac0cb7bd718a89d6d5a Story: 2004515 Task: 28704 Signed-off-by: Don Penney <don.penney@windriver.com>	2019-01-02 15:27:09 -05:00
Don Penney	9a3264acaa	Fix additional puppet-lint warnings and errors This update addresses the following errors and warnings from puppet-lint: - 140chars - case_without_default - ensure_first_param - inherits_across_namespaces - parameter_order - single_quote_string_with_variables - variable_is_lowercase - variable_scope In the case of variable_is_lowercase, the compute.pp manifest has variables with sizes like 2M in the name. These have been left as-is, with lint:ignore comments for the check, due to the semantics of the name. For the 140chars check, certain long lines have been left as-is, with lint:ignore comments, due to long commands being executed. These can be revisited in a future update to try to break up the lines and remove the lint:ignore directives. Change-Id: I37809bacb43818e0956f9f434c30c48e05017325 Story: 2004515 Task: 28685 Signed-off-by: Don Penney <don.penney@windriver.com>	2018-12-27 16:23:13 -06:00
Don Penney	e6c0e0af8c	Fix puppet-lint warnings and errors This update addresses the following errors and warnings from puppet-lint, with most corrections done automatically using puppet-lint --fix: - 2sp_soft_tabs - arrow_alignment - arrow_on_right_operand_line - double_quoted_strings - hard_tabs - only_variable_string - quoted_booleans - star_comments - trailing_whitespace - variables_not_enclosed Change-Id: I7a2b0109534dd4715d459635fa33b09e7fd0a6a6 Story: 2004515 Task: 28683 Signed-off-by: Don Penney <don.penney@windriver.com>	2018-12-27 15:08:37 -06:00
Don Penney	a91160daa2	Add puppet-lint support This update adds the tox and zuul configuration to run puppet-lint against the puppet manifests. The initial update ignores all existing errors, which will be cleaned up later. Change-Id: I293abc2eac6bc6216cbbf6d939c1ba3474fb9384 Story: 2004515 Task: 28665 Signed-off-by: Don Penney <don.penney@windriver.com>	2018-12-24 13:50:20 -06:00
Tao Liu	6256b0d106	Change compute node to worker node personality This update replaced the compute personality & subfunction to worker, and updated internal and customer visible references. In addition, the compute-huge package has been renamed to worker-utils as it contains various scripts/services that used to affine running tasks or interface IRQ to specific CPUs. The worker_reserved.conf is now installed to /etc/platform. The cpu function 'VM' has also been renamed to 'Application'. Tests Performed: Non-containerized deployment AIO-SX: Sanity and Nightly automated test suite AIO-DX: Sanity and Nightly automated test suite 2+2 System: Sanity and Nightly automated test suite 2+2 System: Horizon Patch Orchestration Kubernetes deployment: AIO-SX: Create, delete, reboot and rebuild instances 2+2+2 System: worker nodes are unlock enable and no alarms Story: 2004022 Task: 27013 Change-Id: I0e0be6b3a6f25f7fb8edf64ea4326854513aa396 Signed-off-by: Tao Liu <tao.liu@windriver.com>	2018-12-13 14:15:55 -05:00
Bart Wensley	4a43480f6b	Configure VIM to use pod based OpenStack services When kubernetes is configured and the OpenStack application has been installed, the VIM will be configured to access the OpenStack services running in pods (keystone, nova, rabbitmq, etc...). In order to support this, some extensions were done to the sysinv helm code to allow parts of the OpenStack application configuration to be retrieved (e.g. endpoint info). Changes were also required to dnsmasq configuration to get resolution of pod based names (e.g. keystone.openstack.svc.cluster.local) working properly. This commit is just the first step and has limitations. There is no trigger to reconfigure the VIM after the OpenStack application has been installed - a controller lock/unlock is required. Story: 2003910 Task: 27852 Change-Id: I1c6dcdecd1365104457009196bbcf06b19c95489 Signed-off-by: Bart Wensley <barton.wensley@windriver.com>	2018-11-15 14:39:39 -06:00
Zuul	e237e94b80	Merge "Fix word and statement errors in comments"	2018-11-14 14:55:37 +00:00
zhangkunpeng	48699351c8	Fix word and statement errors in comments fix some typos in comments, such as the duplicated word 'the the'. Change-Id: I28ffde825fd95186bc3a0bd077dea7c20287fc1f Story: 2004164 Task: 27641 Signed-off-by: zhangkunpeng <zhang.kunpeng@99cloud.net>	2018-11-14 10:04:51 +08:00
Tee Ngo	d8d8851fa2	Armada-Sysinv integration Initial implementation of Armada integration with sysinv which entails: - Basic application upload via system application-upload command - Application install via system application-apply command - Application remove via system application-remove command - Application delete via system application-delete command - Application list and detail viewing via system application-list and application-show commands. This implementation does not cover the following functionalities that are either still under discussion or in planning: a) support for remote CLI where application tarball resides in the client machine b) support for air-gapped scenario/embedded private images c) support for custom apps' user overrides Tests conducted: - config controller - tox - functional tests (both Openstack and simple test app): - upload - apply - remove - delete - show - list - release group upgrade with user overrides - failure tests: - no tar file supplied - corrupted tar file - app already exists/does not exist - upload failure (missing manifest, multi manifests, no image tags, checksum test failure, etc...) - apply failure (nodes are not labeled, image download failure, etc...) - operation not permitted Change-Id: Iec27f356bd0047b2c7ef860ab3a2528f5a371868 Story: 2003908 Task: 26792 Signed-off-by: Tee Ngo <Tee.Ngo@windriver.com>	2018-11-07 07:52:35 -05:00
Tao Liu	485445def0	Fernet key synchronization This update contains the following changes for Distributed Cloud Fernet Key Synching & Management: 1.Disable key rotation cron job for distributed cloud 2.Add a fernet key repo config option in puppet sysinv 3.Add fernet repo sysinv APIs for create/update/retrieve keys 4.Add a fernet operator to create/update/retrieve the keys Story: 2002842 Task: 22786 Change-Id: Ia14caeef067fa481e3a4159c1658289250632779 Signed-off-by: Tao Liu <tao.liu@windriver.com>	2018-10-26 14:56:42 -05:00
Lachlan Plant	99323d74a9	Add logging configuration to nova-api-proxy Puppet manifests now include logging data to push to the conf file. This is needed for a subsequent code change to change the logging backend to oslo_log Change-Id: I303e199fd3c984af20564c43bdb98c460cbed0f1 Story: 2004007 Task: 27608 Signed-off-by: Lachlan Plant <lachlan.plant@windriver.com>	2018-10-22 12:50:44 -05:00
Kevin Smith	3a91cbae4d	Containerization, support 2 keystones in sysinv Support bare metal and pod based keystone in sysinv. The existing keystone_authtoken section of sysinv.conf remains and is used for platform service authentication, while openstack service authentication parameters are moved to a new openstack_keystone_authtoken section. Admin credentials are used in the new openstack_keystone_authtoken section and the region name parameters are also moved to this new section. Change-Id: I7a53dd5a2dc52213e0f1e0cc748649a33f0f9f40 Story: 2002876 Task: 26926 Signed-off-by: Kevin Smith <kevin.smith@windriver.com>	2018-10-11 14:26:48 -04:00
Zuul	2575c43911	Merge "Add configuration for containerized keystone to VIM"	2018-10-03 16:49:55 +00:00
Bart Wensley	e3c1fbed88	Add configuration for containerized keystone to VIM Adding configuration to the VIM for containerized keystone. The VIM will now support two keystone instances: - platform: bare metal keystone used to authenticate with platform services (e.g. sysinv, patching) - openstack: containerized keystone used to authenticate with openstack services (e.g. nova, neutron, cinder) For now, the same configuration will be used for both, as we still only deploy with the baremetal keystone. Story: 2002876 Task: 26872 Change-Id: If4bd46a4c14cc65978774001cb2887e5d3e3607b	2018-10-03 06:55:58 -05:00
Eric MacDonald	f5d212010b	Mtce: Add two new port definitions to mtc.ini for SM communications In support of the HA Improvements feature maintenance is required to, upon request, send SM a summary of maintenance's heartbeat responsiveness during the last 20 heartbeat periods. This update adds the required port assignments to the mtc.ini file in support of said communications. With this update the mtc.ini file will be updated to contain the following entries. ; Communication ports between SM and maintenance sm_server_port = 2124 ; port sm receives mtce commands from sm_client_port = 2224 ; port mtce receives sm commands from Change-Id: I05c022f7e4dcdeaea71bc0020641baa331daae57 Story: 2003576 Task: 26837 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2018-10-02 20:29:37 +00:00
Zuul	38027c134f	Merge "LLDP OVS enablement: puppet configuration"	2018-09-27 00:56:51 +00:00
Steven Webster	da1110a3d8	LLDP OVS enablement: puppet configuration This commit introduces puppet configuration enabling LLDP to operate over OVS. Specifically, separate ports flows are configured to handle LLDP traffic. In addition, we restrict the lldpd daemon from operating over bridge, tap, and ovs-netdev devices. Story: 2002946 Task: 22940 Change-Id: Ibadc9c082425412b5b68b02a55e8c02692de0e17 Signed-off-by: Steven Webster <steven.webster@windriver.com>	2018-09-26 11:11:42 -04:00
Kevin Smith	987e372465	Disable VIM plugins for Kubernetes deployment Do not load vim plugins and disable vim audits instead of just disabling the endpoints as was previously done in Change 599741. Leave setting of (new) Nova and (pre-existing) Neutron endpoint disabled flags for infrastructure host services usage. Story: 2002876 Task: 26573 Change-Id: Id3af829562e5765b99dbab23d913d65a4e6ec4a7 Signed-off-by: Kevin Smith <kevin.smith@windriver.com>	2018-09-24 09:29:35 -04:00
Tao Liu	a8acc56242	Sysinv healthy query API request failed The healthy query API request triggers sysinv to query the alarm list. The alarm query is attempted via a sysinv database API which is no longer supported. This results in the REST API request failure. This update contains the following changes to address the issue: 1.Add FM catalog info to sysinv puppet class and manifest 2.Add service catalog to the user request context 3.Add a FM client interface to communicate with FM API 4.Update the health query to retrieve the alarm list via FM client Closes-Bug: # 1789983 Change-Id: I31b256f6de22fe70cba59b08bf927c8b0ac119ee Signed-off-by: Tao Liu <tao.liu@windriver.com>	2018-09-13 13:36:15 -04:00
Zuul	1b95285b16	Merge "Mtce: Make Heartbeat Failure Action Configurable"	2018-09-11 13:41:12 +00:00
Eric MacDonald	5f232f6486	Mtce: Make Heartbeat Failure Action Configurable The current maintenance heartbeat failure action handling is to Fail and Gracefully Recover the host. This means that maintenance will ensure that a heartbeat failed host is rebooted/reset before it is recovered but will avoid rebooting it a second time if its recovered uptime indicates that it has already rebooted. This update expands that single action handling behavior to support three new actions. In doing so it adds a new configuration service parameter called heartbeat_failure_action. The customer can configure this new parameter with any one of the following 4 actions in order of decreasing impact. fail - Host is failed and gracefully recovered. - Current Network specific alarms continue to be raised/cleared. Note: Prior to this update this was standard system behavior. degrade - Host is only degraded while it is failing heartbeat. - Current Network specific alarms continue to be raised/cleared. - heartbeat degrade reason is cleared as are the alarms when heartbeat responses resume. alarm - The only indication of a heartbeat failure is by alarm. - Same set of alarms as in above action cases - Only in this case no degrade, no failure, no reboot/reset none - Heartbeat is disabled ; no multicast heartbeat message is sent. - All existing heartbeat alarms are cleared. - The heartbeat soak as part of the enable sequence is bypassed. The selected action is a system wide setting. The selected setting also applies to Multi-Node Failure Avoidance. The default action is the legacy action Fail. This update also 1. Removes redundant inservice failure alarm for MNFA case in support of degrade only action. Keeping it would make that alarm handling case unnecessarily complicated. 2. No longer used 'hbs calibration' code is removed (cleanup). 3. Small amount of heartbeat logging cleanup. Test Plan: PASS: fail: Verify MNFA and recovery PASS: fail: Verify Single Host heartbeat failure and recovery PASS: fail: Verify Single Host heartbeat failure and recovery (from none) PASS: degrade: Verify MNFA and recovery PASS: degrade: Verify Single Host heartbeat failure and recovery PASS: degrade: Verify Single Host heartbeat failure and recovery (from alarm) PASS: alarm: Verify MNFA and recovery PASS: alarm: Verify Single Host heartbeat failure and recovery PASS: alarm: Verify Single Host heartbeat failure and recovery (from degrade) PASS: none: Verify heartbeat disable, fail ignore and no recovery PASS: none: Verify Single Host heartbeat ignore and no recovery PASS: none: Verify Single Host heartbeat ignore and no recovery (from fail) PASS: Verify action change behavior from none to alarm with active MNFA PASS: Verify action change behavior from alarm to degrade with active MNFA PASS: Verify action change behavior from degrade to none with active MNFA PASS: Verify action change behavior from none to fail with active MNFA PASS: Verify action change behavior from fail to none with active MNFA PASS: Verify action change behavior from degrade to fail then MNFA timeout PASS: Verify all heartbeat action change customer logs PASS: verify heartbeat stats clear over action change PASS: Verify LO DOR (several large labs - compute and storage systems) PASS: Verify recovery from failure of active controller PASS: Verify 3 host failure behavior with MNFA threshold at 3 (action:fail) PASS: Verify 2 host failure behavior with MNFA threshold at 3 (action:fail) Change-Id: I198505fb7a923cc760b12082acff1e5bac929ef2 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2018-09-10 08:55:16 -04:00
Kevin Smith	e8f97939b0	Disable VIM monitoring of Openstack services for Kubernetes deployment Story: 2002843 Task: 22790 Change-Id: I0a2b1c2a3799ab6e5c4a5a78cf60895e22469782 Signed-off-by: Kevin Smith <kevin.smith@windriver.com>	2018-09-04 14:27:52 -04:00
Eric MacDonald	f19dd0498f	Mtce: Make Multi-Node Failure Avoidance Configurable The maintenance system implements a high availability (HA) feature designed to detect the simultaneous heartbeat failure of a group of hosts and avoid failing all those hosts until heartbeat resumes or after a set period of time. This feature is called Multi-Node Failure Avoidance, aka MNFA, and currently has the hosts threshold set to 3 and timeout set to 100 secs. This update implements enhancements to that existing feature by making the 'number-of-hosts threshold' and 'timeout period' customer configurable service parameters. The new service parameters are listed under platform:maintenance which display with the following command > system service-parameter-list mnfa_threshold: This new label and value is added to the puppet managed /etc/mtc.ini and represents the number of hosts that are required to fail heartbeat as a group; within the heartbeat failure window (heartbeat_failure_threshold) after which maintenance activates MNFA Mode. This update changes the default number of failing hosts from 3 to 2 while allowing a configurable range from 2 to 100. mnfa_timeout: This new label and value is added to the puppet managed /etc/mtc.ini. While MNFA mode is active, it will remain active until the number of failing hosts drop below the mnfa_threshold or this timer expires. The MNFA mode deactivates on the first occurrence of either case. Upon deactivation the remaining failed hosts are no longer treated as a failure group but instead are all Gracefully Recovered individually. A value of zero imposes no timeout making the deactivation criteria solely host based. This update changes the default 100 second timer to 0; no-timeout while permitting a valid time range from 100 to 86400 secs or 1 day. DocImpact Story: 2003576 Task: 24903 Change-Id: I2fb737a4cd3c235845b064449949fcada303d6b2 Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>	2018-08-31 10:43:25 -04:00
Tao Liu	5421df7098	Decouple Fault Management from stx-config List of changes: 1.Remove all fault management (FM) database tables from sysinv DB 2.Remove all FM commands from sysinv REST API service 3.Remove all FM CLI commands from cgts client 4.Add FM user to config controller to support region config 5.Update backup restore to reference the new alarm database table 6.Update controller config test files and add the new FM user 7.Add a FM puppet module in order to manage configuration data and database; to configure user, service and endpoint in Keystone 8.Add a FM puppet operator to populate FM and SNMP configuration data 9.Update NFV puppet to support FM endpoint configuration 10.Update haproxy manifest to support active-active FM API service Story: 2002828 Task: 22747 Change-Id: I96d22a18d5872c2e5398f2e9e26a7056fe9b4e82 Signed-off-by: Tao Liu <tao.liu@windriver.com>	2018-08-16 17:24:19 -04:00
Kevin Smith	38be5431c4	Change permission and ownership on dcorch files Change puppet to set appropriate dcorch ownerships and privileges for /etc/dcorch/api-paste.ini and /etc/dcorch/dcorch.conf Story: 2002992 Task: 23006 Change-Id: I5be797de8bb9d8a7e73b9b7888e155f9f103e7fd Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>	2018-08-13 16:59:47 -04:00


57 Commits
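
The Multi-Node Failure Avoidance (MNFA) rules described in the message of commit f19dd0498f can be sketched roughly as follows. This is an illustrative Python model only, not the maintenance (mtce) implementation: the class and function names are hypothetical, and the activation test assumes "at least mnfa_threshold hosts failing" is what the commit message means by the threshold. The parameter names, defaults, and valid ranges are taken from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MnfaSettings:
    """Hypothetical container for the two service parameters named in the commit."""
    mnfa_threshold: int = 2   # default after the commit; configurable range 2..100
    mnfa_timeout: int = 0     # seconds; 0 means no timeout, otherwise 100..86400

    def validate(self) -> None:
        # Ranges as stated in the commit message.
        if not 2 <= self.mnfa_threshold <= 100:
            raise ValueError("mnfa_threshold must be between 2 and 100")
        if self.mnfa_timeout != 0 and not 100 <= self.mnfa_timeout <= 86400:
            raise ValueError("mnfa_timeout must be 0 or between 100 and 86400")


def mnfa_should_activate(failing_hosts: int, settings: MnfaSettings) -> bool:
    """Assumed rule: MNFA mode starts once the failing group reaches the threshold."""
    return failing_hosts >= settings.mnfa_threshold


def mnfa_should_deactivate(failing_hosts: int, seconds_active: int,
                           settings: MnfaSettings) -> bool:
    """MNFA mode ends on whichever happens first: the number of failing hosts
    drops below the threshold, or the timer expires (timeout 0 disables the timer)."""
    if failing_hosts < settings.mnfa_threshold:
        return True
    if settings.mnfa_timeout > 0 and seconds_active >= settings.mnfa_timeout:
        return True
    return False
```

With the default settings (threshold 2, timeout 0), deactivation in this sketch depends only on the host count, which matches the "solely host based" wording in the commit message.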