40 Commits

Author SHA1 Message Date
Erich Cordoba 54a16057ff Remove version from sm-db folder
The sm-db component had the 1.0.0 version in the folder name, this
change removes that version and updates the centos_pkg_dirs.

Story: 2006623
Task: 36829

Depends-On: https://review.opendev.org/#/c/685127
Change-Id: Ia6025337529f4f48a89c175bb524548d81bc993f
Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>
2019-09-26 14:08:15 -05:00
Erich Cordoba c8691c93d8 Fix format-truncation warnings in sm
Building sm is not possible on openSUSE because the code produces
format-truncation warnings and the openSUSE build system
enforces the -Werror flag.

The solution is to define the proper string lengths.

  - SM_INTERFACE_NAME_MAX_CHAR was set to IFNAMSIZ.
  - SM_SERVICE_ACTION_PLUGIN_EXIT_CODE_MAX_CHAR was increased to 32.
  - SM_SERVICE_HEARTBEAT_ADDRESS_MAX_CHAR was decreased to 108.

These changes were updated in the database schema as well.

Story: 2006523
Task: 36551

Change-Id: Icce1d912c147fc6caaf06cc93de3cddadbcb0720
Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>
2019-09-11 12:34:54 -05:00
Zuul 2d04c4e428 Merge "dcdbsync for containerized openstack services - SM" 2019-08-12 16:24:09 +00:00
Andy Ning b64d3a384b dcdbsync for containerized openstack services - SM
This update adds the dcdbsync service for containerized openstack
services to SM. Note that this second dcdbsync instance also
runs on the platform (not containerized).

Story: 2004766
Task: 36099
Change-Id: If406127d26d6230771c0d44105da3a08facf3277
Signed-off-by: Andy Ning <andy.ning@windriver.com>
2019-08-06 16:36:01 -04:00
Kristine Bujold 4ef138fcf1 Collapse the glance filesystem into platform
The filesystem /opt/cgcs is removed and the “helm_charts” and “keystone”
folders now reside under /opt/platform.

  ls /opt/platform/
  armada  config  helm  nfv  puppet  sysinv

  ls /opt/cgcs/
  helm_charts  keystone

Resources related to cgcs-drbd and /opt/cgcs are removed from puppet.
SMS is no longer monitoring these resources.

Tested in AIO-SX, AIO-DX and Standard hardware labs.

Depends-On: https://review.opendev.org/674360
Partial-Bug: 1830142

Change-Id: I4be7a877efb89bb9e5c2b067bdc7e4259f2b0c0c
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
2019-08-02 13:35:54 -04:00
Stefan Dinescu 95367fd675 Increase SM timeout for ceph-mon
Note: this only affects AIO-DX setups as that is the only kind
      of setup where ceph-mon is managed by SM

In some edge-cases, during a swact, ceph-mon may take too long
to be stopped on the active controller resulting in a failed
swact.

This change increases the timeout to account for those
edge cases.

Change-Id: I3ace73650e4fe9aafc84c82e2ffe048f2039305e
Partial-bug: 1836075
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
2019-07-25 14:59:03 +03:00
Zuul 026d5fd730 Merge "Add mtc-agent service dependency to fm-mgr" 2019-07-18 16:34:39 +00:00
Bin Qian a729bbabc6 Add mtc-agent service dependency to fm-mgr
Add mtc-agent service dependency to fm-mgr to ensure mtc-agent shuts
down before fm-mgr does.

An issue was found where, in rare cases, a swact occurred while
mtc-agent was trying to clear an alarm after fm-mgr had already been
disabled, so the clear-alarm message was lost. The alarm therefore
could never be cleared.

Closes-Bug: 1829289

Change-Id: I39196d5f3ce764a14b4d1e0fb1a4f3344ddd6a1a
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2019-07-12 13:00:13 -04:00
Mingyuan Qi 85b0ec621b Add floating ip for ironic network
This commit adds ironic-ip service to sm_db for ironic floating ip.

Story: 2004760
Task: 35689

Change-Id: I45039427cc5c96fd0639cf086d7e431244c4e1d9
Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>
2019-07-09 10:08:55 +08:00
Zuul 0259011983 Merge "Keystone DB sync - add sync service into sm db" 2019-05-07 20:25:07 +00:00
Andy Ning 438b5d25da Keystone DB sync - add sync service into sm db
This update added dcorch dbsync service into sm db so that it can be
managed by SM.

Story: 2002842
Task: 22787

Signed-off-by: Andy Ning <andy.ning@windriver.com>
(cherry picked from commit db74249195)

Conflicts:
	service-mgmt/sm-db-1.0.0/database/create_sm_db.sql

Depends-On: https://review.opendev.org/#/c/655768
Change-Id: I2402ac4f09df6a4ac123c3f8e521d590e5745c23
2019-04-30 14:37:41 -04:00
dbadea d4fca7b9c1 sm-db: make ceph-radosgw failures minor instead of critical
Make ceph-radosgw failures less critical so the system has
a chance to bring up the Ceph cluster.

Story: 2003605
Task: 28860

Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
Depends-On: I1a04f2d33313e77395f269db56484c1e55d9c3bb
Change-Id: Iecc56c4e11a2e1e164b18eed1ffb0d43099d6081
2019-04-25 18:45:58 +00:00
Changcheng Liu 941c60519d sm-db: replace ceph-rest-api with mgr-restful-plugin
The source code of ceph-rest-api has been removed from v13.2.2
and is implemented in v13.2.2/ceph-mgr/restful plugin. The
restful plugin will be enabled in mgr-restful-plugin service.

Story: 2003605
Task: 28860

Co-Authored-By: Yong Hu <yong.hu@intel.com>
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
Depends-On: Ibfbecf0a8beb38009b9d7192ca9455a841402040
Depends-On: I31fb9aac89c44bbce24939197446caa987d395cb
Change-Id: I1a04f2d33313e77395f269db56484c1e55d9c3bb
2019-04-24 15:14:48 +00:00
Zuul 6db625274e Merge "SM monitoring for cluster-host-ip service" 2019-04-11 14:56:35 +00:00
Teresa Ho 5232bdd8fd SM monitoring for cluster-host-ip service
Added a service domain for the cluster host interface and a service
for the cluster host IP to the SM database.
Removed references to the infrastructure interface.

Story: 2004273
Task: 29474

Change-Id: I6223047e9453eba83ea8b4ecf4db739d0f7d7665
Signed-off-by: Teresa Ho <teresa.ho@windriver.com>
2019-04-11 07:43:06 -04:00
Chris Friesen 0e0de25743 Add SM DB entries for monitor for containerized mariadb
For the AIO-DX configuration we have a problem where we can't run with
a single mariadb replica and we can't run with two mariadb replicas and
a garbd daemon like we do on the full configuration.  (See the commit
message for the Depends-On commit for more info.)

Accordingly, add entries for the database monitor into the SM database.

Change-Id: Ibd28fb99eebb8b72a205d2626e49c4e2db4f01dd
Story: 2004712
Task: 30410
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Depends-On: https://review.openstack.org/650288
2019-04-09 16:41:41 -04:00
Al Bailey 50b1e46310 Remove openstack services from SM
Removing the following services
  aodh-api
  aodh-evaluator
  aodh-listener
  aodh-notifier
  ceilometer-agent-notification
  cinder-api
  cinder-backup
  cinder-scheduler
  cinder-volume
  glance-api
  glance-registry
  gnocchi-api
  gnocchi-metricd
  heat-api
  heat-api-cfn
  heat-api-cloudwatch
  heat-engine
  ironic-api
  ironic-conductor
  ironic-tftp-ip
  magnum-api
  magnum-conductor
  murano-api
  murano-engine
  murano-rabbit
  neutron-server
  nova-api
  nova-api-proxy
  nova-compute
  nova-conductor
  nova-console-auth
  nova-novnc
  nova-placement-api
  nova-scheduler
  nova-serialproxy
  panko-api

The integer indices have not been reset.

SM seems to work even with the gaps in the index counters.
The counter values should be cleaned up in a later
review.

Story: 2004764
Task: 30022
Change-Id: Ie4cb66cbcfb5bea06f91e782692195a0280f7176
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
2019-03-20 11:57:30 -05:00
Zuul ac2ec50d86 Merge "ceph-mon failed on active controller after a failed swact" 2019-03-18 12:43:56 +00:00
Elena Taivan 6c52d2b276 ceph-mon failed on active controller after a failed swact
The swact fails because ceph-radosgw is sometimes enabled before
ceph-mon has started.
Since ceph-mon is not running, ceph-radosgw is then disabled.
The fix adds dependencies so that
ceph-radosgw starts after ceph-mon and stops before ceph-mon.
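
The start/stop ordering described above can be illustrated with a small
Python sketch. This is not how SM stores or evaluates dependencies; the
DEPENDS_ON map and both helper functions are hypothetical names used only
for illustration, and only the ceph-radosgw/ceph-mon relationship comes
from this commit message.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each service lists the services it requires.
# Only the ceph-radosgw -> ceph-mon edge is taken from the commit message.
DEPENDS_ON = {
    "ceph-radosgw": {"ceph-mon"},
    "ceph-mon": set(),
}


def start_order(depends_on):
    """Return services so that each dependency starts before its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Stop in reverse: dependents stop before the services they rely on."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("start:", start_order(DEPENDS_ON))
    print("stop: ", stop_order(DEPENDS_ON))
```

With a single dependency edge the ordering is unambiguous: ceph-mon is
started first and stopped last.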

Closes-Bug: 1818095
Change-Id: I4cd600422ff3cb2edc1e9f21e9b827e99731af37
Signed-off-by: Elena Taivan <elena.taivan@windriver.com>
2019-03-14 10:30:25 +00:00
Jerry Sun f83818d2bf Add Docker Registry Token Server SM DB Entries
As part of setting up Keystone authentication with Docker registry, a
token server needs to be deployed and managed by SM. This commit adds
the entries for the Docker registry token server into the SM database.

Story: 2002840
Task: 22783
Depends-On: https://review.openstack.org/#/c/641358/

Change-Id: I8aee0bf04ea8180cec333cc68b5b55db04b776c1
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
2019-03-06 08:53:39 -05:00
Angie Wang 27113df5e8 Helm repository replication
This commit adds a new resource helmrepository-fs and its dependency
in SM.

Story: 2004520
Task: 28343
Change-Id: Ia2be180a15b27f08f8fa7a8ac32f8612be8a258d
Signed-off-by: Angie Wang <angie.wang@windriver.com>
2019-01-16 09:29:44 -05:00
zhipengl e406346c98 Refactor patches for rabbitmq-server package
Now that the rabbitmq-server-config package carries the config files
for the rabbitmq-server package, the related patches can be removed.
SM must call our custom OCF file instead of the default one
installed by the rabbitmq-server package.

Story: 2003768
Task: 28042

Change-Id: I12e132831d064daa00f26d44df505870a71c45ef
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
2018-12-18 02:38:50 +00:00
Zuul e4a8db70a9 Merge "CEPH support for 2 node configuration" 2018-12-06 14:50:30 +00:00
Stefan Dinescu 358cece3f4 CEPH support for 2 node configuration
In order to enable Openstack's helm charts on StarlingX we need
a distributed persistent storage for Kubernetes that leverages
our existing configurations.

Changes made:
- add to SM entries for monitoring ceph-mon, ceph-osd
  and drbd-ceph
- the ceph monitor is a floating monitor between the
  two controllers backed by a DRBD replicated filesystem

Change-Id: I8f9ea4798070e08171ad73da39821bc20b7af231
Co-Authored-By: Ovidiu Poncea <ovidiu.poncea@windriver.com>
Story: 2002844
Task: 26878
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
2018-12-04 18:13:05 +02:00
Alex Kozyrev 2ef5e82555 SM management for Barbican in HA environment.
Add Barbican processes to the SM database so they can be managed in HA.
SM will be able to manage the Barbican processes via OCF scripts.
The Barbican processes are: api, worker and keystone-listener.

Change-Id: I1d037b2051dd9febf3d73efe4032b957ab32c187
Story: 2003108
Task: 27700
Depends-On: I6b0b0c90456627bebde2b834b339bc968100b6f9
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>
2018-12-01 14:45:53 -05:00
Zuul d7ba498da9 Merge "Implement Active-Active Heartbeat as HA Improvement" 2018-11-21 16:46:23 +00:00
Eric MacDonald b616bfb83d Implement Active-Active Heartbeat as HA Improvement
This update removes the hbsAgent from the SM database so
that it is no longer an SM managed/monitored process.

Story: 2003576
Task: 24907

Depends-On: https://review.openstack.org/#/c/617835
Change-Id: I12a8776d45c680dfe2fa9b8e2bf7cf3897ca9e1b
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2018-11-14 20:18:39 +00:00
Zuul b1aad31b96 Merge "split-brain avoidance improvement" 2018-11-09 14:48:57 +00:00
Bin Qian 133da10b08 split-brain avoidance improvement
This change enables one-way communication via BMC (if configured)
through mtce when the 2 controllers lose all communication with
each other.
The algorithm is:
- when all communications are lost, both the active and standby
  controllers verify their interfaces (mgmt, infra, and oam)
- if the active controller is healthy, it requests a BMC reset
  through mtce against the standby controller
- if the standby controller is healthy, it activates itself and
  waits a total of 45 seconds before requesting a BMC reset through
  mtce against the active controller
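
The decision logic above can be sketched in Python as follows. This is an
illustration under stated assumptions, not the SM implementation: the
Decision type and decide() function are hypothetical, "healthy" is assumed
to mean that all three interfaces are up, and the behaviour of an unhealthy
controller is not described in this commit message, so the sketch returns
None for that case.

```python
from dataclasses import dataclass
from typing import Dict, Optional

STANDBY_RESET_DELAY_S = 45  # total wait quoted in the commit message
INTERFACES = ("mgmt", "infra", "oam")


@dataclass(frozen=True)
class Decision:
    activate_self: bool       # standby promotes itself to active
    request_peer_reset: bool  # ask mtce for a BMC reset of the peer
    delay_s: int              # seconds to wait before the reset request


def decide(role: str, interfaces_up: Dict[str, bool]) -> Optional[Decision]:
    """Decide what a controller does once all peer communication is lost."""
    # Assumption: a controller is healthy only if every interface is up.
    healthy = all(interfaces_up.get(name, False) for name in INTERFACES)
    if not healthy:
        return None  # not covered by the commit message
    if role == "active":
        return Decision(False, True, 0)
    if role == "standby":
        return Decision(True, True, STANDBY_RESET_DELAY_S)
    raise ValueError(f"unknown role: {role!r}")


if __name__ == "__main__":
    all_up = {name: True for name in INTERFACES}
    print(decide("active", all_up))
    print(decide("standby", all_up))
```

The 45-second delay on the standby side gives a healthy active controller
time to issue its own reset request first.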

Changes also include:
1. adding new initial failover state.
   initial state is a state before the node is enabled
2. remove failover thread.
   using worker thread action to perform time consuming operations
3. remove entire failover action table

Story: 2003577
Task:  24901
Change-Id: I7d294d40e84469df6b6a6f6dd490cf3c4557b711
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-11-08 20:18:43 +00:00
Bin Qian 01e3a2be4b Add new configuration for hbs cluster info
Add defaults for 2 new system configurations to sm.db.
Add configuration code so that the options can be configured.

Story: 2003577
Task: 27727

Change-Id: I1a3dbaec78de591ef0de75ef9cafab48bcbb4dbf
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-11-02 19:27:10 -04:00
Wei Zhou e641605452 Enable upstream Cinder volume backup/restore capabilities
Add cinder-backup as a SM service

Story: 2003115
Task: 26393

Depends-On: I63a6fd3d112a98449ea22524bb2a83b5db8ce6d1
Change-Id: I85b9f791f6ce80a150909ff46d134e4b455a220e
Signed-off-by: Wei Zhou <wei.zhou@windriver.com>
2018-09-21 13:01:37 +00:00
Bin Qian e7d2713327 Unexpected swact after rebooting standby controller
In an All-in-one duplex system, the 2 controllers sometimes need longer
than expected (2500ms) to sync the status of the service groups.
The fix here is to extend the wait interval to 5000ms in order to give
the controllers enough time to finish the service group status
exchange.
The extra wait time affects only the reboot of the first controller.
In the normal scenario, when the 2nd controller boots up, service
scheduling starts immediately after the service group status
exchange completes.
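
A bounded wait of this kind can be sketched in Python as follows. This is
only an illustration of the idea, not the SM code: wait_for_status_exchange
and the threading.Event used to signal completion are hypothetical, and
only the 5000ms value comes from this commit message.

```python
import threading

SYNC_WAIT_MS = 5000  # extended from 2500ms per the commit message


def wait_for_status_exchange(done: threading.Event,
                             timeout_ms: int = SYNC_WAIT_MS) -> bool:
    """Wait until the status exchange completes or the interval expires.

    Returns True if the exchange completed in time. The call returns as
    soon as the event is set, so the full interval is only spent when the
    peer never completes the exchange.
    """
    return done.wait(timeout_ms / 1000.0)


if __name__ == "__main__":
    exchange_done = threading.Event()
    exchange_done.set()  # simulate a peer that has already synced
    print(wait_for_status_exchange(exchange_done))
```

Because the wait ends as soon as the exchange is signalled, a longer
interval does not delay scheduling when the peer responds promptly.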

Story: 2003287
Task: 24230

Change-Id: Ibc49df8278ec0a0b319691ae9e82e9318a35a2f6
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-08-08 17:06:45 -04:00
Angie Wang b7d84691a0 Manage gnocchi services in SM
Remove ceilometer services and add gnocchi services in SM.

Change-Id: I5cc7b5c4782c557196a8360456540eb907362c7c
Signed-off-by: Don Penney <don.penney@windriver.com>
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-08-02 01:46:59 +00:00
Jerry Sun b8b80494ee Make SM manage docker local registry.
This is part of Local Docker Registry HA Integration.

Story: 2002840
Task: 22782

Change-Id: I26edb7705243a9ccb54253d941912756424a0928
Signed-off-by: Don Penney <don.penney@windriver.com>
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-08-02 01:46:38 +00:00
Shoaib Nasir 98b9028922 Configure SM service for drbd synch'd filesystem for ETCD
Story: 2002841
Task: 22784
Depends-On: https://review.openstack.org/587427

Change-Id: Id458e8abaf224605261af23508037a649654e775
Signed-off-by: Don Penney <don.penney@windriver.com>
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-07-31 21:35:55 +00:00
Shoaib Nasir ab4617b922 Configure SM to manage ETCD service
Story: 2002841
Task: 22784

Change-Id: I40863a7e2e8e68d232b3e0c69414587bd0b1de9d
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-07-31 09:27:01 -04:00
Kam Nasim 032fb05039 Configure DC Orch Identity Proxy SM service
This is part of Distributed Keystone for Distributed Cloud feature.

Story: 2002842
Task: 22785

Change-Id: I0d0e60cc7482a451ba231066b4a470166238a3a7
Signed-off-by: Jack Ding <jack.ding@windriver.com>
2018-07-31 09:26:34 -04:00
Bin Qian e450db9ac1 Extend glance-registry timeout
The glance-registry timeout is insufficient, often resulting in swact
failure. Extend the glance-registry enable/enable-audit action timeout
to 60 seconds.

Change-Id: Ie951ca413b3829807b6453e9ab96538417bd65ec
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
2018-06-28 11:17:56 -04:00
Bin Qian 1cb543c690 Recreate sm db script for fresh new release
Change-Id: I672869e22df8ee9f974d8c9be3586e31663ea3e2
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
2018-06-27 13:44:58 -04:00
Dean Troyer 17c909ec83 StarlingX open source release updates
Signed-off-by: Dean Troyer <dtroyer@gmail.com>
2018-05-31 07:36:26 -07:00