The sm-db component had the 1.0.0 version in its folder name. This
change removes that version and updates centos_pkg_dirs accordingly.
Story: 2006623
Task: 36829
Depends-On: https://review.opendev.org/#/c/685127
Change-Id: Ia6025337529f4f48a89c175bb524548d81bc993f
Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>
Building sm is not possible on openSUSE because the code produces
format-truncation warnings and the openSUSE build system enforces
the -Werror flag.
The solution is to define the proper string lengths.
- SM_INTERFACE_NAME_MAX_CHAR is now set to IFNAMSIZ.
- SM_SERVICE_ACTION_PLUGIN_EXIT_CODE_MAX_CHAR is increased to 32.
- SM_SERVICE_HEARTBEAT_ADDRESS_MAX_CHAR is decreased to 108.
The database schema was updated to match these lengths.
Story: 2006523
Task: 36551
Change-Id: Icce1d912c147fc6caaf06cc93de3cddadbcb0720
Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>
This update adds a dcdbsync service for containerized OpenStack
services to SM. Note that this second dcdbsync instance also runs
on the platform (not containerized).
Story: 2004766
Task: 36099
Change-Id: If406127d26d6230771c0d44105da3a08facf3277
Signed-off-by: Andy Ning <andy.ning@windriver.com>
The filesystem /opt/cgcs is removed and the "helm_charts" and "keystone"
folders now reside under /opt/platform.
ls /opt/platform/
armada config helm nfv puppet sysinv
ls /opt/cgcs/
helm_charts keystone
Resources related to cgcs-drbd and /opt/cgcs are removed from puppet.
SM no longer monitors these resources.
Tested in AIO-SX, AIO-DX and Standard hardware labs.
Depends-On: https://review.opendev.org/674360
Partial-Bug: 1830142
Change-Id: I4be7a877efb89bb9e5c2b067bdc7e4259f2b0c0c
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
Note: this only affects AIO-DX setups, as that is the only kind
of setup where ceph-mon is managed by SM.
In some edge cases, during a swact, ceph-mon may take too long
to stop on the active controller, resulting in a failed
swact.
This change increases the timeout to account for those
edge cases.
Change-Id: I3ace73650e4fe9aafc84c82e2ffe048f2039305e
Partial-bug: 1836075
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
Add mtc-agent service dependency to fm-mgr to ensure mtc-agent shuts
down before fm-mgr does.
In rare cases a swact occurs while mtc-agent is trying to clear an
alarm after fm-mgr has already been disabled, so the clear-alarm
message is lost and the alarm can never be cleared.
Closes-Bug: 1829289
Change-Id: I39196d5f3ce764a14b4d1e0fb1a4f3344ddd6a1a
Signed-off-by: Bin Qian <bin.qian@windriver.com>
This update adds the dcorch dbsync service to the SM database so that
it can be managed by SM.
Story: 2002842
Task: 22787
Signed-off-by: Andy Ning <andy.ning@windriver.com>
(cherry picked from commit db74249195)
Conflicts:
service-mgmt/sm-db-1.0.0/database/create_sm_db.sql
Depends-On: https://review.opendev.org/#/c/655768
Change-Id: I2402ac4f09df6a4ac123c3f8e521d590e5745c23
Make ceph-radosgw failures less critical so the system has
a chance to bring up the Ceph cluster.
Story: 2003605
Task: 28860
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
Depends-On: I1a04f2d33313e77395f269db56484c1e55d9c3bb
Change-Id: Iecc56c4e11a2e1e164b18eed1ffb0d43099d6081
The source code of ceph-rest-api has been removed from v13.2.2
and its functionality is implemented in the v13.2.2 ceph-mgr restful
plugin. The restful plugin will be enabled in the mgr-restful-plugin
service.
Story: 2003605
Task: 28860
Co-Authored-By: Yong Hu <yong.hu@intel.com>
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
Depends-On: Ibfbecf0a8beb38009b9d7192ca9455a841402040
Depends-On: I31fb9aac89c44bbce24939197446caa987d395cb
Change-Id: I1a04f2d33313e77395f269db56484c1e55d9c3bb
Added a service domain for the cluster host interface and a service
for the cluster host IP to the SM database.
Removed references to the infrastructure interface.
Story: 2004273
Task: 29474
Change-Id: I6223047e9453eba83ea8b4ecf4db739d0f7d7665
Signed-off-by: Teresa Ho <teresa.ho@windriver.com>
For the AIO-DX configuration we have a problem where we can't run with
a single mariadb replica and we can't run with two mariadb replicas and
a garbd daemon like we do on the full configuration. (See the commit
message for the Depends-On commit for more info.)
Accordingly, add entries for the database monitor into the SM database.
Change-Id: Ibd28fb99eebb8b72a205d2626e49c4e2db4f01dd
Story: 2004712
Task: 30410
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Depends-On: https://review.openstack.org/650288
Remove the following services:
aodh-api
aodh-evaluator
aodh-listener
aodh-notifier
ceilometer-agent-notification
cinder-api
cinder-backup
cinder-scheduler
cinder-volume
glance-api
glance-registry
gnocchi-api
gnocchi-metricd
heat-api
heat-api-cfn
heat-api-cloudwatch
heat-engine
ironic-api
ironic-conductor
ironic-tftp-ip
magnum-api
magnum-conductor
murano-api
murano-engine
murano-rabbit
neutron-server
nova-api
nova-api-proxy
nova-compute
nova-conductor
nova-console-auth
nova-novnc
nova-placement-api
nova-scheduler
nova-serialproxy
panko-api
The integer indices have not been reset.
SM seems to work even with the gaps in the index counters.
The counter values should be cleaned up in a later
review.
Story: 2004764
Task: 30022
Change-Id: Ie4cb66cbcfb5bea06f91e782692195a0280f7176
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
Swact fails because ceph-radosgw is sometimes enabled before
ceph-mon has started.
Since ceph-mon is not running, ceph-radosgw is then disabled.
The solution is to add dependencies so that ceph-radosgw starts
after ceph-mon and stops before ceph-mon.
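The ordering rule above can be illustrated with a minimal sketch. It is
not SM's implementation: the dependency map and the helper names
start_order and stop_order are hypothetical, and only the
ceph-radosgw/ceph-mon relationship is taken from this change.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: service -> services it must start after.
# Only the ceph-radosgw -> ceph-mon edge comes from the commit message.
START_AFTER = {
    "ceph-mon": set(),
    "ceph-radosgw": {"ceph-mon"},
}


def start_order(deps):
    """Return services so that every dependency starts before its dependents."""
    return list(TopologicalSorter(deps).static_order())


def stop_order(deps):
    """Return services so that dependents stop before their dependencies."""
    return list(reversed(start_order(deps)))


if __name__ == "__main__":
    print("start:", start_order(START_AFTER))
    print("stop: ", stop_order(START_AFTER))
```

Stopping in the reverse of the start order is what keeps ceph-radosgw
from running while ceph-mon is already down.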
Closes-Bug: 1818095
Change-Id: I4cd600422ff3cb2edc1e9f21e9b827e99731af37
Signed-off-by: Elena Taivan <elena.taivan@windriver.com>
As part of setting up Keystone authentication with Docker registry, a
token server needs to be deployed and managed by SM. This commit adds
the entries for the Docker registry token server into the SM database.
Story: 2002840
Task: 22783
Depends-On: https://review.openstack.org/#/c/641358/
Change-Id: I8aee0bf04ea8180cec333cc68b5b55db04b776c1
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
This commit adds a new resource helmrepository-fs and its dependency
in SM.
Story: 2004520
Task: 28343
Change-Id: Ia2be180a15b27f08f8fa7a8ac32f8612be8a258d
Signed-off-by: Angie Wang <angie.wang@windriver.com>
Now that the rabbitmq-server-config package carries the config files
for the rabbitmq-server package, the related patches can be removed.
SM must call our custom OCF file instead of the default one
installed by the rabbitmq-server package.
Story: 2003768
Task: 28042
Change-Id: I12e132831d064daa00f26d44df505870a71c45ef
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
In order to enable OpenStack's Helm charts on StarlingX we need
distributed persistent storage for Kubernetes that leverages
our existing configurations.
Changes made:
- add entries to SM for monitoring ceph-mon, ceph-osd
  and drbd-ceph
- the ceph monitor is a floating monitor between the
  two controllers, backed by a DRBD replicated filesystem
Change-Id: I8f9ea4798070e08171ad73da39821bc20b7af231
Co-Authored-By: Ovidiu Poncea <ovidiu.poncea@windriver.com>
Story: 2002844
Task: 26878
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
Add Barbican processes to the SM database so they are managed in HA.
SM will be able to manage Barbican processes via OCF scripts.
The Barbican processes are: api, worker and keystone-listener.
Change-Id: I1d037b2051dd9febf3d73efe4032b957ab32c187
Story: 2003108
Task: 27700
Depends-On: I6b0b0c90456627bebde2b834b339bc968100b6f9
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>
This update removes the hbsAgent from the SM database so
that it is no longer an SM managed/monitored process.
Story: 2003576
Task: 24907
Depends-On: https://review.openstack.org/#/c/617835
Change-Id: I12a8776d45c680dfe2fa9b8e2bf7cf3897ca9e1b
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
This change enables one-way communication via BMC (if configured)
through mtce when the 2 controllers have lost all communication with
each other.
The algorithm is:
when all communication is lost, both the active and the standby
controller verify their own interfaces (mgmt, infra, and oam).
- if the active controller is healthy, it requests a BMC reset of
  the standby controller through mtce.
- if the standby controller is healthy, it activates itself and waits
  a total of 45 seconds before requesting a BMC reset of the active
  controller through mtce.
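The decision described above can be sketched as follows. This is an
illustration only, not the SM code: the Action type, the decide
function and the role strings are hypothetical, and the behaviour of an
unhealthy controller (taking no action) is an assumption, since this
message does not describe it.

```python
from dataclasses import dataclass

# Total wait before the standby requests a reset, as stated in the message.
STANDBY_RESET_DELAY_SECONDS = 45


@dataclass(frozen=True)
class Action:
    activate_self: bool           # controller promotes itself to active
    request_peer_bmc_reset: bool  # ask mtce to BMC-reset the peer
    delay_seconds: int            # wait before requesting the reset


def decide(role: str, interfaces_healthy: bool) -> Action:
    """Pick an action once all communication with the peer is lost."""
    if role not in ("active", "standby"):
        raise ValueError(f"unknown role: {role}")
    if not interfaces_healthy:
        # Assumption: an unhealthy controller does nothing on its own.
        return Action(False, False, 0)
    if role == "active":
        # Healthy active controller resets the standby right away.
        return Action(False, True, 0)
    # Healthy standby activates itself, then resets the active after the delay.
    return Action(True, True, STANDBY_RESET_DELAY_SECONDS)
```

The delay on the standby side gives a healthy active controller the
first chance to reset its peer, so the two controllers do not reset
each other at the same moment.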
Changes also include:
1. add a new initial failover state.
   The initial state is the state before the node is enabled.
2. remove the failover thread.
   Worker thread actions now perform the time-consuming operations.
3. remove the entire failover action table.
Story: 2003577
Task: 24901
Change-Id: I7d294d40e84469df6b6a6f6dd490cf3c4557b711
Signed-off-by: Bin Qian <bin.qian@windriver.com>
Add defaults for 2 new system configuration options to sm.db.
Add configuration code so that the options can be configured.
Story: 2003577
Task: 27727
Change-Id: I1a3dbaec78de591ef0de75ef9cafab48bcbb4dbf
Signed-off-by: Bin Qian <bin.qian@windriver.com>
Add cinder-backup as an SM service.
Story: 2003115
Task: 26393
Depends-On: I63a6fd3d112a98449ea22524bb2a83b5db8ce6d1
Change-Id: I85b9f791f6ce80a150909ff46d134e4b455a220e
Signed-off-by: Wei Zhou <wei.zhou@windriver.com>
In an All-in-one duplex system, the 2 controllers sometimes need longer
than the expected 2500ms to sync the status of the service groups.
The fix is to extend the wait interval to 5000ms to give the
controllers enough time to finish the service group status
exchange.
The extra wait time impacts only the 1st controller reboot.
In a normal scenario, when the 2nd controller boots up, service
scheduling starts immediately after the service group status
exchange completes.
Story: 2003287
Task: 24230
Change-Id: Ibc49df8278ec0a0b319691ae9e82e9318a35a2f6
Signed-off-by: Bin Qian <bin.qian@windriver.com>
Remove ceilometer services and add gnocchi services in SM.
Change-Id: I5cc7b5c4782c557196a8360456540eb907362c7c
Signed-off-by: Don Penney <don.penney@windriver.com>
Signed-off-by: Jack Ding <jack.ding@windriver.com>
This is part of Local Docker Registry HA Integration.
Story: 2002840
Task: 22782
Change-Id: I26edb7705243a9ccb54253d941912756424a0928
Signed-off-by: Don Penney <don.penney@windriver.com>
Signed-off-by: Jack Ding <jack.ding@windriver.com>
This is part of Distributed Keystone for Distributed Cloud feature.
Story: 2002842
Task: 22785
Change-Id: I0d0e60cc7482a451ba231066b4a470166238a3a7
Signed-off-by: Jack Ding <jack.ding@windriver.com>
The glance-registry timeout is insufficient, often resulting in swact
failure. Extend the glance-registry enable/enable-audit action timeout
to 60 seconds.
Change-Id: Ie951ca413b3829807b6453e9ab96538417bd65ec
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>