config/sysinv/sysinv/sysinv/sysinv
Don Penney c138575062 Ceph initialization on AIO is done only in 'controller' manifests
On AIO deployments, puppet is run twice with two different manifests:
1. 'controller': to configure controller services
2. 'worker': to configure worker services

Ceph is configured when the 'controller' manifests are applied, so
there is no need to configure it a second time when the 'worker' set
is applied.

This commit adds new puppet classes to encapsulate ceph configuration
based on node personality and adds a check so that it is not applied
a second time on controllers.
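
The check described above can be pictured with a minimal Python
sketch. It is not the puppet code from this commit: the function name
should_apply_ceph and the rule for non-controller nodes are assumptions
made only for illustration.

```python
def should_apply_ceph(manifest, personality):
    """Decide whether a puppet run should configure ceph (hypothetical helper).

    manifest:    the manifest set being applied, e.g. 'controller' or 'worker'
    personality: the node personality, e.g. 'controller' or 'worker'
    """
    if personality == "controller":
        # On controllers (including AIO) only the 'controller' run configures
        # ceph; the later 'worker' run must leave it alone.
        return manifest == "controller"
    # Assumption: other nodes configure ceph in the run matching their personality.
    return manifest == personality


def count_ceph_runs(manifests, personality):
    """Count how many of the given puppet runs would configure ceph."""
    return sum(should_apply_ceph(m, personality) for m in manifests)
```

On an AIO controller, count_ceph_runs(['controller', 'worker'],
'controller') gives 1, which is the behaviour the commit is after.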

If the ceph manifests are executed a second time, we hit a race
condition between SM's process monitoring and the 'worker' puppet
manifests, which trigger a restart of ceph-mon as part of
reconfiguration.

After a reboot on AIO, SM takes control of ceph-mon monitoring
after the 'controller' puppet manifests finish applying. As part of
this, SM monitors process death notifications and gets the pid from
the .pid file. It also periodically executes '/etc/init.d/ceph status
mon.controller' for more advanced monitoring.
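
As a rough illustration of pid-file based monitoring, the sketch below
reads a pid from a file and probes whether that process still exists.
It is not SM's implementation; the helper names read_pid and
process_alive are hypothetical, and the probe assumes a POSIX system.

```python
import os


def read_pid(pid_file):
    """Return the pid stored in pid_file, or None if it is missing or invalid."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def process_alive(pid):
    """POSIX-only probe: signal 0 checks for existence without signalling."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists but belongs to another user
    return True
```

A monitor built this way depends entirely on the pid file being
accurate: once the file is removed, a live process can no longer be
found through it.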

When the 'worker' manifests are executed, they trigger a restart
of ceph-mon through '/etc/init.d/ceph restart', which has two steps:
'stop', in which ceph-mon is stopped, and 'start', in which it is
started again.

In the first step, stopping ceph-mon leads to the death of the ceph-mon
process and the removal of its pid file. This is promptly detected by
SM, which immediately triggers a start of ceph-mon that creates a
new pid file. The problem is that ceph-mon was already in the middle
of a restart, and at the end of the 'stop' step the init script cleans
up the new pid file instead of the old one.
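
The interleaving above can be replayed with a small deterministic toy
model. This is only a sketch of the sequence of events, not the init
script or SM; the names MonHost, init_script_stop, sm_recover and
untracked_pids are hypothetical, and only the 'stop' step is modelled.

```python
class MonHost:
    """Toy host state: live ceph-mon pids plus the pid file contents."""

    def __init__(self):
        self.live_pids = set()
        self.pid_file = None  # None means the pid file does not exist
        self._next_pid = 100

    def start_mon(self):
        pid = self._next_pid
        self._next_pid += 1
        self.live_pids.add(pid)
        self.pid_file = pid  # the new process writes its pid file
        return pid


def init_script_stop(host, on_process_death=None):
    """Model the 'stop' step of a restart."""
    old_pid = host.pid_file
    host.live_pids.discard(old_pid)  # ceph-mon dies ...
    host.pid_file = None             # ... and its pid file is removed
    if on_process_death is not None:
        on_process_death(host)       # SM reacts before 'stop' has finished
    host.pid_file = None             # final cleanup removes whatever file exists now


def sm_recover(host):
    """Model SM starting ceph-mon as soon as it sees the process die."""
    host.start_mon()


def untracked_pids(host):
    """Live processes that the pid file does not point at."""
    return {p for p in host.live_pids if p != host.pid_file}
```

Starting a monitor with start_mon() and then calling
init_script_stop(host, on_process_death=sm_recover) leaves one live
process with no pid file in this model. Without the SM callback the
same 'stop' leaves nothing running.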

This leads to controllers swacting a couple of times before the system
gets rid of the rogue process.

Change-Id: I2a0df3bab716a553e71e322e1515bee2bb2f700d
Co-authored-by: Ovidiu Poncea <ovidiu.poncea@windriver.com>
Story: 2002844
Task: 29214
Signed-off-by: Ovidiu Poncea <ovidiu.poncea@windriver.com>
2019-02-10 21:22:41 +02:00
agent Remove nova storage aggregates 2019-01-25 15:38:43 -05:00
api Configurable Host HTTP/HTTPS Port Binding 2019-02-06 12:47:00 -06:00
cluster Fix: "import" issue for Python 2/3 compatible code 2018-12-25 08:58:03 +08:00
cmd Fix: others issues for Python 2/3 compatible code 2018-12-19 10:20:56 +08:00
common Merge "Configurable Host HTTP/HTTPS Port Binding" 2019-02-07 19:05:52 +00:00
conductor Configurable Host HTTP/HTTPS Port Binding 2019-02-06 12:47:00 -06:00
db Enable python3.5 sysinv unit test 2019-01-30 08:51:07 +08:00
helm Merge "Move nova static configs to Armada manifest" 2019-02-08 21:28:32 +00:00
objects Enable python3.5 sysinv unit test 2019-01-30 08:51:07 +08:00
openstack Enable python3.5 sysinv unit test 2019-01-30 08:51:07 +08:00
puppet Ceph initialization on AIO is done only in 'controller' manifests 2019-02-10 21:22:41 +02:00
tests Fix multi-net interface configuration 2019-02-01 09:21:26 -05:00
__init__.py StarlingX open source release updates 2018-05-31 07:35:52 -07:00
netconf.py StarlingX open source release updates 2018-05-31 07:35:52 -07:00
sanity_coverage.py Sysinv tox updates. Prepare for bandit reports and test reports 2018-06-29 13:25:09 -04:00
version.py StarlingX open source release updates 2018-05-31 07:35:52 -07:00