..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

===================================
Ceph upgrade from Mimic to Nautilus
===================================

Storyboard:
https://storyboard.openstack.org/#!/story/2009074

This story covers the upgrade of Ceph from Mimic to Nautilus. The upgrade also includes code and
configuration changes in StarlingX components that are needed to support Nautilus.

Official instructions on how to upgrade from Mimic to Nautilus can be found in [1]_.

Problem description
===================

Mimic reached end of life on 2020-07-22. We need to move to an active release that supports
automated migration between versions, that is, a release in which starting the MON, OSD, and MDS
daemons migrates the service data from the Mimic formats to the new formats when required.

This requires evaluating the historic HA reliability code and removing or retiring what is no
longer needed, enabling new or default supported features (BlueStore, systemd service files, and
ceph-volume instead of ceph-disk for OSD deployment), and making future upgrades easier.

Use Cases
---------

Users should have access to the same storage features they have today, without noticing a
difference between Ceph versions.

Proposed change
===============

First, we should focus on building a StarlingX ISO that includes Ceph Nautilus.

Other choices such as Octopus or Pacific are ruled out because we want to align with the release
currently supported by Debian Bullseye, which is Nautilus. In addition, Pacific only supports
upgrades from Octopus or Nautilus [2]_.

Once the image is built, we can evaluate the downstream changes made to Ceph Mimic and port those
that are still needed to the downstream Ceph Nautilus. Next, we should be able to install an
AIO-SX system successfully with Ceph Nautilus built in.

The integration of the Ceph subsystems with the new Ceph version needs to be checked. The
subsystems are:

* ceph-manager

* python-cephclient

* mgr-restful-plugin

* puppet

* ansible playbooks

New features enablement
-----------------------

With an ISO available, we can verify the enablement of new features such as:

* Switch OSDs from FileStore to BlueStore

  * BlueStore is the default technology for OSDs (starting in Luminous) and improves performance
    over the previous FileStore technology. More details can be found in [3]_.

* Switch from sysvinit services/HA scripts to systemd/HA driven services

  * Upstream Ceph uses systemd to control Ceph process initialization. This was disabled
    downstream to keep the historical (Ceph Hammer and Ceph Jewel) sysvinit script optimizations.

* Migrate OSD deployment to ceph-volume because ceph-disk is deprecated
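
As an illustration of how the FileStore to BlueStore switch could be tracked, the sketch below
counts OSDs per object store backend from JSON shaped like the output of ``ceph osd metadata``,
which reports an ``osd_objectstore`` field for each OSD. The sample data, host names, and the
``objectstore_inventory`` helper are hypothetical and not part of any StarlingX component.

.. code-block:: python

    import json
    from collections import Counter

    # Hypothetical sample shaped like the JSON printed by `ceph osd metadata`.
    # Real output carries many more fields per OSD.
    SAMPLE_OSD_METADATA = """
    [
      {"id": 0, "hostname": "storage-0", "osd_objectstore": "filestore"},
      {"id": 1, "hostname": "storage-0", "osd_objectstore": "bluestore"},
      {"id": 2, "hostname": "storage-1", "osd_objectstore": "filestore"}
    ]
    """


    def objectstore_inventory(metadata_json):
        """Count OSDs per backend and list the OSD ids still on FileStore."""
        osds = json.loads(metadata_json)
        counts = Counter(osd.get("osd_objectstore", "unknown") for osd in osds)
        pending = sorted(
            osd["id"] for osd in osds if osd.get("osd_objectstore") == "filestore"
        )
        return dict(counts), pending


    counts, pending = objectstore_inventory(SAMPLE_OSD_METADATA)
    print("OSDs per backend:", counts)
    print("OSDs still on FileStore:", pending)

A report like this also shows at a glance whether both backends are present in the same cluster
at a given point in time.
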

Investigate differences between Ceph versions
---------------------------------------------

Some commands may have changed or been deprecated between Mimic and Nautilus. We need to identify
those commands, understand the differences, and assess their impact on the overall system.
The following projects/modules may be affected:

* config:

  * sysinv/cgts-client
  * sysinv/sysinv

* stx-puppet:

  * puppet-manifests/src/modules/platform/manifests/ceph.pp

* utilities:

  * ceph/ceph-manager
  * ceph/python-cephclient

* integ:

  * ceph
  * config/puppet-modules/openstack/puppet-ceph-2.2.0 - upgrade to 3.1.1

Alternatives
------------

Ceph Octopus might be an alternative if it shows up in the Debian Bullseye package list and if
time permits.

Data model impact
-----------------

N/A

REST API impact
---------------

The impact depends on which Nautilus command changes turn out to be required.

Security impact
---------------

N/A

Other end user impact
---------------------

N/A

Performance Impact
------------------

* Performance should improve when switching from FileStore OSDs to BlueStore.
* Replacing ceph-disk with ceph-volume should increase reliability and improve performance.
  Details to be verified in [4]_.

Other deployer impact
---------------------

N/A

Developer impact
----------------

N/A

Upgrade impact
--------------

Upgrades to subsequent releases should become simpler.
The new features to be enabled should provide a better user experience.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Vinícius Lopes da Silva (viniciuslopesdasilva)

Other contributors:
  - Delfino Gomes Curado Filho (dcuradof)
  - Felipe Sanches Zanoni (fsanches)
  - Mauricio Biasi do Monte Carmelo (mbiasido)
  - Thiago Oliveira Miranda (thiagooliveiramiranda)
  - Alan Kyoshi (akyoshi)
  - Daniel Pinto Barros (dbarros)

Repos Impacted
--------------

- config
- integ
- stx-puppet
- ha
- ansible-playbook
- utilities

Work Items
----------

* Verify compatibility between Nautilus and Mimic. According to the
  `upgrade compatibility notes <https://docs.ceph.com/en/latest/releases/nautilus/#upgrade-compatibility-notes>`_,
  some commands have changed between versions, and we should assess their impact on the
  current implementation.

* Current OSDs are FileStore based, and Nautilus supports BlueStore OSDs. We need to determine
  the feasibility of migrating from FileStore to BlueStore OSDs, and whether FileStore and
  BlueStore OSDs can coexist.

* Ceph's default use of systemd to control Ceph process initialization is currently
  `disabled <https://github.com/starlingx-staging/stx-ceph/commit/ecbbc1c833106a1151c6ccb93eebbad93b55b2c2>`_.
  It should be re-enabled, and the changes this requires in the
  `init script <https://github.com/starlingx-staging/stx-ceph/commits/stx/v13.2.2/src/init-ceph.in>`_ and
  `pmon <https://opendev.org/starlingx/integ/src/branch/master/ceph/ceph/files/ceph-init-wrapper.sh>`_
  should be evaluated.

* ceph-disk is currently used to deploy OSDs. However, ceph-disk is deprecated and ceph-volume
  should be used in its place. The impact of this change needs to be investigated. In the worst
  case scenario, it is possible to still use ceph-disk since this is available through the Ceph
  Pacific release (latest to date).

* Evaluate the code from the
  `current patch set <https://github.com/starlingx-staging/stx-ceph/commits/stx/v13.2.2>`_
  applied on Mimic and port the relevant patches to the Nautilus branch.

* Ensure that the integration between Ceph and its subsystems (ceph-manager, python-cephclient,
  mgr-restful-plugin, Puppet code, ansible-playbooks) works correctly.
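
When validating these work items, one simple check is whether every daemon already reports a
Nautilus build. The sketch below parses JSON shaped like the output of ``ceph versions`` and
returns the daemon types that still have instances on another release. The sample data
(including the placeholder build hashes) and the ``daemons_not_on`` helper are hypothetical.

.. code-block:: python

    import json

    # Hypothetical sample shaped like the JSON printed by `ceph versions`.
    # The build hashes are placeholders, not real Ceph builds.
    SAMPLE_CEPH_VERSIONS = """
    {
      "mon": {"ceph version 14.2.22 (0000000) nautilus (stable)": 3},
      "mgr": {"ceph version 14.2.22 (0000000) nautilus (stable)": 2},
      "osd": {
        "ceph version 13.2.2 (1111111) mimic (stable)": 1,
        "ceph version 14.2.22 (0000000) nautilus (stable)": 5
      },
      "mds": {},
      "overall": {
        "ceph version 13.2.2 (1111111) mimic (stable)": 1,
        "ceph version 14.2.22 (0000000) nautilus (stable)": 10
      }
    }
    """


    def daemons_not_on(release, versions_json):
        """Return {daemon_type: count} for daemons not reporting `release`."""
        report = json.loads(versions_json)
        lagging = {}
        for daemon_type, versions in report.items():
            if daemon_type == "overall":
                continue  # summary entry, already covered by the per-type entries
            count = sum(
                n for version, n in versions.items() if f" {release} " not in version
            )
            if count:
                lagging[daemon_type] = count
        return lagging


    lagging = daemons_not_on("nautilus", SAMPLE_CEPH_VERSIONS)
    print("Daemons not yet on Nautilus:", lagging)

An empty result would indicate that all reported daemons are on the target release.
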

Dependencies
============

N/A

Testing
=======

All validation activities should pass Sanity/Storage regression tests.

Standard configuration scenarios
--------------------------------

* AIO-SX
* AIO-DX
* Standard 2C+2W
* Storage 2C+2S+2W
* Storage Tiers - Can be tested on AIO-SX; the result should be valid across all installs

Additional scenarios
--------------------

* SSD Journal Disks - Use SSD journal disks and validate proper configuration on a storage lab
* Peer Groups - Provision system with up to 8 (replication 2) and 9 (replication 3) storage hosts
* OSD disk replacement - Validate OSD disk replacement procedure
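
To make the peer group scenario concrete, the sketch below enumerates the expected layouts for
8 hosts with replication 2 and 9 hosts with replication 3. It assumes that hosts are named
``storage-N``, that peer groups are named ``group-N``, and that hosts are assigned to groups
sequentially in sets of the replication factor. These naming and ordering choices are
assumptions made for illustration and should be checked against the actual system behavior.

.. code-block:: python

    def peer_groups(host_count, replication):
        """Split storage hosts into sequential groups of `replication` hosts.

        Host and group names are assumptions made for this illustration.
        """
        hosts = [f"storage-{i}" for i in range(host_count)]
        return {
            f"group-{start // replication}": hosts[start:start + replication]
            for start in range(0, host_count, replication)
        }


    for host_count, replication in ((8, 2), (9, 3)):
        layout = peer_groups(host_count, replication)
        print(f"{host_count} hosts, replication {replication}: {len(layout)} groups")
        for name, members in layout.items():
            print(f"  {name}: {', '.join(members)}")
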

Backup and restore scenarios
----------------------------

* B&R - AIO-SX
* B&R - AIO-DX
* B&R - Standard 2C+2W
* B&R - Storage 2C+2S+2W

Documentation Impact
====================

The changes to be made should not interfere with system usage. At this time,
no documentation changes are expected to be required.

References
==========

.. [1] https://docs.ceph.com/en/latest/releases/nautilus/#upgrading-from-mimic-or-luminous
.. [2] https://docs.ceph.com/en/latest/releases/pacific/#upgrade-from-pre-nautilus-releases-like-mimic-or-luminous
.. [3] https://ceph.io/en/news/blog/2017/new-luminous-bluestore/
.. [4] https://docs.ceph.com/en/latest/ceph-volume/intro/#ceph-disk-replaced