Merge "Adding spec: Containerized CEPH deployment and provision"

This commit is contained in:
Zuul 2019-08-28 14:47:35 +00:00 committed by Gerrit Code Review
commit 0c42c207e2
1 changed file with 384 additions and 0 deletions


@@ -0,0 +1,384 @@
Containerized Ceph deployment and provision
===========================================
Storyboard: https://storyboard.openstack.org/#!/story/2005527
Slides: https://docs.google.com/presentation/d/1VcolrSux-sEBUYcQA06yrEeYx4KM4Ne5wryeYBmSy_o/edit#slide=id.p
Design doc: https://docs.google.com/document/d/1lnAZSu4vAD4EB62Mk18mCgM7aAItl26sJa1GBf9DJz4/edit?usp=sharing
Ceph is the standard persistent storage backend for StarlingX; this story is to
implement Ceph containerization.
The implementation of containerized Ceph includes:
* Ceph distributed cluster deployment on containers
* Persistent volume provisioners: rbd (ROX: ReadOnlyMany, RWO: ReadWriteOnce)
and Cephfs (RWX: ReadWriteMany); a minimal volume request is sketched after
this list.
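For illustration, the following is a minimal sketch of how a Kubernetes
application would request a volume through these provisioners, using the
Kubernetes Python client; the storage class name "general" and the namespace
are placeholders, not names defined by this spec::

    # Request a 1Gi rbd-backed volume with an explicit access mode.
    # rbd classes serve RWO/ROX claims; a Cephfs class would serve RWX.
    from kubernetes import client, config

    config.load_kube_config()  # or load_incluster_config() inside a pod

    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name="demo-rbd-pvc"),
        spec=client.V1PersistentVolumeClaimSpec(
            storage_class_name="general",      # placeholder rbd class name
            access_modes=["ReadWriteOnce"],    # or ReadOnlyMany/ReadWriteMany
            resources=client.V1ResourceRequirements(
                requests={"storage": "1Gi"}),
        ),
    )
    client.CoreV1Api().create_namespaced_persistent_volume_claim("default", pvc)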
There are some benefits of containerized Ceph:
* Ceph version upgrade: a Dockerfile can build the Ceph image independently,
or an upstream image can be used directly; there is no need to adapt Ceph to
the StarlingX build system, which makes upgrading to a new release simple.
* Deployment: autonomous management of Ceph services (such as Ceph-mon,
Ceph-osd, Ceph-mgr, etc.) by COs (container orchestration systems); container
namespaces provide isolation and avoid resource conflicts.
* Autoscaling: flexible and elastic expansion of the Ceph cluster.
Problem description
===================
Kubernetes applications (e.g. OpenStack) require access to persistent storage.
The current solution is a helm chart that leverages the rbd-provisioner from
the incubator project external-storage, but it has some problems.
There are several provisioners for containerized Ceph:
* In-tree: implemented in Kubernetes upstream, but it is not used anymore and
its code is frozen.
* external-storage: https://github.com/kubernetes-incubator/external-storage,
the project used to extend in-tree provisioners before CSI existed; it will
slowly be deprecated.
* Ceph-CSI: https://github.com/Ceph/Ceph-CSI, CSI (container storage interface)
is the standard interface that COs (container orchestration systems) can use
to expose arbitrary storage systems; it also works with other COs such as
Docker Swarm, Mesos, etc.
So the best approach is to implement the Ceph-CSI rbd provisioner to replace
the current rbd-provisioner, and to add the Ceph-CSI Cephfs provisioner to
support ReadWriteMany persistent volumes.
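To make the target concrete, the following is a minimal sketch of a
StorageClass bound to the Ceph-CSI rbd driver, created with the Kubernetes
Python client; the clusterID, pool and secret names are placeholders that
would come from the deployed cluster, not values defined by this spec::

    # A StorageClass that delegates provisioning to the Ceph-CSI rbd driver.
    from kubernetes import client, config

    config.load_kube_config()

    sc = client.V1StorageClass(
        metadata=client.V1ObjectMeta(name="csi-rbd"),
        provisioner="rbd.csi.ceph.com",            # Ceph-CSI rbd driver
        parameters={
            "clusterID": "rook-ceph",              # placeholder
            "pool": "kube-rbd",                    # placeholder
            "csi.storage.k8s.io/provisioner-secret-name": "csi-rbd-secret",
            "csi.storage.k8s.io/provisioner-secret-namespace": "rook-ceph",
        },
        reclaim_policy="Delete",
    )
    client.StorageV1Api().create_storage_class(sc)

A Cephfs class would be analogous, pointing at the Cephfs driver instead.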
We found several difficulties without Ceph containerization:
1. A stable version of Ceph is released every 9 months. For example, the
StarlingX community plans to deploy Ceph N+1 in the next release, but the Ceph
version upgrade process is very complicated: we have to adapt the Ceph
upstream codebase to the StarlingX build environment, including several
submodules pinned to specific commit IDs, and handle build issues. Those
efforts take up most of the upgrade time.
2. The Ceph-related plugin modules in StarlingX need to be refactored to
accommodate the features of Ceph, which couples Ceph deployment and the
StarlingX environment together and increases the difficulty of
troubleshooting.
Use Cases
---------
The deployer/developer needs to manage Ceph deployment and provisioning:
* Build Ceph/Ceph-CSI docker images with the required version number.
* Bring up the Ceph cluster when controller nodes are available.
* Dynamically adjust configurations after the Ceph deployment completes.
* Kubernetes applications need to access persistent volumes with RWO/RWX/ROX
modes.
* Set up Ceph clients with libraries for block, object and filesystem access
for external clients (OpenStack Cinder/Nova, etc.) in various languages
(Python, Go, etc.); a minimal client sketch follows this list.
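For the last use case, a minimal sketch of an external client talking to the
cluster through python-rados; the config file path and pool name are
placeholders and depend on how the containerized cluster exposes its
configuration and keyring to clients::

    # Connect to the Ceph cluster and write/read one object.
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")   # placeholder path
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("kube-rbd")              # placeholder pool
        ioctx.write_full("hello-object", b"hello from an external client")
        print(ioctx.read("hello-object"))
        ioctx.close()
    finally:
        cluster.shutdown()

Block and filesystem access would go through librbd/libcephfs on top of the
same cluster connection.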
Proposed change
===============
This story will have a solution proposal for Ceph containerization.
Solution
----------------
There are two candidate solutions for Ceph containerization:
1. OpenStack-Helm-Infra is simpler in code organization and makes Ceph
upgrades easy.
2. Rook is more complicated, supports more features (including Ceph-CSI) and
has better scalability; Rook also gets attention in the Ceph community, but
Ceph upgrades currently need more manual work.
This proposal is for Rook; it is the more suitable choice after v1.0, which
supports Ceph-CSI.
Advantages:
* Rook supports the Ceph storage backend natively; it turns the storage
software into self-managing, self-scaling and self-healing storage services
via the Rook operator, so no extra work is needed for high-availability
support.
* Good scalability; other storage backends (e.g. NFS, EdgeFS) can also be
supported.
* The Rook community has more forks/stars/watchers than the
OpenStack-Helm-Infra project, the Ceph support in Rook is stable, and Rook
also provides a Ceph operator helm chart.
Disadvantages:
* For Ceph upgrades, the cluster status needs to be checked manually
beforehand; the operator will support upgrades in the future, which will be
much simpler.
* The Rook framework is popular but has a Golang codebase, which increases the
cost of maintenance, and some Rook projects are not stable yet.
Implementation:
----------------
Rook community:
* Rook's current release is v1.0, released in early June 2019.
* Rook supports Ceph as a stable backend, added experimental support for Ceph
Nautilus and the Ceph-CSI plugin in v1.0, and plans stable support for the CSI
plugin in v1.1, due by August 16th, 2019.
* Rook also supports helm charts, but only for the operator and without CSI
support; the Ceph cluster still has to be created by kubectl as shown in the
Rook Ceph quickstart (see the sketch after this list).
* Rook plans to provide more complete upgrade automation in the future.
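The cluster bring-up that the quickstart performs with kubectl can equally be
driven from Python, which is closer to how sysinv would do it. This is a rough
sketch only, and the field values (image tag, mon count, dataDirHostPath) are
illustrative, not the settings proposed by this spec::

    # Rough equivalent of "kubectl create -f cluster.yaml" from the Rook
    # quickstart: create a CephCluster custom resource for the operator.
    from kubernetes import client, config

    config.load_kube_config()

    ceph_cluster = {
        "apiVersion": "ceph.rook.io/v1",
        "kind": "CephCluster",
        "metadata": {"name": "rook-ceph", "namespace": "rook-ceph"},
        "spec": {
            "cephVersion": {"image": "ceph/ceph:v14.2"},   # illustrative tag
            "dataDirHostPath": "/var/lib/rook",
            "mon": {"count": 3, "allowMultiplePerNode": False},
            "storage": {"useAllNodes": True, "useAllDevices": False},
        },
    }

    client.CustomObjectsApi().create_namespaced_custom_object(
        group="ceph.rook.io", version="v1", namespace="rook-ceph",
        plural="cephclusters", body=ceph_cluster)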
Code changes:
* Remove the current helm chart which leverages the rbd-provisioner incubator
project (external-storage, as previously mentioned).
* Remove the service manager functionality for high-availability support of
Ceph services.
* Remove the native Ceph cluster bootstrap in the apply-bootstrap-manifest
step of the ansible configuration.
* Introduce the upstream project https://github.com/Rook/Rook, the
cloud-native storage orchestrator for Kubernetes with Ceph-CSI support.
* Create new Dockerfiles to build container images for Rook-operator
(including the Rook operator and agent) and the Ceph daemon (including
Ceph-mon, Ceph-osd, Ceph-mgr, etc.).
* Add provisioning of the Rook operator and Ceph cluster post ansible-apply,
and consider implementing it in the platform application once the
rbd-provisioner chart is removed from that application.
* Add two helm charts as Rook charts: Rook-operator (with Ceph-CSI support)
and Rook-Ceph (the cluster), and consider them as a platform app post
ansible-apply. Additional work is needed because:
Firstly, Rook 1.0 has a helm chart but no CSI support currently.
Secondly, the current Rook helm chart only covers the Rook operator and does
not include Ceph cluster bring-up.
* Changes in the Rook & Ceph plugins and sysinv impact code.
The Rook & Ceph plugins and sysinv impact implementation includes:
* Remove the puppet module which encapsulates puppet operations for Ceph
storage configuration.
* Remove the python-cephclient module; the Ceph monitor and osd operations
will be managed by the Rook operator, and the module will be replaced by
python-rookclient.
* Add support for Rook and Ceph configuration with different system
configurations, similar to get_overrides() in OpenStack-Helm charts.
* Add python-rookclient to operate several deployment options by overriding
yaml files or helm charts, for example Ceph monitor replication: 3. Since the
operator will create a new monitor if we manually delete one, this cannot be
implemented only by a RESTful API call that removes the monitor (a sketch
follows the next paragraph).
In the current implementation, the system-mode configuration is transferred to
the service manager to bring up the Ceph cluster in the native deployment, and
it reflects options such as the replication of Ceph monitors; the Rook
configuration has corresponding parameters.
The Rook operator can refresh the Ceph cluster when it detects that the
configuration has changed through overrides of the yaml/helm charts, so there
is no need to update sysinv code for RESTful commands.
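python-rookclient does not exist yet, so the following is only a hypothetical
sketch of how such an override could be applied: patch the CephCluster custom
resource and let the operator reconcile. Class and method names are
placeholders, not a committed interface::

    # Hypothetical python-rookclient helper: override the monitor replication
    # by patching the CephCluster CR; the Rook operator reconciles the change.
    from kubernetes import client, config


    class RookClient(object):
        """Wrapper used by sysinv to override Rook-Ceph deployment options."""

        def __init__(self, namespace="rook-ceph", name="rook-ceph"):
            config.load_kube_config()
            self._api = client.CustomObjectsApi()
            self._namespace = namespace
            self._name = name

        def set_mon_count(self, count):
            # Patch only the mon count; the operator adds/removes monitors
            # until the cluster matches the requested replication.
            patch = {"spec": {"mon": {"count": count}}}
            return self._api.patch_namespaced_custom_object(
                group="ceph.rook.io", version="v1",
                namespace=self._namespace, plural="cephclusters",
                name=self._name, body=patch)


    # Example: standard configuration with 3 monitors.
    # RookClient().set_mon_count(3)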
Alternatives
------------
Solution: OpenStack-Helm-Infra
* Introduce the project https://github.com/openstack/openstack-helm-infra with
Ceph-related helm charts: Ceph-client, Ceph-mon, Ceph-osd, etc.
* Helm/Armada has been widely accepted and used by the community; this
solution follows the helm architecture and requires fewer changes to related
code (e.g. the helm plugin for the Ceph manifest).
* Ceph version upgrades are easy via helm install/upgrade, but the Ceph-CSI
project needs to be ported and the new Ceph provisioners reworked.
* Additional work is needed for the high-availability support of Ceph, like
the service manager functionality in StarlingX.
Data model impact
-----------------
1. In bootstrap, an init script deploys the Rook operator; when the Rook
operator brings up the Ceph cluster, a Rook plugin overrides the Rook-Ceph
yaml with the system configuration for monitor and osd settings, etc.
2. Rookclient provides an interface to change deployment options by overriding
Rook yaml files (in the future, helm charts), and also includes show & dump
wrapper interfaces used by sysinv (a hypothetical sketch follows).
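A hypothetical sketch of the show & dump wrappers that sysinv could call;
function names, arguments and the dump format are placeholders, not a
committed interface::

    # Read back and persist the live CephCluster resource for sysinv.
    import yaml
    from kubernetes import client, config


    def show_ceph_cluster(namespace="rook-ceph", name="rook-ceph"):
        """Return the live CephCluster custom resource as a dict."""
        config.load_kube_config()
        return client.CustomObjectsApi().get_namespaced_custom_object(
            group="ceph.rook.io", version="v1", namespace=namespace,
            plural="cephclusters", name=name)


    def dump_ceph_cluster(path="/tmp/rook-ceph-cluster.yaml"):
        """Dump the current cluster spec to yaml for later override/restore."""
        with open(path, "w") as f:
            yaml.safe_dump(show_ceph_cluster(), f, default_flow_style=False)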
REST API impact
---------------
None
Security impact
---------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
No impact is expected. For networking, Ceph and related containers use the
host's native network.
Other deployer impact
---------------------
* Containerized Ceph deployment should be used instead of native Ceph deployed
by puppet.
* Containerized Ceph-CSI provisioning should be used instead of
'rbd-provisioner'.
Developer impact
----------------
None
Upgrade impact
--------------
1. Upgrade after Rook:
The upgrade work includes two parts: the Rook-Ceph operator and the Ceph
cluster. There is an upgrade manual on the Rook website
(https://Rook.io/docs/Rook/v1.0/Ceph-upgrade.html), but it actually needs
additional work.
Although the Rook community plans to provide more complete upgrade automation,
currently we have to follow the manual to upgrade.
2. Upgrade from the current implementation to Rook:
There is a big gap in upgrading from the current native Ceph cluster to Rook,
because the deployment model changes completely; it is hard to follow the
official upgrade manual to replace the Ceph services (mon, osd, etc.) step by
step, and the Rook operator does not support bringing up an integrated service
that mixes native and containerized types.
Although this upgrade is not recommended, if necessary, the following are the
key checkpoints, and I will create a script for the actions.
* Tear down the old native Ceph cluster, and keep the data on storage (osd).
* The Ceph osd can be set to the same device or file in Rook as in the native
deployment; keep the storage setting consistent.
* After that, re-deploy StarlingX with Rook using an updated ceph-cluster.yaml,
and bring up the new cluster.
Upgrading a Rook cluster is not without risk, especially upgrading from native
to Rook. There may be unexpected issues or obstacles that damage the integrity
and health of your storage cluster, including data loss.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
Tingjie Chen <tingjie.chen@intel.com>
Other contributors (provided key comments that were integrated into this spec):
Brent Rowsell <brent.rowsell@windriver.com>,
Bob Church <robert.church@windriver.com>
Repos Impacted
--------------
stx-manifest, stx-tools, stx-config, stx-integ, stx-upstream,
stx-ansible-playbooks
Work Items
----------
We can consider containerized Ceph as another storage backend used by
OpenStack, and the Rook Ceph cluster can also be brought up and co-exist with
the current native Ceph cluster.
For the development model, add this as a parallel capability and keep the
existing Ceph implementation. The default would be the existing
implementation, with the ability to override it and use this implementation at
deployment time. The old implementation would be removed in STX 4.0, which
would include an upgrade strategy from the legacy implementation.
The main implementation can be merged in advance since it doesn't break
current functionality. In the meantime, we can prepare a patch which switches
the default Ceph backend to the containerized Ceph implementation (including
some configuration changes), enable it in a certain branch (the change is
tiny), and then merge it into master when ready.
The implementation can be split into the following milestones:
MS1 (end of July): Bring up the Rook operator and Ceph cluster in StarlingX.
MS2 (Sep 30th): Finish the Rook plugins and sysinv impact code; deploy by
system config policy.
MS3 (Oct 20th): Ceph-CSI (for Kubernetes apps) and OpenStack service support.
Once MS3 is achieved, the basic functionality is in place to make sure we can
cover all operational scenarios triggered through the sysinv API and get a
feel for system impacts.
These include but are not limited to:
* Monitor assignment/reassignment.
* Adding/removing storage tiers (impacts the Ceph crushmap).
* Defining the Kubernetes default storage class (currently handled in
rbd-provisioner); a sketch is shown after this list.
* Host-delete/host-add for hosts that do/will deploy Rook resources.
* Replication factor updates for minimum data redundancy vs. H/A data
availability (AIO-SX disk-based vs. host-based replication).
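For the default storage class item, a minimal sketch of marking a Ceph-backed
class as the cluster default via the standard Kubernetes annotation; the class
name "csi-rbd" is a placeholder::

    # Mark an existing StorageClass as the cluster default.
    from kubernetes import client, config

    config.load_kube_config()

    patch = {"metadata": {"annotations": {
        "storageclass.kubernetes.io/is-default-class": "true"}}}
    client.StorageV1Api().patch_storage_class("csi-rbd", patch)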
There are also test cases defined for Ceph:
https://ethercalc.openstack.org/orb83xruwmo8
Dependencies
============
Story: [Feature] Kubernetes Platform Support at
https://storyboard.openstack.org/#!/story/2002843
Story: [Feature] Ceph persistent storage backend for Kubernetes
https://storyboard.openstack.org/#!/story/2002844
This requires existing functionality from the following projects, some of
which are not currently used by StarlingX:
* docker
* kubernetes
* Rook
* Openstack-Helm-Infra
* Ceph-CSI
Testing
=======
None
Documentation Impact
====================
None
References
==========
Rook: https://github.com/Rook/Rook
Ceph-CSI: https://github.com/Ceph/Ceph-CSI
OpenStack-Helm: https://github.com/openstack/openstack-helm
OpenStack-Helm-Infra: https://github.com/openstack/openstack-helm-infra
Build wiki: https://wiki.openstack.org/wiki/StarlingX/Containers/BuildingImages
History
=======
None