Merge "Adding spec: Containerized CEPH deployment and provision"
This commit is contained in:
commit
0c42c207e2
|
@ -0,0 +1,384 @@
|
|||
Containerized Ceph deployment and provision
===========================================

Storyboard:
https://storyboard.openstack.org/#!/story/2005527

Slice: https://docs.google.com/presentation/d/1VcolrSux-sEBUYcQA06yrEeYx4KM4Ne5wryeYBmSy_o/edit#slide=id.p

Design doc: https://docs.google.com/document/d/1lnAZSu4vAD4EB62Mk18mCgM7aAItl26sJa1GBf9DJz4/edit?usp=sharing

Ceph is the standard persistent storage backend for StarlingX; this story
implements Ceph containerization.

The implementation of containerized Ceph includes:

* Ceph distributed cluster deployment in containers
* Persistent volume provisioners: rbd (ROX: ReadOnlyMany, RWO: ReadWriteOnce)
  and CephFS (RWX: ReadWriteMany); a claim sketch follows this list
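
To illustrate the target access modes, below is a minimal sketch of the
persistent volume claims a Kubernetes application could submit once both
provisioners are in place (the storage class names csi-rbd and csi-cephfs are
hypothetical)::

  # rbd-backed claim: block volume mounted read-write by a single node (RWO)
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: rbd-pvc
  spec:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
    storageClassName: csi-rbd      # hypothetical class backed by the rbd provisioner
  ---
  # CephFS-backed claim: shared filesystem mounted by many nodes (RWX)
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: cephfs-pvc
  spec:
    accessModes:
      - ReadWriteMany
    resources:
      requests:
        storage: 1Gi
    storageClassName: csi-cephfs   # hypothetical class backed by the CephFS provisioner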

Containerized Ceph brings several benefits:

* Ceph version upgrade: a Dockerfile can build the Ceph image independently,
  or an upstream image can be used directly; there is no need to adapt Ceph
  to the StarlingX build system, which makes upgrading to a new release
  simple.
* Deployment: autonomous management of the Ceph services (such as Ceph-mon,
  Ceph-osd, Ceph-mgr) by COs (container orchestration systems); containers
  isolate namespaces and avoid resource conflicts.
* Autoscaling: flexible and elastic expansion of the Ceph cluster.


Problem description
===================

Kubernetes applications (e.g. OpenStack) require access to persistent
storage. The current solution is a helm chart that leverages the
rbd-provisioner from the incubator project external-storage, but it has some
problems.

There are several provisioners available for containerized Ceph:

* In-tree: implemented in Kubernetes upstream, but it is no longer used and
  the code is frozen.
* external-storage: https://github.com/kubernetes-incubator/external-storage,
  the project used to extend the in-tree provisioners before CSI existed; it
  will slowly be deprecated.
* Ceph-CSI: https://github.com/Ceph/Ceph-CSI. CSI (container storage
  interface) is the standard interface that COs (container orchestration
  systems) can use to expose arbitrary storage systems; it also works with
  other COs such as Docker Swarm and Mesos.

The best way forward is therefore to implement the Ceph-CSI rbd provisioner
to replace the current rbd-provisioner, and to add the Ceph-CSI CephFS
provisioner to support ReadWriteMany persistent volumes.
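
As an example, a Ceph-CSI rbd storage class could look like the following
minimal sketch (the class name, clusterID and pool are illustrative, and the
exact parameter names depend on the Ceph-CSI release)::

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: csi-rbd                  # hypothetical class name
  provisioner: rbd.csi.ceph.com    # Ceph-CSI rbd driver name
  parameters:
    clusterID: rook-ceph           # assumed cluster id exposed by the deployment
    pool: kube-rbd                 # assumed rbd pool created for Kubernetes volumes
  reclaimPolicy: Delete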

We found several difficulties without Ceph containerization:

1. A stable version of Ceph is released every 9 months; for example, the
StarlingX community plans to deploy Ceph N+1 in the next release, but the
Ceph version upgrade process is very complicated. We have to adapt the Ceph
upstream codebase to the StarlingX build environment, including several
submodules pinned to specific commit IDs, and handle build issues; those
efforts take up most of the upgrade time.

2. The Ceph-related plugin modules in StarlingX need to be refactored to
accommodate the features of Ceph, which couples the Ceph deployment and the
StarlingX environment together and increases the difficulty of
troubleshooting.

Use Cases
---------

The deployer/developer needs to manage Ceph deployment and provision:

* Build Ceph/Ceph-CSI docker images with the required version number.
* Bring up the Ceph cluster when controller nodes are available.
* Dynamically adjust configurations after the Ceph deployment completes.
* Kubernetes applications need to access persistent volumes with RWO/RWX/ROX
  modes.
* Set up the Ceph client with libraries for block, object, and filesystem
  access for external clients (OpenStack Cinder/Nova etc.) in various
  languages (Python, Go, etc.).


Proposed change
===============

This story proposes a solution for Ceph containerization.

Solution
--------

There are 2 candidate solutions for Ceph containerization:

1. OpenStack-Helm-Infra is simple in its code organization and makes Ceph
upgrades easy.

2. Rook is more complicated, but supports more features, including Ceph-CSI,
and has better scalability. Rook also has some attention in the Ceph
community, but Ceph upgrades currently need more manual work.

This proposal selects Rook; it is the more suitable choice after v1.0, which
supports Ceph-CSI.

Advantages:

* Rook supports the Ceph storage backend natively; it turns storage software
  into self-managing, self-scaling and self-healing storage services via the
  Rook operator, so no extra work is needed for high-availability support.
* Good scalability: other storage backends (e.g. NFS, EdgeFS) can also be
  supported.
* The Rook community has more forks/stars/watchers than the
  OpenStack-Helm-Infra project, Ceph support in Rook is stable, and Rook also
  provides a helm chart for the Ceph operator.

Disadvantages:

* For Ceph upgrades, the cluster status currently needs to be checked
  manually beforehand; an upgrade operator that will make this much simpler
  is planned for the future.
* The Rook framework is popular but is a Go codebase, which increases the
  cost of maintenance, and some Rook sub-projects are not stable yet.

Implementation:
---------------

Rook community:

* The current Rook release, as of early June 2019, is v1.0.
* Rook supports Ceph as a stable backend, adds experimental support for Ceph
  Nautilus and the Ceph-CSI plugin in v1.0, and plans stable support for the
  CSI plugin in v1.1, due by August 16th, 2019.
* Rook also provides helm charts, but only for the operator, without CSI
  support; the Ceph cluster still has to be created with kubectl as shown in
  the Rook Ceph quickstart.
* Rook plans to provide more complete upgrade automation in the future.

Code changes:

* Remove the current helm chart which leverages the rbd-provisioner incubator
  project (external-storage, as previously mentioned).
* Remove the service manager functionality that provides high-availability
  support for the Ceph services.
* Remove the native Ceph cluster bootstrap from the apply-bootstrap-manifest
  step of the ansible configuration.
* Introduce the upstream project https://github.com/Rook/Rook, the
  cloud-native storage orchestrator for Kubernetes with Ceph-CSI support.
* Create new Dockerfiles to build container images for the Rook operator
  (including the Rook operator and agent) and the Ceph daemons (including
  Ceph-mon, Ceph-osd, Ceph-mgr, etc.).
* Add provisioning of the Rook operator and Ceph cluster post ansible-apply,
  and consider implementing it in the platform application once the
  rbd-provisioner chart is removed from that application.
* Add 2 helm charts as Rook charts: Rook-operator (with Ceph-CSI support) and
  Rook-Ceph (the cluster), and consider shipping them as a platform app post
  ansible-apply. Additional work is needed because, firstly, Rook 1.0 has a
  helm chart but no CSI support yet and, secondly, the current Rook helm
  chart covers only the Rook operator and does not bring up the Ceph cluster
  (a cluster manifest sketch follows this list).
* Changes in the Rook & Ceph plugins and the sysinv impact code.
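
For reference, bringing up the cluster next to the operator chart amounts to
applying a CephCluster resource similar to the Rook v1.0 quickstart; the
following is a minimal sketch with illustrative values::

  apiVersion: ceph.rook.io/v1
  kind: CephCluster
  metadata:
    name: rook-ceph
    namespace: rook-ceph
  spec:
    cephVersion:
      image: ceph/ceph:v14.2        # illustrative Ceph Nautilus image tag
    dataDirHostPath: /var/lib/rook  # host path for mon/osd metadata
    mon:
      count: 3                      # monitor replication
      allowMultiplePerNode: false
    dashboard:
      enabled: true
    storage:
      useAllNodes: true
      useAllDevices: false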

The Rook & Ceph plugins and sysinv impact implementation include:

* Remove the puppet module which encapsulates puppet operations for Ceph
  storage configuration.
* Remove the python-cephclient module; the operations on Ceph monitors and
  osds will be managed by the Rook operator, and the module will be replaced
  by python-rookclient.
* Add support for Rook and Ceph configuration under different system
  configurations, like get_overrides() in the Openstack-Helm charts.
* Add python-rookclient to manage several deployment options by overriding
  yaml files or helm charts (see the sketch after this list), for example
  Ceph-monitor replication: 3. Since the operator creates a new monitor
  whenever one is deleted manually, this cannot be implemented purely through
  a RESTful API that removes monitors.
  In the current implementation, the system-mode configuration is transferred
  to the service manager to bring up the Ceph cluster in the native
  deployment, and it drives options such as the replication of Ceph monitors;
  the Rook configuration has corresponding parameters.
  The Rook operator refreshes the Ceph cluster when it detects that the
  configuration has changed via the overridden yaml/helm charts, so there is
  no need to update sysinv code to issue RESTful commands.
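
A hypothetical override fragment that python-rookclient could merge into the
CephCluster resource for the monitor replication example above might look
like this (python-rookclient does not exist yet, so this is only a sketch of
the intended mechanism)::

  # fragment merged into the CephCluster spec; the Rook operator then
  # reconciles the running cluster to match the new desired state
  spec:
    mon:
      count: 3                     # monitor replication factor from sysinv
      allowMultiplePerNode: false  # never co-locate two monitors on one host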


Alternatives
------------

Solution: OpenStack-Helm-Infra

* Introduce the project https://github.com/openstack/openstack-helm-infra
  with its Ceph-related helm charts: Ceph-client, Ceph-mon, Ceph-osd, etc.
* Helm/armada has been widely accepted and used by the community; this
  solution follows the helm architecture and requires fewer changes to
  related code (e.g. the helm plugin for the Ceph manifest).
* Ceph version upgrade is easy via helm install/update, but the Ceph-CSI
  project would need to be ported and the new Ceph provisioners reworked.
* Additional work is needed for high-availability support for Ceph, like the
  service manager function in StarlingX.

Data model impact
-----------------

1. In bootstrap, an init script deploys the Rook operator; when the Rook
operator brings up the Ceph cluster, a Rook plugin overrides the Rook-Ceph
yaml with the system configuration for monitor and osd settings, etc.

2. Rookclient provides an interface to change the deployment options by
overriding the Rook yaml files (in the future, helm charts); it also includes
the show & dump wrapper interfaces used by sysinv.


REST API impact
---------------

None


Security impact
---------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

No impact is expected. For networking, Ceph and the related containers use
the host's native network, as sketched below.
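
In the Rook v1.0 CephCluster resource this corresponds to a setting like the
following minimal sketch::

  spec:
    network:
      hostNetwork: true   # run Ceph daemons on the host network, not the pod network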

Other deployer impact
---------------------

* Containerized Ceph deployment should be used instead of native Ceph
  deployed by puppet.
* Containerized Ceph-CSI provisioning should be used instead of
  'rbd-provisioner'.

Developer impact
----------------

None

Upgrade impact
--------------

1. Upgrade after Rook:

The upgrade work includes 2 parts: the Rook-Ceph operator and the Ceph
cluster. There is an upgrade manual on the Rook website,
https://Rook.io/docs/Rook/v1.0/Ceph-upgrade.html, but in practice it needs
additional work.

Although the Rook community plans to provide more complete upgrade
automation, we currently have to follow the manual to upgrade.

2. Upgrade from the current implementation to Rook:

There is a big gap in upgrading from the current native Ceph cluster to Rook,
because the deployment model changes completely: it is hard to follow the
official upgrade manual to replace the Ceph services (mon, osd, etc.) step by
step, and the Rook operator does not support bringing up a service that mixes
native and containerized types.

Although this upgrade is not recommended, if it is necessary the following
are the key checkpoints, and a script will be created for the actions:

* Tear down the old native Ceph cluster and keep the data on the storage
  (osd).
* The Ceph osd can be set to the same device or file in Rook as in the
  native deployment; keep the storage settings consistent (see the sketch
  below).
* After that, re-deploy StarlingX with Rook using an updated
  ceph-cluster.yaml and bring up the new cluster.
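
The storage section of the updated ceph-cluster.yaml could pin Rook to the
devices or files the native osds used; a minimal sketch with illustrative
host and device names::

  spec:
    storage:
      useAllNodes: false
      useAllDevices: false
      nodes:
        - name: "controller-0"     # illustrative host name
          devices:
            - name: "sdb"          # the same device the native osd used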

Upgrading a Rook cluster is not without risk, especially upgrading from
native to Rook. There may be unexpected issues or obstacles that damage the
integrity and health of the storage cluster, including data loss.


Implementation
==============


Assignee(s)
-----------

Primary assignee:
  Tingjie Chen <tingjie.chen@intel.com>

Other contributors (provided key comments that were integrated into this
spec):
  Brent Rowsell <brent.rowsell@windriver.com>,
  Bob Church <robert.church@windriver.com>

Repos Impacted
--------------

stx-manifest, stx-tools, stx-config, stx-integ, stx-upstream,
stx-ansible-playbooks


Work Items
----------

We can consider containerized Ceph as another storage backend used by
OpenStack; the Rook Ceph cluster can also be brought up and co-exist with the
current native Ceph cluster.

For the development model, add this as a parallel capability and keep the
existing Ceph implementation. The default would be the existing
implementation, with the ability to override it and use this implementation
at deployment time. The old implementation would be removed in STX4.0, which
will include an upgrade strategy from the legacy implementation.

The main implementation can merge in advance since it doesn't break current
functionality; in the meantime we can prepare a patch which switches the
default Ceph backend to the containerized Ceph implementation (including
some configuration changes), enable it in a certain branch (the change is
tiny), and then merge it into master when ready.

The implementation can be split into the following milestones:

MS1 (end of July): Bring up the Rook operator and Ceph cluster in StarlingX.

MS2 (Sep 30th): Finish the Rook plugins and sysinv impact code; deploy by
system config policy.

MS3 (Oct 20th): Ceph-CSI (for Kubernetes apps) and OpenStack service support.

Once MS3 is achieved, we have the basic functionality needed to make sure we
can cover all operational scenarios triggered through the sysinv API and get
a feel for the system impacts.

These include, but are not limited to:

* Monitor assignment/reassignment.
* Adding/removing storage tiers (impacts the Ceph crushmap).
* Defining the Kubernetes default storage class (currently handled in
  rbd-provisioner).
* Host-delete/host-add for hosts that do/will deploy Rook resources.
* Replication factor updates for minimal data redundancy vs. H/A data
  availability (AIO-SX disk-based vs. host-based replication).

There are also test cases defined for Ceph:
https://ethercalc.openstack.org/orb83xruwmo8


Dependencies
============

Story: [Feature] Kubernetes Platform Support at
https://storyboard.openstack.org/#!/story/2002843

Story: [Feature] Ceph persistent storage backend for Kubernetes at
https://storyboard.openstack.org/#!/story/2002844

This requires existing functionality from some projects that are not
currently used by StarlingX:

* docker
* kubernetes
* Rook
* Openstack-Helm-Infra
* Ceph-CSI


Testing
=======

None


Documentation Impact
====================

None


References
==========

Rook: https://github.com/Rook/Rook
Ceph-CSI: https://github.com/Ceph/Ceph-CSI
Openstack-Helm: https://github.com/openstack/openstack-helm
Openstack-Helm-Infra: https://github.com/openstack/openstack-helm-infra
Build wiki: https://wiki.openstack.org/wiki/StarlingX/Containers/BuildingImages


History
=======

None