stx-7.0: initial spec for Debian builds on K8s

Initial spec for adding support for full K8s to debian build tools.

Story: 2009812
Task: 44374

Signed-off-by: Davlet Panech <davlet.panech@windriver.com>
Change-Id: I3e640b8c9a14592db8924e893488a908770a7bdd
parent 26bcbd6b2c
commit 960766e114
@@ -0,0 +1,390 @@
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

======================================
StarlingX: Debian Builds on Kubernetes
======================================

Storyboard Story:
https://storyboard.openstack.org/#!/story/2009812

The new Debian build system [1], used in conjunction with Minikube, lacks
support for multiple projects, branches, and users within the same
environment. We propose a Kubernetes infrastructure to remedy these
shortcomings: a dedicated multi-node build cluster with shared services, as
well as the necessary tooling changes.

Problem Description
===================

The current implementation relies on Minikube, a version of Kubernetes
optimized for single-node, single-user operation. This makes it difficult to
share computing resources between multiple projects, branches, and users
within the same environment, particularly on a dedicated "daily" build server.
The Debian package repository service cannot be shared, which results in
excessive download times and disk usage.

There is no explicit support for CI environments, so additional scripting is
required in Jenkins or similar tools. Jenkins' approach to k8s integration
is not compatible with the current tooling, as it requires the top-level
scripts to be written in the "pipeline" domain-specific language. The best we
can do in Jenkins is call the StarlingX build scripts, bypassing Jenkins' POD
and node scheduling and management mechanisms.

Use Cases
---------

This change would support infrastructure configurations covering the common
use cases described below.

Isolated single-user builds
^^^^^^^^^^^^^^^^^^^^^^^^^^^

An individual contributor wants to build individual packages, the
installation ISO, or docker images in an isolated, autonomous environment.
This use case is already supported by the build tool using Minikube; any
further changes must remain compatible with this type of environment.

Daily build server
^^^^^^^^^^^^^^^^^^

An organization wishes to maintain a server cluster for building multiple
projects or branches daily or on demand (this is the case with the current
StarlingX official build system). Tooling must support:

* Kubernetes clusters. Motivation: some organizations already have
  Kubernetes clusters.
* StarlingX clusters. Motivation: "eat our own dog food".
* Multiple worker nodes. Motivation: allow for expanding computing resources
  available to the build system.
* Ideally, clusters without a shared file system. Motivation: shared redundant
  file systems are slow and difficult to implement and may not be available in
  the target environment.

Build server open to individuals
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is a variation of the above, but with the option for individual
contributors to generate private builds based on their patches before pushing
them to source control. Motivation: this allows users to benefit from the more
powerful, centralized build server.

This use case is not addressed by the current spec. We believe the proposed
changes are sufficient to add this functionality in the future.

Proposed changes
================

We propose a build system that can run in any environment based on Kubernetes,
and a matching installation to drive daily builds on CENGN.

Change the build scripts to support vanilla k8s multi-user environments. This
includes making sure POD and directory names do not clash between users or
multiple projects/branches. Motivation: allow multiple users and projects in
the same environment.
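
One way to keep POD and directory names from clashing is to derive them from
the user, project, and branch. The sketch below is illustrative only: the
function name ``build_resource_name`` and the naming scheme are assumptions,
not part of the existing build tools. It stays within the 63-character DNS
label limit, which is a conservative choice for Kubernetes object names.

.. code-block:: python

    import hashlib
    import re

    MAX_LABEL_LEN = 63  # DNS-1123 label limit, a safe bound for k8s names


    def _slug(text):
        """Lower-case `text` and turn runs of unsupported characters into '-'."""
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


    def build_resource_name(user, project, branch, role):
        """Return a name that is unique per (user, project, branch, role)."""
        parts = [_slug(p) for p in (user, project, branch, role)]
        readable = "-".join(p for p in parts if p)
        # A short digest of the raw inputs keeps names distinct even when
        # slugging or truncation would otherwise make them collide.
        raw = "\0".join((user, project, branch, role)).encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()[:8]
        prefix = readable[: MAX_LABEL_LEN - len(digest) - 1].rstrip("-")
        return f"{prefix}-{digest}" if prefix else digest

For example, ``build_resource_name("alice", "stx", "master", "builder")``
yields a name that starts with ``alice-stx-master-builder-`` followed by an
8-character digest. The same string could serve as a per-build directory name.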

Update Helm charts to isolate the common parts between Minikube and other k8s
environments.

Update the ``stx`` tool, as it may be of limited or no use in full k8s
environments.

Replace Aptly with Pulp (package repository service). Motivation: Pulp
supports file types other than Debian packages, such as source archives used
as build inputs.

Update the package repository service container so that it can be shared among
multiple builds. Motivation: avoid unnecessary duplication of package files
that can be shared among different users on the same system.

Update other build containers to allow transient use (single command
execution). Motivation: efficient memory/CPU sharing among multiple builds.

::

    xxxxxx Kubernetes or Minikube xxxxxxxxxxxxxxxxxxxxxxxx
    x                                ┌──────────────┐    x
    x                                │ User builder │    x
    x  ┌──────────────┐           ┌──┤ PODs         │    x  User 1
    x  │ Pulp         │◄─────┐    │  └──────────────┘    x
    x  └──────────────┘      │    │                      x
    x                        │    │  ┌──────────────┐    x
    x  ┌──────────────┐      │    │  │ User builder │    x
    x  │ Other repos  │◄─────┼────┼──┤ PODs         │    x  User 2
    x  └──────────────┘      │    │  └──────────────┘    x
    x                        │    │                      x
    x  ┌──────────────┐      │    │  ┌──────────────┐    x
    x  │ Docker reg   │◄─────┘    │  │ User builder │    x
    x  └──────────────┘           └──┤ PODs         │    x  User 3
    x                                └──────────────┘    x
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
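
To illustrate the transient (single command) use of builder containers
mentioned above, the following sketch builds a POD manifest that runs one
command and is not restarted afterwards. The helper name, POD name, image, and
command are placeholders chosen for this example; only the manifest fields
themselves (``restartPolicy``, ``containers``, ``command``) are standard
Kubernetes.

.. code-block:: python

    import json


    def transient_pod_manifest(name, image, command, namespace="default"):
        """Build a Pod manifest that runs `command` once and then exits."""
        return {
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {"name": name, "namespace": namespace},
            "spec": {
                # Never restart: the POD's only purpose is this one command.
                "restartPolicy": "Never",
                "containers": [
                    {"name": "main", "image": image, "command": list(command)}
                ],
            },
        }


    if __name__ == "__main__":
        pod = transient_pod_manifest(
            "alice-stx-master-pkgbuilder",          # placeholder POD name
            "registry.example.org/builder:latest",  # placeholder image
            ["echo", "build step goes here"],       # placeholder command
        )
        # kubectl also accepts manifests in JSON form.
        print(json.dumps(pod, indent=2))

Because such a POD holds CPU and memory only while its command runs, several
builds can share the same nodes more efficiently than with long-lived
containers.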

Additional repository services may be deployed in the cluster to support
specific types of data. Whether the build will require additional repository
services remains to be seen.

A docker registry may be deployed for managing intermediate containers used by
the build. Some environments may have a docker registry available outside of
the cluster, so this is optional. In particular, CENGN already has this service
(Docker Hub) available.

We propose installing Kubernetes on a single server to drive daily builds
(CENGN). Kubernetes will be configured so that more nodes can be added later.
Jenkins will be installed to trigger Tekton [2] builds and for reporting.

Tekton is a CI pipeline engine designed specifically for Kubernetes. It is
command-line driven and may be used by the build tools directly to schedule
build jobs within the k8s cluster. Whether such direct usage is feasible or
useful is unclear at this point.
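
If the build tools were to drive Tekton directly, one possible shape is to
assemble a ``tkn pipeline start`` command line from Python. This is only a
sketch under that assumption: the pipeline name, parameter names, and
namespace below are hypothetical, and the spec does not define any Tekton
pipelines yet.

.. code-block:: python

    import shlex


    def tkn_start_command(pipeline, params, namespace):
        """Return the argv list for starting a Tekton pipeline with `tkn`."""
        argv = ["tkn", "pipeline", "start", pipeline, "--namespace", namespace]
        # Sort parameters so that the generated command is reproducible.
        for key in sorted(params):
            argv += ["--param", f"{key}={params[key]}"]
        return argv


    if __name__ == "__main__":
        argv = tkn_start_command(
            "stx-daily-build",                       # hypothetical pipeline
            {"branch": "master", "project": "stx"},  # hypothetical parameters
            "builds",                                # hypothetical namespace
        )
        # Print the command; a caller could pass argv to subprocess.run().
        print(shlex.join(argv))

Returning an argument list rather than a shell string avoids quoting problems
when parameter values contain spaces.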

Outputs of released or otherwise important builds would need to be saved
indefinitely and backed up in case of hardware failures. The availability of
backup storage on CENGN is to be determined.

Outputs of old non-released builds would be deleted regularly (builds older
than 2 weeks or similar). This includes all artifacts (log files, deb files,
ISOs).
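
The retention rule above can be sketched as a small selection function. The
tuple layout and the function name are assumptions made for this example; the
2-week period is the tentative value from the text and would likely be
configurable.

.. code-block:: python

    from datetime import datetime, timedelta, timezone

    RETENTION = timedelta(weeks=2)  # "2 weeks or similar" per the text above


    def builds_to_delete(builds, now, retention=RETENTION):
        """Return the names of non-released builds older than `retention`.

        `builds` is an iterable of (name, finished_at, released) tuples.
        """
        cutoff = now - retention
        return [
            name
            for name, finished_at, released in builds
            if not released and finished_at < cutoff
        ]


    if __name__ == "__main__":
        now = datetime.now(timezone.utc)
        sample = [
            ("dev-build-a", now - timedelta(days=20), False),
            ("release-build", now - timedelta(days=200), True),
            ("dev-build-b", now - timedelta(days=3), False),
        ]
        print(builds_to_delete(sample, now))

Released builds are never selected, regardless of age, which matches the
requirement that their outputs be kept indefinitely.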

Mirrors of 3rd-party files (tars, deb files) would be saved indefinitely.

Docker images would be built using kaniko [4], a tool to build container
images from a Dockerfile inside a container or Kubernetes cluster. It allows
one to run "docker build" inside a docker container. This method is
appropriate for building Debian build tools images.
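
As a rough illustration of a kaniko-based image build, the sketch below
generates a POD manifest that runs the kaniko executor image with its
``--context``, ``--dockerfile``, and ``--destination`` arguments. The helper
name and all values passed to it are placeholders, and registry credentials
are deliberately left out; a real deployment would need to supply them.

.. code-block:: python

    def kaniko_pod_manifest(name, context, dockerfile, destination):
        """Build a Pod manifest that runs one kaniko image build."""
        return {
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {"name": name},
            "spec": {
                "restartPolicy": "Never",
                "containers": [
                    {
                        "name": "kaniko",
                        "image": "gcr.io/kaniko-project/executor:latest",
                        "args": [
                            f"--context={context}",
                            f"--dockerfile={dockerfile}",
                            f"--destination={destination}",
                        ],
                    }
                ],
            },
        }


    if __name__ == "__main__":
        pod = kaniko_pod_manifest(
            "builder-image-build",                          # placeholder
            "git://example.org/some/repo.git",              # placeholder
            "Dockerfile",                                   # placeholder
            "registry.example.org/stx/builder:dev",         # placeholder
        )
        print(pod["spec"]["containers"][0]["args"])

No docker daemon is involved: the executor builds the image entirely inside
its own container and pushes the result to the destination registry.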

For the more complicated cases that need to access docker in other ways, we
would use sysbox [5], a tool for running system software, including Docker,
inside docker containers. This method is appropriate for building application
images, such as OpenStack containers.

Alternatives
------------

Tekton
^^^^^^

We do not have to use Tekton. We could simply run build commands directly in
k8s PODs controlled by the build scripts (Python), with Jenkins on top to
manage build schedules and artifact archiving. This would require us to
maintain a sizable chunk of the pipeline logic in Jenkins. Jenkins is hard to
install and automate, making the testing of updates to the pipelines a
challenge. Jenkins' automation API is somewhat unstable and uses an obscure
pipeline definition language. We expect a Tekton-based approach to be largely
free of these shortcomings.

On the other hand, Tekton is not as mature as Jenkins.

Docker image builds
^^^^^^^^^^^^^^^^^^^

Instead of kaniko and sysbox, we could use docker-in-docker [6] to build
docker images in k8s. This method has multiple problems linked to kernel
security and I/O performance [7].

We could also mount the host's docker daemon socket inside any containers/pods
that need to interact with docker. This would leave container instances behind
on the host and would require additional scripting to clean them up.

Impact on build tools installations
-----------------------------------

Individual contributors will be able to continue using Minikube, as they do
now.

Installing and configuring Kubernetes itself is beyond the scope of this
document. The service and POD definitions used by the build tools shall be
reusable (as Helm charts) no matter what the surrounding infrastructure looks
like.

Open questions
--------------

Persistent storage
^^^^^^^^^^^^^^^^^^

The builds would need to persist these types of files:

* Debian packages and other files (tarballs, etc.) used as build inputs. This
  will be handled by Pulp, whose underlying storage facility is to be
  determined.
* Debian packages produced by the build. This will be handled by Pulp as well.
* Debian package mirror. This may be handled by Pulp as well. It is
  currently implemented as a custom script on CENGN, outside of k8s.
* Other files produced by the build (ISO files, docker image list files). We
  expect to use Pulp for this as well.
* Log files. These are normally stored within k8s itself, as well as in
  individual POD containers. We would probably need to export them for ease of
  access. CENGN users would expect log files as simple downloadable files,
  since we are not proposing to make any k8s GUIs available to the public at
  this point. Elasticsearch may be helpful (searchable database of logs, among
  other things), but it needs a lot of CPU and RAM.
* Docker images. Official images (i.e. build outputs) are to be published to
  an external Docker registry (Docker Hub).

It is not clear whether the build would require a shared persistent file
system (e.g. for passing build artifacts between build steps). Such a file
system is difficult to implement, and target k8s installations may not have
one available for use by us. Without a shared file system, builds will take
longer to complete due to having to download and copy many files.
Contrast this with the older CentOS build system, which relies on a shared
file system and uses symbolic links for file sharing.

The options under consideration are:

* No shared file system. As a workaround, all builds' PODs will have to be
  scheduled to run on the same node. Downside: PODs cannot be scheduled on
  different nodes.
* An object storage service such as MinIO [3] (non-shared, artifacts have to
  be copied, no symlinks, etc.) may be used for archiving artifacts, as well
  as for passing artifacts between build stages. Downside: slow.
* NFS could be used as a shared file system. Downside: slow.
* Ceph. Downside: seems complicated.
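
The same-node workaround listed above could be expressed with a standard
Kubernetes ``nodeSelector`` on the well-known ``kubernetes.io/hostname`` node
label. The helper below is a sketch with a hypothetical name and node; it
shows the mechanism only and is not a decision on how the build tools would
implement it.

.. code-block:: python

    import copy


    def pin_to_node(pod_manifest, node_name):
        """Return a copy of `pod_manifest` constrained to run on `node_name`."""
        pinned = copy.deepcopy(pod_manifest)
        selector = pinned.setdefault("spec", {}).setdefault("nodeSelector", {})
        # Well-known label that Kubernetes sets on every node.
        selector["kubernetes.io/hostname"] = node_name
        return pinned


    if __name__ == "__main__":
        base = {
            "apiVersion": "v1",
            "kind": "Pod",
            "metadata": {"name": "example-builder"},  # placeholder name
            "spec": {"containers": []},
        }
        print(pin_to_node(base, "build-node-1")["spec"]["nodeSelector"])

Pinning every POD this way lets them exchange files through node-local
storage, at the cost of the scheduling flexibility noted as the downside.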

Artifact retention & backups
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On CENGN, it is not clear whether independent, isolated backup storage is
available. We could save backups on one of the 2 build servers, making sure
important files (released builds, etc.) are saved on 2 physical machines.

StarlingX vs Kubernetes
^^^^^^^^^^^^^^^^^^^^^^^

* Once StarlingX switches to Debian, the build server would have to be
  re-imaged, which will disrupt daily builds.
* We do not need many of the functions that StarlingX provides; k8s is
  sufficient.
* StarlingX is not optimized for running build jobs.
* If we use k8s, we should pick a stable base OS with a long shelf life so
  that OS upgrades can be deferred, while being able to upgrade k8s at will.
* If we use StarlingX, we should pick the latest official release (6.0).

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None for StarlingX deployments. Kubernetes clusters used for builds have
security implications that will have to be considered.

Other end user impact
---------------------

None

Performance impact
------------------

None

Other deployer impact
---------------------

None

Developer impact
----------------

The current workflow based on Minikube will continue to be supported.
Organizations will gain the ability to take advantage of full Kubernetes
installations for centralized builds.

Upgrade impact
--------------

None for StarlingX. Kubernetes upgrades are covered in [8].

Implementation
==============

Assignee(s)
-----------

* Davlet Panech - dpanech
* Luis Sampaio - lsampaio

Repos impacted
--------------

starlingx/tools

Work Items
----------

See storyboard.

Dependencies
============

None

Testing
=======

As the scope of this spec is restricted to the building of StarlingX,
it does not introduce any additional runtime testing requirements. As
this change is proposed to take place alongside the move to Debian,
full runtime testing is expected as part of that spec.

Building under full Kubernetes will require validation to ensure that the
outcomes match those expected when building in a Minikube environment.

Documentation Impact
====================

StarlingX Build Guide
https://docs.starlingx.io/developer_resources/build_guide.html -
add instructions for full Kubernetes environments.

References
==========

[1] StarlingX: Debian Build Spec --
https://docs.starlingx.io/specs/specs/stx-6.0/approved/starlingx_2008846_debian_build.html

[2] Tekton, a CI pipeline engine for k8s --
https://tekton.dev/

[3] MinIO, an Amazon S3-compatible object storage system --
https://min.io/

[4] Kaniko, a tool to build container images from a Dockerfile, inside a
container or Kubernetes cluster --
https://github.com/GoogleContainerTools/kaniko

[5] Sysbox, a container runtime that sits below Docker --
https://github.com/nestybox/sysbox

[6] Docker in Docker --
https://hub.docker.com/_/docker/

[7] Using Docker-in-Docker for your CI or testing environment? Think twice. --
https://jpetazzo.github.io/2015/09/03/do-not-use-docker-in-docker-for-ci/

[8] Kubernetes --
https://kubernetes.io/docs/home/

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - STX-7.0
     - Introduced