Kubernetes Power Manager integration on StarlingX

Kubernetes Power Manager is a Kubernetes Operator designed to expose and
utilize the power control technologies present in some processors in a
Kubernetes cluster. Its main application is directed to power control in
situations of workloads in known periods and power optimization even in high
performance workloads.

Story: 2010737
Task: 47983
Change-Id: I596577571e5b66ddf81f86687c05350e09bbed3c
Signed-off-by: Romão Martines <romaomatheus.martinesdejesus@windriver.com>
Romão Martines 2023-05-12 09:56:46 -04:00 committed by Davi Frossard
parent d3421cc962
commit 2231fb7df8
1 changed files with 402 additions and 0 deletions

@ -0,0 +1,402 @@
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

=================================================
Kubernetes Power Manager integration on StarlingX
=================================================

Storyboard: `#2010737`_

The objective of this spec is to introduce configurable power management in
the StarlingX platform.


Problem description
===================

Current versions of StarlingX do not offer a comprehensive set of power
management features. Control over the maximum frequency that the processors
can assume is limited: it is generalized, i.e., it does not allow
individualized control of the CPUs/cores.

Users, however, have power management needs with greater scope and finer
granularity, focused on containerized applications that use power profiles
individually per core and/or application. Among these needs, we can highlight
the control of acceptable frequency ranges (minimum and maximum frequency) per
core, the behavior of the core within this range (governor), which idle states
(C-states) a given core can enter, and the behavior of the system when facing
workloads with known intervals/demands.

`Kubernetes Power Manager`_ is a Kubernetes Operator designed to expose and
utilize the power control technologies present in some processors in a
Kubernetes cluster. Its main applications are power control for workloads
that run in known periods and power optimization even for high-performance
workloads.

By controlling CPU performance states (P-states) and CPU idle states
(C-states), the tool allows each core to be controlled individually according
to the needs of each application's workload. Because of this feature and its
fit with the StarlingX proposal, this spec evaluates the applicability,
requirements, and quality of operation of Kubernetes Power Manager, as well as
its integration into the StarlingX platform.

The sections :ref:`Proposed Changes` and :ref:`Work Item` present a study of
Kubernetes Power Manager and all the changes required in the StarlingX
platform.


Kubernetes Power Manager Components
-----------------------------------

* Power Manager: controls the nodes and serves as the manager, or source of
  truth, gathering information on the power profiles applied to each node.
* Power Config Controller: part of the Power Manager; responsible for
  checking whether a default power configuration is present for the node and,
  when it is, starting the Power Node Agent.
* Power Config: describes the power configuration of a given node, indicating
  one or more profiles that can be used.
* Power Node Agent: per-node pod managed by a DaemonSet, responsible for
  managing the power profiles applied. It communicates with the Power Manager
  to establish the power policies to be applied to the node.
* Power Profile: establishes CPU operating frequency ranges, Energy
  Performance Preference (EPP), and governor. The profile is generic, that
  is, it only describes a possible style of power control; applying it is
  the responsibility of the Power Workload. Pod deployment files, or Power
  Config files, can indicate the profiles they want to use by including the
  device power.intel.com/<POWERPROFILE>. Note that all CPUs not assigned to a
  specific power profile are pooled in a profile known as "Shared". This
  profile must be created manually by the user.
* Power Workload: responsible for applying a Power Profile. Its scope is set
  automatically when a pod requests a default power profile, or via a
  configuration file that describes the affected CPUs. Preset profiles have
  workloads created automatically. The Shared Power Profile and other custom
  profiles do not have an automatically assigned workload (it needs to be
  created manually).

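
The relationship between these resources can be sketched with the manifests
below. This is a minimal, non-authoritative sketch: the "power-management"
node selector, the node name "worker-0", the frequency values, and the
reserved CPU list are assumptions chosen for illustration. The field names
follow the upstream `Kubernetes Power Manager`_ custom resources as we
understand them for release v2.2.0 and should be checked against the
installed CRDs.

.. code-block:: yaml

   # Power Config: selects the nodes to manage and the profiles offered.
   apiVersion: power.intel.com/v1
   kind: PowerConfig
   metadata:
     name: power-config
     namespace: intel-power
   spec:
     powerNodeSelector:
       power-management: enabled      # assumed selector label
     powerProfiles:
     - performance
   ---
   # Shared Power Profile: not preset, so the user creates it manually.
   # Assumption: some upstream releases also expect an explicit shared flag.
   apiVersion: power.intel.com/v1
   kind: PowerProfile
   metadata:
     name: shared
     namespace: intel-power
   spec:
     name: shared
     max: 1500                        # MHz, illustrative value
     min: 1000                        # MHz, illustrative value
     epp: power
     governor: powersave
   ---
   # Power Workload that applies the Shared profile to the unassigned CPUs
   # of one node. It is not created automatically.
   apiVersion: power.intel.com/v1
   kind: PowerWorkload
   metadata:
     name: shared-worker-0-workload
     namespace: intel-power
   spec:
     name: shared-worker-0-workload
     allCores: true
     # Assumption: a plain list of CPU ids; the shape of this field may
     # differ between upstream releases.
     reservedCPUs:
     - 0
     - 1
     powerNodeSelector:
       kubernetes.io/hostname: worker-0
     powerProfile: shared

Preset profiles such as "performance" need no explicit Power Workload, since
one is created automatically when a pod requests the profile.
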

Use Cases
---------

After installing and enabling Kubernetes Power Manager, the user will be able
to indicate which power settings are assigned to the cores of a given
application. The three examples below describe common usage situations.
Further details can be found in the `Kubernetes Power Manager`_
documentation.

* Example A: The user wants the application to have high performance. Fragment
  of *pod_spec.yaml* to be deployed:

  .. code-block:: yaml

     # (...)
     resources:
       requests:
         cpu: "4"
         memory: "1G"
         power.intel.com/performance: "4"
       limits:
         cpu: "4"
         memory: "1G"
         power.intel.com/performance: "4"
     # (...)

* Example B: On a server with only two C-state levels available (C3 and C4),
  the user wants the cores using the performance profile to be kept at the
  shallower level (C3), while idle cores may enter the deepest level (C4).
  Fragment of the *c-state.yaml* profile configuration file:

  .. code-block:: yaml

     apiVersion: power.intel.com/v1
     kind: CStates
     metadata:
       name: worker-0
     spec:
       # (...)
       sharedPoolCStates:
         C3: false
         C4: true
       exclusivePoolCStates:
         performance:
           C3: true
           C4: false
       # (...)

* Example C: The user wants to create a profile specific to their needs.

  First step: deploy *custom-profile.yaml*. In this case the profile is named
  "one-profile". The min, max, epp, and governor values can be set by the
  user.

  .. code-block:: yaml

     apiVersion: power.intel.com/v1
     kind: PowerProfile
     metadata:
       name: one-profile
       namespace: intel-power
     spec:
       name: one-profile
       max: 2200
       min: 2000
       epp: power
       governor: powersave

  Second step: deploy the *pod_spec.yaml*. Fragment to be deployed:

  .. code-block:: yaml

     # (...)
     resources:
       requests:
         cpu: "1"
         memory: "1G"
         power.intel.com/one-profile: "1"
       limits:
         cpu: "1"
         memory: "1G"
         power.intel.com/one-profile: "1"
     # (...)

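
For context, the fragment of Example A could sit inside a complete Pod
manifest such as the sketch below. The pod name, container name, and image
are hypothetical placeholders. Requests equal limits and the CPU count is an
integer, which gives the pod the Guaranteed QoS class in Kubernetes; as far
as we understand, the Power Manager relies on such exclusively allocated
CPUs to apply a profile per core.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: example-power-pod                    # hypothetical name
   spec:
     containers:
     - name: example-power-container            # hypothetical name
       image: registry.example.com/app:latest   # hypothetical image
       resources:
         # The number of profile devices matches the number of CPUs.
         requests:
           cpu: "4"
           memory: "1G"
           power.intel.com/performance: "4"
         limits:
           cpu: "4"
           memory: "1G"
           power.intel.com/performance: "4"
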

.. _Proposed Changes:

Proposed change
===============

When disabled, Kubernetes Power Manager will not change the standard
StarlingX behavior (keeping the system running at maximum performance the
entire time). When enabled, however, the power management system will allow
the user to apply power settings as needed, under the conditions described
below.

The power management system is based on four standard power profiles plus any
user-customized profiles. Whenever an application needs high performance, for
example, the power profile "performance" must be declared in its deployment
file. The Power Manager, in turn, will configure the profile on the CPU(s)
assigned by Kubernetes to the Pod.

The default profiles "performance", "balanced-performance", "balanced-power"
and "power" will be configured automatically during the installation process.
It will be up to the user to create new profiles as needed. All cores that are
idle or not assigned to a Pod will have their power profile set to the full
frequency range (minimum equal to the minimum supported by the processor and
maximum equal to the maximum supported), as currently occurs in the system
without power control.

The default C-states that the cores can enter will also be assigned. The user
will be free to change the C-states individually, indicating which states a
certain core can enter, or by group, indicating which states the cores of a
certain power profile can enter.

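
The two granularities could be expressed in a single CStates resource, as in
the sketch below. It reuses the node name and C-state names from Example B.
The core id "5" is a hypothetical value, and the individualCoreCStates field
is taken from the upstream `Kubernetes Power Manager`_ documentation as we
recall it, so it should be verified against the installed CRD.

.. code-block:: yaml

   apiVersion: power.intel.com/v1
   kind: CStates
   metadata:
     name: worker-0
   spec:
     # By group: every core running the "performance" profile.
     exclusivePoolCStates:
       performance:
         C3: true
         C4: false
     # Individually: one specific core, keyed by its id (hypothetical id).
     individualCoreCStates:
       "5":
         C3: false
         C4: true
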

All the functionality accessible to the user can be controlled by applying
the appropriate YAML files.

Note that the user will be free to modify the power settings of the cores
reserved for system support (platform cores), but all these settings will be
overwritten during the lock/unlock process to maintain the integrity of the
system.

After installing Kubernetes Power Manager, it must be enabled on the desired
hosts by setting the label "power-management=enabled", which removes the
restriction to C-state C0 on nodes where the worker function is present.


Alternatives
------------

None

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Other end-user impact
---------------------

Some SysInv/Horizon commands may be deprecated with the Kubernetes Power
Manager integration (see all user-configurable parameters in
`Host CPU MHz Parameters Configuration`_).

If the user tries to use these deprecated parameters while Kubernetes Power
Manager is enabled, the system should reject the action and notify the user.


Performance Impact
------------------

Enabling Kubernetes Power Manager on StarlingX can affect power consumption,
latency, and throughput. Some considerations for each aspect:

* Power Consumption: By actively monitoring and controlling power usage
  through policies, Kubernetes Power Manager can optimize power consumption
  based on workload demands, potentially reducing overall power consumption in
  the cluster. On the other hand, incorrect or inconsistent configuration can
  lead to degraded performance or increased power consumption.
* Latency: C-states range from C0 to Cn. C0 indicates an active state. All
  other C-states (C1-Cn) represent idle sleep states with different parts of
  the processor powered down. As the C-states get deeper, the exit latency
  (the time to transition back to C0) becomes longer and the power savings
  become greater. This could slightly increase the time required for resource
  management operations, such as scaling and scheduling, as well as for
  platform and end-user tasks. However, an appropriate Power Manager
  configuration can reduce the magnitude of this impact.
* Throughput: The impact on throughput depends on how well Kubernetes Power
  Manager is configured to handle resource allocation while considering power
  constraints, potentially optimizing the cluster's performance and increasing
  throughput. However, if the Power Manager makes suboptimal decisions, it may
  impact throughput negatively.

The exact performance impact will depend on several factors, such as workload
characteristics, cluster configuration, and the specific configuration of the
Power Manager. Thorough testing in the end-user environment is recommended to
understand the precise effects on power consumption, latency, throughput, and
other aspects.


Other deployer impact
---------------------

None

Developer impact
----------------

None

Upgrade impact
--------------

None


Implementation
==============

Assignee(s)
-----------

Primary assignee:

* Guilherme Batista Leite (guilhermebatista)

Other contributors:

* Davi Frossard (dbarrosf)
* Eduardo Alberti (ealberti)
* Fabio Studyny Higa (fstudyny)
* Pedro Antônio de Souza Silva (pdesouza)
* Reynaldo Patrone Gomes Filho (rpatrone)
* Romão Martines (rmartine)
* Thiago Antonio Miranda (tamiranda)

Repos Impacted
--------------

* starlingx/docs
* starlingx/config
* starlingx/stx-puppet
* starlingx/app-kubernetes-power-manager (new)


.. _Work Item:

Work Items
----------

Investigations and design
*************************

* Investigation and evaluation of CPU Power Manager architecture and
  requirements
* Evaluation of platform and application control for P-states and C-states
* Design proposal and review for P-state and C-state control
* Minor customizations for Kubernetes Power Manager may also be introduced,
  for instance, the modification to accept the use of isolated CPUs.

Kubernetes Power Manager Integration
************************************

* Installation via system application
* Default policy configuration:

  * Default P-state configuration and policy for platform cores (P-states
    enabled with full frequency range)
  * Default C-state configuration and policy for platform cores (C-states
    enabled with maximum idle state, limited to C6)
  * Default P-state configuration and policy for application cores (P-states
    enabled with full frequency range)
  * Default C-state configuration and policy for application cores (C-states
    enabled with maximum idle state, limited to C1)


Dependencies
============

* `Kubernetes Power Manager`_ release v2.2.0
* `Node Feature Discovery`_ v0.13.1

Testing
=======

System configuration
--------------------

The system configurations assumed for testing are:

* AIO-SX
* Standard

Test Scenarios
--------------

* Functional tests for Kubernetes Power Manager and its possible
  customizations.
* The usual unit testing in the impacted code areas.
* Performance testing to identify and address any performance impacts.
* Backup and restore tests.
* Upgrade test to verify the behavior of the deprecated Host CPU MHz
  parameters.

Documentation Impact
====================

The end-user documentation will need to be updated to cover the deployment and
configuration of the Kubernetes Power Manager application, as well as the
customization of default and new policies.


References
==========

#. `Kubernetes Power Manager`_

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - stx-9.0
     - Introduced

.. Abbreviations
.. |EPP| replace:: :abbr:`EPP (Energy Performance Preference)`

.. Links
.. _#2010737: https://storyboard.openstack.org/#!/story/2010737
.. _Kubernetes Power Manager: https://github.com/intel/kubernetes-power-manager
.. _Node Feature Discovery: https://github.com/kubernetes-sigs/node-feature-discovery
.. _Host CPU MHz Parameters Configuration: https://docs.starlingx.io/node_management/kubernetes/host-cpu-mhz-parameters-configuration-d9ccf907ede0.html