Power Management Configuration

Kubernetes Power Management is a tool that allows you to expose and use
the power control technologies present in some processors in a Kubernetes
cluster. Its main application is directed to energy control in situations
of workloads in known periods and energy optimization even in high
performance workloads.

Story: 2010737
Task: 47983
Change-Id: I596577571e5b66ddf81f86687c05350e09bbed3c
Signed-off-by: Romão Martines <romaomatheus.martinesdejesus@windriver.com>
Romão Martines 2023-05-12 09:56:46 -04:00 committed by Davi Frossard
parent d3421cc962
commit 0b7e2417fc
1 changed files with 296 additions and 0 deletions


@ -0,0 +1,296 @@
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

=================================================
Kubernetes Power Manager integration on StarlingX
=================================================

Storyboard: `#2010737`_

The objective of this spec is to introduce Configurable Power Management in
StarlingX Platform.

Problem description
===================
StarlingX, in its current versions, does not offer a comprehensive set of
features for energy management. Control over the maximum frequency that the
processors can assume is particularly limited: it is currently applied
globally and does not allow individual control of CPUs/cores.

Users, however, have energy management needs with a broader scope and finer
granularity, focused on containerized applications that use energy profiles
per core and/or per application. Among these needs, we can highlight control
of the acceptable frequency range (minimum and maximum frequency) per core,
the behavior of the core within this range (governor), which energy levels
(c-states) a given core can access, and the behavior of the system when
facing workloads with known intervals/demands.

`Kubernetes Power Manager`_ is a tool that allows you to expose and use the
power control technologies present in some processors in a Kubernetes cluster.
It is aimed mainly at controlling energy use for workloads with known activity
periods and at optimizing energy use even for high performance workloads.
By controlling frequency (p-states) and activation levels of CPU functions
(c-states), the tool allows each core to be controlled individually according
to the needs of each application's workload. Because of this feature and its
fit with the StarlingX proposal, this spec evaluates the applicability,
requirements and quality of operation of the Kubernetes Power Manager and its
integration on the StarlingX platform.
The study of Power Management and all changes required in the StarlingX
Platform are described below in :ref:`Proposed Changes` and
:ref:`Work Item`.
Kubernetes Power Management Components
--------------------------------------
* Power Manager: controls the nodes and serves as a manager, or source of
  truth, gathering information on the energy profiles applied to each node
* Power Config Controller: part of the Power Manager, it is responsible for
  checking whether a default power configuration exists for the node and,
  when present, starting the Power Node Agent
* Power Config: describes the energy configuration of a given node,
  indicating one or more profiles that can be used
* Power Node Agent: a daemonset applied to each node and responsible for
  managing the energy profiles applied. It communicates with the Power
  Manager to establish the power policies to be applied to the node
* Power Profile: power profiles that establish CPU operating frequency
  ranges, Energy Performance Preference (EPP), and governor. A profile is
  generic, that is, it only describes a possible style of energy control;
  applying it is the responsibility of the Power Workload. The pods'
  deployment files, or Power Config files, can indicate the profiles they
  want to use by requesting the resource
  ``power.intel.com/<POWERPROFILE>``. It is also important to note that all
  CPUs not assigned to a specific power profile are pooled in a profile known
  as the "Shared" profile. This profile must be created manually by the user
* Power Workload: responsible for applying a Power Profile. Its scope is
  set automatically, when a pod requests a default power profile, or via a
  configuration file that describes the affected CPUs. Preset profiles have
  workloads created automatically. The Shared Power Profile and other
  customized profiles do not have an automatically assigned workload (it
  needs to be created manually)
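
The manual creation of the Shared profile and its workload, mentioned above,
can be sketched with the two manifests below. This is an illustration, not the
configuration shipped by StarlingX: the field names follow the upstream
Kubernetes Power Manager examples as we understand them and should be checked
against the release in use. The ``intel-power`` namespace, the frequency
values, the reserved CPU list and the ``<NODE_NAME>`` placeholder are
assumptions.

.. code-block:: yaml

   # Shared profile: covers the CPUs not assigned to any other profile.
   apiVersion: power.intel.com/v1
   kind: PowerProfile
   metadata:
     name: shared
     namespace: intel-power          # assumed namespace
   spec:
     name: "shared"
     min: 1000                       # MHz, illustrative value
     max: 1500                       # MHz, illustrative value
     epp: "power"
     governor: "powersave"
     shared: true                    # assumption: flag used in some upstream releases
   ---
   # Workload that applies the Shared profile on one node.
   apiVersion: power.intel.com/v1
   kind: PowerWorkload
   metadata:
     name: shared-<NODE_NAME>-workload
     namespace: intel-power          # assumed namespace
   spec:
     name: "shared-<NODE_NAME>-workload"
     allCores: true
     reservedCPUs:                   # assumed to match the platform-reserved CPUs
       - 0
       - 1
     powerNodeSelector:
       kubernetes.io/hostname: <NODE_NAME>
     powerProfile: "shared"

The workload is bound to a single node through ``powerNodeSelector``, so one
such workload would be needed per node that uses the Shared profile.
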
Use Cases
---------
Manage power configurations through StarlingX Platform, controlling frequency
and also activation levels of CPU functions.
.. _Proposed Changes:
Proposed change
===============
This change proposes that the energy management system be activated according
to the user's needs:
- When disabled, system behavior will not be changed;
- When installed, the nodes can have the power management system enabled or
disabled.
The energy management system is based on four standard energy consumption
profiles and possible user-customized profiles. Whenever you want a certain
application to have high performance, for example, the energy profile
"performance" must be declared in its deployment file. The power manager, in
turn, will configure the profile on the Core assigned by Kubernetes to the
Pod.
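
As a sketch of how a deployment file could declare the "performance" profile,
the Pod below requests it as a resource alongside its CPU request. The pod
name, image, command and sizes are placeholders, and the assumption (taken
from our reading of the upstream project) is that the number of
``power.intel.com/performance`` units matches the number of CPUs and that
requests equal limits so the Pod receives exclusive cores.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: example-power-pod         # hypothetical name
   spec:
     containers:
     - name: example-power-container # hypothetical name
       image: ubuntu                 # placeholder image
       command: ["/bin/sh"]
       args: ["-c", "sleep 15000"]
       resources:
         requests:
           memory: "200Mi"
           cpu: "2"
           # One unit of the profile resource per requested CPU
           power.intel.com/performance: "2"
         limits:
           memory: "200Mi"
           cpu: "2"
           power.intel.com/performance: "2"
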
The default profiles "performance", "balanced-performance", "balanced-power"
and "power" will be configured automatically during the installation process.
It will be up to the user to enable new profiles as needed. All cores that are
idle or not assigned to a Pod will have their power profile set to the widest
frequency range (minimum equal to the minimum supported by the processor and
maximum equal to the maximum supported), as currently occurs in the system
without power control.
The standard energy level (c-state) that the cores will assume will also be
assigned. The user will be able to change the c-states of application or
isolated cores according to their needs.
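
A c-state change of this kind might look like the manifest below. It is a
sketch under assumptions: the ``CStates`` kind and its fields follow the
upstream Kubernetes Power Manager examples as we recall them and should be
verified for the release in use, while the namespace, the core number and the
chosen states are purely illustrative.

.. code-block:: yaml

   apiVersion: power.intel.com/v1
   kind: CStates
   metadata:
     name: <NODE_NAME>               # assumed to match the target node name
     namespace: intel-power          # assumed namespace
   spec:
     # Cores in the shared pool may enter C1
     sharedPoolCStates:
       C1: true
     # Cores used by pods with the "performance" profile stay out of C1
     exclusivePoolCStates:
       performance:
         C1: false
     # Per-core override for one core, illustrative core number
     individualCoreCStates:
       "3":
         C1: true
         C6: false
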
All the functionality accessible to the user can be controlled by applying
the appropriate YAML files.
It is important to note that the user will be free to modify the energy
settings of the cores intended for system support (platform cores), but all
these settings will be overwritten during the lock/unlock process, in order to
maintain the integrity of the system.
Alternatives
------------
None
Data model impact
-----------------
None
REST API impact
---------------
None
Security impact
---------------
None
Other end-user impact
---------------------
Some SysInv/Horizon commands may be deprecated with Kubernetes Power Manager
integration. One example (the list is not exhaustive) is the
`Host CPU MHz Parameters Configuration`_.
Performance Impact
------------------
None
Other deployer impact
-----------------------
None
Developer impact
----------------
None
Upgrade impact
--------------
None
Implementation
==============
Assignee(s)
-----------
Primary assignee:
* Guilherme Batista Leite (guilhermebatista)
Other contributors:
* Davi Frossard (dbarrosf)
* Eduardo Alberti (ealberti)
* Fabio Studyny Higa (fstudyny)
* Pedro Antônio de Souza Silva (pdesouza)
* Reynaldo Patrone Gomes Filho (rpatrone)
* Romão Martines (rmartine)
* Thiago Antonio Miranda (tamiranda)
Repos Impacted
--------------
List repositories in StarlingX that are impacted by this spec:
.. _Work Item:
Work Items
----------
Investigations and design
*************************
* Investigation and evaluation of CPU Power Manager architecture
  and requirements
* Evaluation of platform and application control for p-states and
  c-states
* Design proposal and review for p-state and c-state control
* Minor customizations for Kubernetes Power Manager may also be introduced,
  for instance, a modification to accept the use of isolated CPUs
* Resource requirements of new Power Manager containers
* System platform CPU and Memory Scaling requirements update based on C6
  enablement (if required)
Power Manager Integration
*************************
* Kubernetes Power Manager installation via system application
(power-management)
* Default policy configuration (platform and applications)
Dependencies
============
* `Kubernetes Power Manager`_ release v2.2.0
* `Node Feature Discovery`_ v0.13.1
Testing
=======
System configuration
--------------------
The system configurations that we are assuming for testing are:
* AIO-SX
* Standard
Test Scenarios
--------------
The following tests should be defined or changed to cover this spec:
* The usual unit testing in the impacted code areas
* Full system regression of all StarlingX application functionality (system
  application commands, lifecycle actions, etc.)
* Performance testing to identify and address any performance impacts
* Backup and restore tests
* Upgrade and rollback tests
Documentation Impact
====================
The end-user documentation will need to be changed to add Power Manager
application deployment and configuration. The customization of default and
new policies also needs to be documented.
The documentation should also review the Pod workload processes with regard
to their integration and usage under the Power Management changes.
References
==========
#. `Kubernetes Power Manager`_
History
=======
.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - stx-9.0
     - Introduced

.. Abbreviations
.. |EPP| replace:: :abbr:`EPP (Energy Performance Preference)`
.. |PLL| replace:: :abbr:`PLL (Phase Locked Loop)`
.. |PECI| replace:: :abbr:`PECI (Platform Environment Control Interface)`
.. |MSR| replace:: :abbr:`MSR (Model-Specific Register)`
.. |MSRs| replace:: :abbr:`MSRs (Model-Specific Registers)`
.. Links
.. _#2010737: https://storyboard.openstack.org/#!/story/2010737
.. _Kubernetes Power Manager: https://github.com/intel/kubernetes-power-manager
.. _Node Feature Discovery: https://github.com/kubernetes-sigs/node-feature-discovery
.. _Host CPU MHz Parameters Configuration: https://docs.starlingx.io/node_management/kubernetes/host-cpu-mhz-parameters-configuration-d9ccf907ede0.html