.. Merge "Add Intel Ethernet Operator spec" (commit 509f2d6fa8)

Integration of Intel Ethernet Operator to StarlingX Platform
============================================================

Storyboard:
https://storyboard.openstack.org/#!/story/2010562

In a cloud environment, network interface adapters require a cloud-based
management system. These adapters may have advanced functionality which is
best combined under a single operator.


Problem description
===================

Firmware for network adapters may need management. Network adapter
personalization may need management. Flow rules must also be activated so
that traffic reaches the interfaces attached to pods.

Use Cases
---------

* Update of Firmware on interface adapters
* Update of Dynamic Device Personalization on interface adapters
* Update of Flow Configuration on interface adapters to allow steering

Dynamic Device Personalization (DDP) is the on-chip programmable pipeline
which allows deep and diverse protocol header processing. Flow configuration
allows the steering of traffic to particular VFs on the node.


Proposed change
===============

The Intel Ethernet Operator (IEO) will allow the firmware of Intel E810 Series
NICs to be updated in a container environment. Nodes will be drained, taken
out of service and restarted as required by the update. Firmware and DDP
packages can be downloaded from a suitable HTTP server (configurable in the
EthernetClusterConfig Custom Resource).

The Intel Ethernet Operator also requires some other plugins and operators:

IEO requires the SR-IOV Device Plugin, which makes SR-IOV resources available
in Kubernetes. For ease of configuration, the SR-IOV Network Operator is also
required. The SR-IOV Network Operator requires the use of Node Feature
Discovery. Both the SR-IOV Network Operator and Node Feature Discovery are
installed as dependencies alongside IEO in the intel-ethernet-operator
namespace.

Flow rules require the inclusion of the Unified Flow Tool (UFT) server
application. UFT applies the flow rules and is called using the DPDK rte_flow
API. The API supports the Switch Filter rules that DPDK rte_flow supports.
UFT is included as part of the IEO installation.


Common features
---------------

Within the Ethernet operator on the control node, the controller deploys the
first asset, ethernet-discovery (the labeler). It is deployed as a pod on each
node, and it labels the node if a supported device is connected. The
controller also deploys a compatibility map, a config file specifying which
FW, DDP and kernel versions can work together.

The controller deploys the Ethernet daemon (FW/DDP daemon) as a DaemonSet.
Nodes with the appropriate label get a daemon pod deployed on them; other
nodes do not. The Ethernet daemon checks for a node configuration and creates
one if none is found. The daemon reconciles in a loop, gathers the status of
the required components (found devices, PCI address, MAC, FW, DDP version,
etc.) and updates the node configuration with a status. The user can then get
the status of all node configs or the status of a specific node config.


Firmware and DDP upgrade
------------------------

The user uploads the desired DDP package and/or nvmupdate package to an HTTP
server accessible by the cluster (the HTTP server and the upload mechanism are
out of scope of the operator). The user can then apply a new cluster
configuration with the preferred settings. The Ethernet Controller breaks it
down into smaller node configurations and updates them.

The Ethernet daemon reconciles in a loop, watching for updates:

* If the condition (the fields in the applied EthernetClusterConfig) is
  unchanged, the daemon ignores it.
* If new conditions are detected for other nodes, the daemon ignores them.
* When a condition change is detected for the daemon's own node, the daemon
  acts on it. It verifies the condition and denies the change if it cannot be
  met.

If the condition can be met, the daemon runs the appropriate actions to bring
the node to the desired condition (i.e. a DDP/FW update), in this order:

* Try to download the packages from the specified address on the HTTP server.
* Elect a leader to act as a controller.
* Cordon off and drain the node.
* Proceed with the updates.
* Reboot the node.
* Uncordon the node and release the leadership.

Once any update to the configuration is done, the daemon updates the node
configuration status. Once the update is finished, the user is able to get the
status of the update and the status of the node.

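
The daemon's decision logic described above can be sketched roughly as
follows. This is only an illustration of the ignore / deny / update branching,
not the operator's actual code; the names ``NodeConfig`` and ``decide`` and
the returned action strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical, simplified desired state for a single node."""
    node: str
    fw_url: Optional[str] = None
    ddp_url: Optional[str] = None


def decide(my_node: str, applied: Optional[NodeConfig],
           desired: NodeConfig, can_be_met: bool) -> str:
    """Return the action a daemon running on `my_node` would take."""
    if desired.node != my_node:
        return "ignore"   # the change targets another node
    if desired == applied:
        return "ignore"   # the condition is unchanged
    if not can_be_met:
        return "deny"     # verification of the condition failed
    return "update"       # download, drain, update, reboot, uncordon
```

In this sketch, a daemon on ``worker-01`` returns ``"update"`` only when the
desired configuration names ``worker-01``, differs from what is already
applied, and passes verification.
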

Flow Configuration
------------------

To allow the Flow Configuration feature to compose the flow rules for the
network card's traffic, the deployment must use a trusted virtual function
(VF) from each physical function (PF). Usually the first VF (VF0) of each PF
has trust mode enabled and is then bound to the vfio-pci driver. This VF pool
must be created by the user and be allocatable as a Kubernetes resource.

Rules are written in an rte_flow-like form and allow deep matching of packet
type flows to interfaces associated with pods on a cluster. Rules can be
written for the cluster and for pods. During pod scheduling, they are
instantiated on a node to configure the flow offload hardware on the interface
to target a pod attached via a particular VF.


Alternatives
============

It is possible to connect to each node, then untar and install the firmware
and device profiles manually. Similarly, flow offloads could be configured
individually on each node.


Data model impact
=================

IEO introduces the following CRDs on the cluster:

- EthernetClusterConfig
- FlowConfigNodeAgentDeployment
- NodeFlowConfig
- ClusterFlowConfig
- EthernetNodeConfig (NICs configuration status, not created by user)


EthernetClusterConfig
=====================

.. code-block:: yaml

    apiVersion: ethernet.intel.com/v1
    kind: EthernetClusterConfig
    metadata:
      name: config
    spec:
      nodeSelectors:
        kubernetes.io/hostname: <hostname>
      deviceSelector:
        pciAddress: "<pci-address>"
      deviceConfig:
        fwURL: "<URL_to_firmware>"
        fwChecksum: "<file_checksum_SHA-1_hash>"
        ddpURL: "<URL_to_DDP>"
        ddpChecksum: "<file_checksum_SHA-1_hash>"

Parameters
----------

* ``name``: Name of the specific config
* ``kubernetes.io/hostname``: Hostname containing cards to be updated
* ``fwURL``: Accessible URL for the firmware file. Proxy may be needed
* ``fwChecksum``: Expected checksum of the firmware file
* ``ddpURL``: Accessible URL for the DDP file. Proxy may be needed
* ``ddpChecksum``: Expected checksum of the DDP file

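
The placeholders for ``fwChecksum`` and ``ddpChecksum`` indicate a SHA-1 hash
of the package file. A minimal sketch of producing and checking such a value
with the Python standard library is shown below. It assumes the fields hold a
hex-encoded digest; the helper names ``sha1_of_file`` and ``checksum_matches``
are illustrative and not part of the operator.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path: str, expected: str) -> bool:
    """Compare a file's SHA-1 digest with an expected hex string.

    Assumption: surrounding whitespace and letter case are not significant.
    """
    return sha1_of_file(path) == expected.strip().lower()
```

The value returned by ``sha1_of_file`` for the uploaded package is what would
be pasted into the corresponding checksum field.
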

FlowConfigNodeAgentDeployment
=============================

.. code-block:: yaml

    apiVersion: flowconfig.intel.com/v1
    kind: FlowConfigNodeAgentDeployment
    metadata:
      labels:
        control-plane: flowconfig-daemon
      name: flowconfig-daemon-deployment
      namespace: intel-ethernet-operator
    spec:
      DCFVfPoolName: openshift.io/cvl_uft_admin
      NADAnnotation: sriov-cvl-dcf

Parameters
----------

* ``name``: Name of the FlowConfigNodeAgentDeployment
* ``DCFVfPoolName``: Used SriovNetworkNodePolicy name
* ``NADAnnotation``: Used SriovNetwork name


NodeFlowConfig
===============

.. code-block:: yaml

    apiVersion: flowconfig.intel.com/v1
    kind: NodeFlowConfig
    metadata:
      name: worker-01
    spec:
      rules:
        - pattern:
            - type: RTE_FLOW_ITEM_TYPE_ETH
            - type: RTE_FLOW_ITEM_TYPE_IPV4
              spec:
                hdr:
                  src_addr: 10.56.217.9
              mask:
                hdr:
                  src_addr: 255.255.255.255
            - type: RTE_FLOW_ITEM_TYPE_END
          action:
            - type: RTE_FLOW_ACTION_TYPE_DROP
            - type: RTE_FLOW_ACTION_TYPE_END
          portId: 0
          attr:

Parameters
----------

* ``name``: Name of the config. It needs to match the node name
* ``pattern: type``: Header part to match on
* ``pattern: spec & mask``: Addresses to match for the rules
* ``action``: Alter the fate of matching traffic, its contents or properties
* ``attr``: Flow rule priority level
* ``portId``: Information to identify port on a node

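
The relationship between ``spec`` and ``mask`` in a pattern item follows the
rte_flow convention: only the bits set in the mask are compared. The sketch
below illustrates that comparison for an IPv4 source address. It is meant
purely to explain the semantics, since real matching happens in the NIC
hardware; the function name ``ipv4_matches`` is hypothetical.

```python
import ipaddress


def ipv4_matches(packet_addr: str, spec: str, mask: str) -> bool:
    """Return True if `packet_addr` equals `spec` on every bit set in `mask`."""
    addr_bits = int(ipaddress.IPv4Address(packet_addr))
    spec_bits = int(ipaddress.IPv4Address(spec))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return (addr_bits & mask_bits) == (spec_bits & mask_bits)
```

With the mask ``255.255.255.255`` used in the example above, only packets from
exactly ``10.56.217.9`` match. A mask of ``255.255.255.0`` would match any
source address in ``10.56.217.0/24``.
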

ClusterFlowConfig
=================

.. code-block:: yaml

    apiVersion: flowconfig.intel.com/v1
    kind: ClusterFlowConfig
    metadata:
      name: pppoes-sample
    spec:
      rules:
        - pattern:
            - type: RTE_FLOW_ITEM_TYPE_ETH
            - type: RTE_FLOW_ITEM_TYPE_IPV4
              spec:
                hdr:
                  src_addr: 10.56.217.9
              mask:
                hdr:
                  src_addr: 255.255.255.255
            - type: RTE_FLOW_ITEM_TYPE_END
          action:
            - type: to-pod-interface
              conf:
                podInterface: net1
          attr:
            ingress: 1
            priority: 0
      podSelector:
        matchLabels:
          app: vagf
          role: controlplane

Parameters
----------

* ``name``: Name of the config
* ``pattern: type``: Header part to match on
* ``pattern: spec & mask``: Addresses to match for the rules
* ``action``: Alter the fate of matching traffic, its contents or properties
* ``attr``: Flow rule priority level
* ``podSelector``: Labels associated with the particular pod

NOTE: Most of the object parameter names are consistent with the names given
in the official DPDK rte_flow documentation. For the full description of the
generic flow API, see https://doc.dpdk.org/guides/prog_guide/rte_flow.html.

During the course of execution, ClusterFlowConfig rules are broken down to
NodeFlowConfig rules. NodeFlowConfig rules can also be written manually.

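
One way to picture the breakdown of ClusterFlowConfig rules into per-node
rules is the sketch below. It is a simplified assumption of how such an
expansion could work, not the operator's implementation: pods are plain
dictionaries with hypothetical ``node`` and ``labels`` keys, and the sketch
does not attempt to resolve the pod interface to a particular VF.

```python
from collections import defaultdict


def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """Return True if the pod carries every label required by the selector."""
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def expand_to_nodes(cluster_config: dict, pods: list) -> dict:
    """Group the cluster rules by the node of each pod that matches."""
    match_labels = cluster_config["spec"]["podSelector"]["matchLabels"]
    rules = cluster_config["spec"]["rules"]
    per_node = defaultdict(list)
    for pod in pods:
        if selector_matches(match_labels, pod["labels"]):
            # Each matching pod contributes one copy of the rules to its node.
            per_node[pod["node"]].extend(rules)
    return dict(per_node)
```

Nodes that host no matching pod receive no rules in this sketch, which mirrors
the idea that rules are instantiated on a node only when a targeted pod is
scheduled there.
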

REST API impact
---------------

Standard extension of the K8s APIs based on the introduction of the above
CRDs.

Security impact
---------------

Existing K8s authentication and authorization apply to the standard extension
of the K8s APIs based on the introduction of the IEO CRDs.

Other end user impact
---------------------

On Intel Ethernet devices, the end user will have the capability to:

- control firmware and DDP packages
- configure flow rules
- display configuration status

Performance Impact
------------------

With the Intel Ethernet Operator installed, service pods run on master and
worker nodes at all times and consume some CPU and memory from the cluster
housekeeping resources, which we believe to be negligible. Periodic
reconciliation traffic between the controller-manager and the node daemons
may consume network resources as well, which we also assume to be negligible.


Other deployer impact
---------------------

None.

Developer impact
----------------

In StarlingX 8.0 and future releases, the /lib/firmware directory is
read-only. This creates a problem for any customer who wants to use a DDP
profile other than the one that comes preinstalled. The Intel ice driver looks
for a DDP package named intel/ice/ddp/stx-ice.pkg in the default firmware
search paths, which are /lib/firmware and /lib/firmware/updates. Both of these
paths are immutable, so there is currently no way to change the DDP package in
use. The solution is the alternate firmware search path that already exists in
the kernel
(https://docs.kernel.org/driver-api/firmware/fw_search_path.html). This
feature can be enabled by adding a suitable boot parameter. A contribution
that adds this to StarlingX has already been made (in the stx-puppet
repository).

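
The kernel documentation linked above describes the lookup order: a custom
path given through the ``firmware_class.path`` parameter is tried first,
followed by the built-in directories. The sketch below lists the candidate
locations in that order for a given firmware name. The custom path
``/var/lib/firmware`` and the kernel release string used in the usage note are
placeholders chosen for illustration, not values taken from StarlingX.

```python
import posixpath


def firmware_candidates(name: str, kernel_release: str,
                        custom_path: str = "") -> list:
    """List the paths the kernel would try for firmware `name`, in order."""
    roots = [
        custom_path,  # firmware_class.path; skipped when empty
        f"/lib/firmware/updates/{kernel_release}",
        "/lib/firmware/updates",
        f"/lib/firmware/{kernel_release}",
        "/lib/firmware",
    ]
    return [posixpath.join(root, name) for root in roots if root]
```

For example, ``firmware_candidates("intel/ice/ddp/stx-ice.pkg", "5.10.0",
"/var/lib/firmware")`` puts the writable custom location ahead of the
read-only ``/lib/firmware`` entries, which is what allows a different DDP
package to be picked up.
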

Upgrade impact
--------------

None. This is an optional operator.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
Rafal Lal

Other contributors:
Kevin Clarke

Repos Impacted
--------------

A new system-application repo will be created for the definition and building
of the intel-ethernet-operator application.

Work Items
----------

* Create the intel-ethernet-operator application package.
* Integrate the intel-ethernet-operator application with FluxCD. Add
  application upload/apply/remove/delete commands.
* Update docs.starlingx.io with a how-to on using intel-ethernet-operator to
  configure Ethernet cards.

Building images
---------------

The Intel Ethernet Operator team would like to move the building of the UFT
container image to StarlingX. The source code of the image is publicly
available, and we would provide build scripts. Images of the other components
would be built and made ready to pull by Intel.

Dependencies
============

None specific.


Testing
=======

Testing will be done on a multi-node cluster configuration.

* Testing packages across several package revisions.
* Validating firmware installs and DDP package installs.
* Testing that traffic flow is instantiated to the correct pods.
* Verifying that CRDs for a particular functionality effect the change on the
  cluster.
* Manually deleting or changing the configuration to validate that the
  controllers make the changes.
* Rebooting nodes to validate that the new configuration remains.
* Reloading drivers to validate that the new configuration remains.

Documentation Impact
====================

docs.starlingx.io will be updated for:

* How to use the intel-ethernet-operator application
* How to perform enhanced configuration of Ethernet devices with the CRDs
  supplied by the Ethernet Operator


References
==========

Intel® Ethernet Operator - Overview Solution Brief
https://networkbuilders.intel.com/solutionslibrary/intel-ethernet-operator-overview-solution-brief

Intel Ethernet Operator
https://github.com/intel/intel-ethernet-operator

Unified Flow Tool (UFT)
https://github.com/intel/UFT/tree/main

Intel Ethernet 810 series features
https://www.intel.com/content/www/us/en/products/details/ethernet/800-controllers/e810-controllers/docs.html

Node Feature Discovery
https://github.com/kubernetes-sigs/node-feature-discovery

SR-IOV Network Operator
https://github.com/k8snetworkplumbingwg/sriov-network-operator

SR-IOV Network Device Plugin for Kubernetes
https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Date
     - Description
   * - 02-Feb-2023
     - Introducing Ethernet operator
   * - 02-Feb-2023
     - Updated with comments from StarlingX Sub-Project Meeting
   * - 03-Mar-2023
     - Submission
   * - 29-Mar-2023
     - Updated with comments from StarlingX Sub-Project Meeting
   * - 22-Jun-2023
     - Updated with comments from code reviews