Merge "Add Intel Ethernet Operator spec"
commit
509f2d6fa8
@@ -0,0 +1,420 @@
Integration of Intel Ethernet Operator to StarlingX Platform
============================================================

Storyboard:
https://storyboard.openstack.org/#!/story/2010562

In a cloud environment, network interface adapters require a cloud-based
management system. These adapters may offer advanced functionality that is
best managed through a single operator.

Problem description
===================

Network adapter firmware may need to be managed, as may network adapter
personalization. Flow rules must also be activated to ensure that traffic on
the interfaces reaches the pods.

Use Cases
---------

* Update of firmware on interface adapters
* Update of Dynamic Device Personalization (DDP) on interface adapters
* Update of flow configuration on interface adapters to allow traffic steering

Dynamic Device Personalization (DDP) is the on-chip programmable pipeline
that allows deep and diverse protocol header processing. Flow configuration
allows traffic to be steered to particular VFs on the node.

Proposed change
===============

The Intel Ethernet Operator (IEO) will allow the firmware of Intel E810 Series
NICs to be updated in a container environment. Nodes will be drained, taken
out of service and restarted as required by the update. Firmware and DDP
packages can be downloaded from a suitable HTTP server (configurable in the
EthernetNodeConfig Custom Resource).

The Intel Ethernet Operator also requires some other plugins and operators:

IEO requires the SR-IOV Device Plugin, which makes SR-IOV resources available
in Kubernetes. For ease of configuration, the SR-IOV Network Operator is also
required, and it in turn requires Node Feature Discovery. Both the SR-IOV
Network Operator and Node Feature Discovery are installed as dependencies
alongside IEO in the intel-ethernet-operator namespace.

Flow rules require the Unified Flow Tool (UFT) server application. UFT applies
the flow rules and is called using the DPDK rte_flow API. The API supports the
Switch Filter rules that DPDK rte_flow supports. UFT is included as part of
the IEO installation.

Common features
---------------

Within the Ethernet operator on the control node, the controller deploys the
first asset, ethernet-discovery (the labeler), as a pod on each node. The
labeler marks the node if a supported device is found. The controller also
deploys a compatibility map, a config file specifying which FW, DDP and
kernel versions can work together.

The controller deploys the Ethernet daemon (FW/DDP daemon) as a DaemonSet:
nodes with the appropriate label get a pod, others do not. The Ethernet
daemon checks for a node configuration and creates one if none is found.
The daemon reconciles in a loop, gathers the status of the required
components (found devices, PCI address, MAC, FW and DDP versions, etc.) and
updates the node configuration with a status. The user can then get the
status of all node configs or of a specific node config.

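
The compatibility check described above could look like the following minimal
sketch. The map layout, the placeholder version strings and the
``is_supported`` helper are assumptions made for illustration only; the
actual format of the operator's compatibility map is not described in this
spec.

.. code-block:: python

    # Hypothetical compatibility map: each entry lists one combination of
    # firmware, DDP and kernel versions that may be used together.
    # The version strings are placeholders, not real release numbers.
    COMPATIBILITY_MAP = [
        {"fw": "fw-A", "ddp": "ddp-A", "kernel": "kernel-A"},
        {"fw": "fw-B", "ddp": "ddp-A", "kernel": "kernel-A"},
        {"fw": "fw-B", "ddp": "ddp-B", "kernel": "kernel-B"},
    ]


    def is_supported(fw, ddp, kernel, compatibility_map=COMPATIBILITY_MAP):
        """Return True if the three versions appear together in one entry."""
        wanted = {"fw": fw, "ddp": ddp, "kernel": kernel}
        return any(entry == wanted for entry in compatibility_map)

A daemon following this sketch would reject a requested update whenever
``is_supported`` returns ``False`` for the resulting combination.
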
Firmware and DDP upgrade
------------------------

The user uploads the desired DDP package and/or nvmupdate package to an HTTP
server accessible by the cluster (the HTTP server and the upload mechanism
are out of scope of the operator). The user can then apply a new cluster
configuration with the preferred settings. The Ethernet controller breaks it
down into smaller node configurations and updates them.

The Ethernet daemon reconciles in a loop, watching for updates. If the
condition (the fields in the applied EthernetClusterConfig CRD) is unchanged,
the daemon does nothing. If new conditions are detected for other nodes, it
ignores them. When a condition change is detected for the daemon's own node,
the daemon acts on it: it verifies the condition and denies the change if the
condition cannot be met.

If the condition can be met, the daemon runs the appropriate actions to bring
the node to the desired condition (i.e. a DDP/FW update). It tries to
download the packages from the specified HTTP server address, elects a leader
to act as a controller, cordons off and drains the node, proceeds with the
updates, reboots the node, uncordons it and releases the leadership. Once any
configuration update is done, the daemon updates the node configuration
status. Once the update has finished, the user can get the status of the
update and of the node.

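
The decision the daemon makes on each reconcile pass can be summarized with
the following minimal sketch. It is not the operator's implementation: the
``DesiredState`` type, the ``decide`` function and the returned action names
are hypothetical and only mirror the cases described in this section.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Callable, Optional


    @dataclass(frozen=True)
    class DesiredState:
        """Hypothetical per-node view of the requested configuration."""
        node: str
        fw_url: Optional[str] = None
        ddp_url: Optional[str] = None


    def decide(my_node: str,
               applied: Optional[DesiredState],
               desired: DesiredState,
               can_be_met: Callable[[DesiredState], bool]) -> str:
        """Return the action a node daemon takes for one desired state."""
        if desired.node != my_node:
            return "ignore"   # the change targets another node
        if desired == applied:
            return "ignore"   # the condition is unchanged
        if not can_be_met(desired):
            return "deny"     # the condition cannot be satisfied
        # Download, elect leader, cordon and drain, update, reboot, uncordon.
        return "update"

Keeping the decision separate from the side effects (download, drain,
reboot) makes each branch easy to exercise without touching a node.
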
Flow Configuration
------------------

To allow the Flow Configuration feature to compose flow rules for the network
card's traffic, the deployment must use a trusted virtual function (VF) from
each physical function (PF). Usually this is the first VF (VF0) of each PF,
which has trust mode enabled and is then bound to the vfio-pci driver. This
VF pool must be created by the user and be allocatable as a Kubernetes
resource.

Rules are written in a form similar to rte_flow and allow deep matching of
packet types, steering flows to interfaces associated with pods on a cluster.
Rules can be written for the cluster and for pods. During pod scheduling they
are instantiated on a node to configure the flow offload hardware on the
interface so that it targets a pod attached via a particular VF.

Alternatives
============

It is possible to connect to each node, then untar and install the firmware
and DDP profiles manually. Similarly, flow offloads could be configured
individually on each node.

Data model impact
=================

IEO introduces the following CRDs on the cluster:

- EthernetClusterConfig
- FlowConfigNodeAgentDeployment
- NodeFlowConfig
- ClusterFlowConfig
- EthernetNodeConfig (NICs configuration status, not created by user)

EthernetClusterConfig
=====================

.. code-block:: yaml

    apiVersion: ethernet.intel.com/v1
    kind: EthernetClusterConfig
    metadata:
      name: config
    spec:
      nodeSelectors:
        kubernetes.io/hostname: <hostname>
      deviceSelector:
        pciAddress: "<pci-address>"
      deviceConfig:
        fwURL: "<URL_to_firmware>"
        fwChecksum: "<file_checksum_SHA-1_hash>"
        ddpURL: "<URL_to_DDP>"
        ddpChecksum: "<file_checksum_SHA-1_hash>"

Parameters
----------

* ``name``: Name of the specific config
* ``kubernetes.io/hostname``: Hostname of the node containing the cards to be
  updated
* ``fwURL``: Accessible URL of the firmware file. A proxy may be needed
* ``fwChecksum``: Expected checksum of the firmware file
* ``ddpURL``: Accessible URL of the DDP file. A proxy may be needed
* ``ddpChecksum``: Expected checksum of the DDP file

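
The checksum placeholders in the example indicate a SHA-1 hash. As a minimal
sketch of how such a value can be computed and compared for a downloaded
package, assuming the checksum is the hex digest of the whole file: the
helper names ``sha1_hex`` and ``checksum_matches`` are hypothetical and are
not part of the operator.

.. code-block:: python

    import hashlib


    def sha1_hex(stream, chunk_size=1024 * 1024):
        """Return the SHA-1 hex digest of a binary file-like object."""
        digest = hashlib.sha1()
        # Read in chunks so large firmware packages are not loaded at once.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
        return digest.hexdigest()


    def checksum_matches(stream, expected_hex):
        """Compare the computed digest with the expected checksum string."""
        return sha1_hex(stream) == expected_hex.strip().lower()

The same ``sha1_hex`` helper, applied to a package opened in binary mode,
yields the string to place in ``fwChecksum`` or ``ddpChecksum``.
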
FlowConfigNodeAgentDeployment
=============================

.. code-block:: yaml

    apiVersion: flowconfig.intel.com/v1
    kind: FlowConfigNodeAgentDeployment
    metadata:
      labels:
        control-plane: flowconfig-daemon
      name: flowconfig-daemon-deployment
      namespace: intel-ethernet-operator
    spec:
      DCFVfPoolName: openshift.io/cvl_uft_admin
      NADAnnotation: sriov-cvl-dcf

Parameters
----------

* ``name``: Name of the FlowConfigNodeAgentDeployment
* ``DCFVfPoolName``: Name of the SriovNetworkNodePolicy in use
* ``NADAnnotation``: Name of the SriovNetwork in use

NodeFlowConfig
===============

.. code-block:: yaml

    apiVersion: flowconfig.intel.com/v1
    kind: NodeFlowConfig
    metadata:
      name: worker-01
    spec:
      rules:
        - pattern:
            - type: RTE_FLOW_ITEM_TYPE_ETH
            - type: RTE_FLOW_ITEM_TYPE_IPV4
              spec:
                hdr:
                  src_addr: 10.56.217.9
              mask:
                hdr:
                  src_addr: 255.255.255.255
            - type: RTE_FLOW_ITEM_TYPE_END
          action:
            - type: RTE_FLOW_ACTION_TYPE_DROP
            - type: RTE_FLOW_ACTION_TYPE_END
          portId: 0
          attr:

Parameters
----------

* ``name``: Name of the config; it must match the node name
* ``pattern: type``: Header part to match on
* ``pattern: spec & mask``: Addresses to match for the rules
* ``action``: Alter the fate of matching traffic, its contents or properties
* ``attr``: Flow rule priority level
* ``portId``: Information to identify the port on a node

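
In rte_flow, the ``mask`` selects which bits of ``spec`` a packet field must
match. The following minimal sketch illustrates that semantics for the IPv4
source address used in the example; the ``matches`` helper is hypothetical
and only models the comparison, not the hardware offload.

.. code-block:: python

    import ipaddress


    def matches(packet_addr, spec_addr, mask):
        """Return True if packet_addr equals spec_addr on the masked bits."""
        packet = int(ipaddress.IPv4Address(packet_addr))
        spec = int(ipaddress.IPv4Address(spec_addr))
        bits = int(ipaddress.IPv4Address(mask))
        return (packet & bits) == (spec & bits)

With the full mask ``255.255.255.255`` from the example, only packets from
exactly ``10.56.217.9`` match; a shorter mask such as ``255.255.255.0`` would
match the whole ``10.56.217.0/24`` range.
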
ClusterFlowConfig
=================

.. code-block:: yaml

    apiVersion: flowconfig.intel.com/v1
    kind: ClusterFlowConfig
    metadata:
      name: pppoes-sample
    spec:
      rules:
        - pattern:
            - type: RTE_FLOW_ITEM_TYPE_ETH
            - type: RTE_FLOW_ITEM_TYPE_IPV4
              spec:
                hdr:
                  src_addr: 10.56.217.9
              mask:
                hdr:
                  src_addr: 255.255.255.255
            - type: RTE_FLOW_ITEM_TYPE_END
          action:
            - type: to-pod-interface
              conf:
                podInterface: net1
          attr:
            ingress: 1
            priority: 0
      podSelector:
        matchLabels:
          app: vagf
          role: controlplane

Parameters
----------

* ``name``: Name of the config
* ``pattern: type``: Header part to match on
* ``pattern: spec & mask``: Addresses to match for the rules
* ``action``: Alter the fate of matching traffic, its contents or properties
* ``attr``: Flow rule priority level
* ``podSelector``: Labels associated with the particular pod

NOTE: Most of the object parameter names are consistent with the names used
in the official DPDK rte_flow documentation. For the full description of the
generic flow API, see https://doc.dpdk.org/guides/prog_guide/rte_flow.html.

During the course of execution, ClusterFlowConfig rules are broken down into
NodeFlowConfig rules. NodeFlowConfig rules can also be written manually.

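
The breakdown of a cluster-level rule into per-node rules can be pictured
with the following minimal sketch. It assumes that ``matchLabels`` requires
every listed label to be present on the pod with the same value, as in
standard Kubernetes label selectors; the pod dictionaries and both helper
functions are hypothetical and do not reflect the operator's internal code.

.. code-block:: python

    def selector_matches(match_labels, pod_labels):
        """Return True if every selector label is set on the pod."""
        return all(pod_labels.get(key) == value
                   for key, value in match_labels.items())


    def pods_per_node(match_labels, pods):
        """Group the names of selected pods by the node they run on.

        Each pod is a dict with 'name', 'node' and 'labels' keys.
        """
        grouped = {}
        for pod in pods:
            if selector_matches(match_labels, pod["labels"]):
                grouped.setdefault(pod["node"], []).append(pod["name"])
        return grouped

Each key of the returned mapping corresponds to a node that would receive its
own NodeFlowConfig rules for the selected pods.
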
REST API impact
---------------

Standard extension of the K8s APIs, based on the introduction of the above
CRDs.

Security impact
---------------

Existing K8s authentication and authorization apply to the standard extension
of the K8s APIs introduced by the IEO CRDs.

Other end user impact
---------------------

On Intel Ethernet devices, the end user will be able to:

- control firmware and DDP packages
- configure flow rules
- display configuration status

Performance Impact
------------------

With the Intel Ethernet Operator, service pods run on master and worker nodes
at all times and consume some CPU and memory from the cluster housekeeping
resources; we believe this overhead to be negligible. Periodic reconciliation
between the controller-manager and the node daemons also consumes network
resources, which is likewise assumed to be negligible.

Other deployer impact
---------------------

None.

Developer impact
----------------

In StarlingX 8.0 and future releases, the /lib/firmware directory is
read-only. This creates a problem for any customer who wants to use a DDP
profile other than the preinstalled one. The Intel ice driver looks for a DDP
package named intel/ice/ddp/stx-ice.pkg in the default firmware search paths,
/lib/firmware and /lib/firmware/updates. Both of these paths are immutable,
so there is currently no way to change the DDP package in use. The solution
is the alternate firmware search path already supported by the kernel
(https://docs.kernel.org/driver-api/firmware/fw_search_path.html). This
feature can be enabled by adding a suitable boot parameter. A contribution
that adds it to StarlingX has already been made (in the stx-puppet
repository).

Upgrade impact
--------------

None. This is an optional operator.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Rafal Lal

Other contributors:
  Kevin Clarke

Repos Impacted
--------------

A new system-application repo will be created for the definition and building
of the intel-ethernet-operator application.

Work Items
----------

* Create the intel-ethernet-operator application package.
* Integrate the intel-ethernet-operator application with FluxCD. Add
  application upload/apply/remove/delete commands.
* Update docs.starlingx.io with a how-to on using intel-ethernet-operator to
  configure Ethernet cards.

Building images
---------------

The Intel Ethernet Operator team would like to move the building of the UFT
container image to StarlingX. The source code of the image is publicly
available, and we would provide build scripts. Images of the other components
would be built by Intel and made available to pull.

Dependencies
============

None specific.

Testing
=======

Testing will be done on a multi-node cluster configuration.

* Testing of packages across several package revisions
* Validating firmware installs and DDP package installs
* Testing that traffic flows are instantiated to the correct pods
* Verifying that CRDs for a particular functionality effect the change on the
  cluster
* Manually deleting or changing the configuration to validate that the
  controllers make the changes
* Rebooting nodes to validate that the new configuration persists
* Reloading drivers to validate that the new configuration persists

Documentation Impact
====================

docs.starlingx.io will be updated for:

* How to use the intel-ethernet-operator application
* How to perform enhanced configuration of Ethernet devices with the CRDs
  supplied by the Ethernet Operator

References
==========

Intel® Ethernet Operator - Overview Solution Brief
https://networkbuilders.intel.com/solutionslibrary/intel-ethernet-operator-overview-solution-brief

Intel Ethernet Operator
https://github.com/intel/intel-ethernet-operator

Unified Flow Tool (UFT)
https://github.com/intel/UFT/tree/main

Intel Ethernet 810 series features
https://www.intel.com/content/www/us/en/products/details/ethernet/800-controllers/e810-controllers/docs.html

Node Feature Discovery
https://github.com/kubernetes-sigs/node-feature-discovery

SR-IOV Network Operator
https://github.com/k8snetworkplumbingwg/sriov-network-operator

SR-IOV Network Device Plugin for Kubernetes
https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - 02-Feb-2023
     - Introducing Ethernet operator
   * - 02-Feb-2023
     - Updated with comments from StarlingX Sub-Project Meeting
   * - 03-Mar-2023
     - Submission
   * - 29-Mar-2023
     - Updated with comments from StarlingX Sub-Project Meeting
   * - 22-Jun-2023
     - Updated with comments from code reviews