Integration of Intel Ethernet Operator to StarlingX Platform
============================================================

Storyboard: https://storyboard.openstack.org/#!/story/2010562

In a cloud environment, network interface adapters require a cloud-based
management system. These adapters may have advanced functionality which is
best combined under a single operator.

Problem description
===================

Firmware for network adapters may need management, and so may network adapter
personalization. Flow rules must also be activated so that traffic on the
interfaces reaches the intended pods.

Use Cases
---------

* Update of firmware on interface adapters
* Update of Dynamic Device Personalization on interface adapters
* Update of flow configuration on interface adapters to allow steering

Dynamic Device Personalization (DDP) is the on-chip programmable pipeline
which allows deep and diverse protocol header processing. Flow configuration
allows the steering of traffic to particular VFs on the node.

Proposed change
===============

The Intel Ethernet Operator (IEO) will allow the firmware of Intel E810 Series
NICs to be updated in a container environment. Nodes will be drained, taken
out of service and restarted as required by the update. Firmware and DDP
packages can be downloaded from a suitable HTTP server (configurable in the
EthernetClusterConfig Custom Resource).

The Intel Ethernet Operator also requires some other plugins and operators:

* The SR-IOV Device Plugin, which makes SR-IOV resources available in
  Kubernetes.
* The SR-IOV Network Operator, for ease of configuration.
* Node Feature Discovery, which the SR-IOV Network Operator requires.

Both the SR-IOV Network Operator and Node Feature Discovery are installed as
dependencies alongside IEO in the ``intel-ethernet-operator`` namespace.

Flow rules require the inclusion of the Unified Flow Tool (UFT) server
application. UFT applies the flow rules and is called using the DPDK rte_flow
API. The API supports the Switch Filter rules supported by DPDK rte_flow. UFT
is included as part of the IEO installation.

Common features
---------------

Within the Ethernet operator on the control node, the controller deploys the
first asset, ethernet-discovery (the labeler). It is deployed as a pod on each
node, and it labels the node if a supported device is connected.

The controller deploys a compatibility map, a config file specifying which
FW/DDP/kernel versions can work together.

The controller deploys the Ethernet daemon (FW/DDP daemon) as a DaemonSet.
Nodes with the appropriate label get a pod deployed on them; other nodes do
not.

The Ethernet daemon checks for a node configuration and creates one if none is
found. The daemon reconciles in a loop, gathers the status of the required
components (found devices, PCI address, MAC, FW and DDP version, etc.) and
updates the node configuration with a status.

The user can now get the status of all node configurations, or the status of a
specific node configuration.

Firmware and DDP upgrade
------------------------

The user uploads the desired DDP package and/or nvmupdate package to an HTTP
server accessible by the cluster (the HTTP server and the upload mechanism are
out of scope of the operator).

The user can now apply a new cluster configuration with the preferred
settings. The Ethernet controller breaks it down into smaller node
configurations, and the node configurations are updated.

The Ethernet daemon reconciles in a loop, watching for updates. If the
condition (the fields in the applied EthernetClusterConfig CRD) is unchanged,
the daemon does nothing. New conditions that apply to other nodes are also
ignored. When a condition change is detected for the daemon's own node, the
daemon acts on it: it verifies the condition and denies the change if it
cannot be met.
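The per-node decision described so far can be sketched as follows. This is a
minimal illustration under stated assumptions, not the operator's actual code:
the ``DeviceConfig`` type, the ``decide`` function and the ``can_be_met``
callback are hypothetical names introduced only for this example.

.. code-block:: python

   from dataclasses import dataclass
   from typing import Callable, Dict


   @dataclass(frozen=True)
   class DeviceConfig:
       """Hypothetical subset of the deviceConfig fields for one node."""
       fw_url: str = ""
       fw_checksum: str = ""
       ddp_url: str = ""
       ddp_checksum: str = ""


   def decide(node: str,
              applied: Dict[str, DeviceConfig],
              desired: Dict[str, DeviceConfig],
              can_be_met: Callable[[DeviceConfig], bool]) -> str:
       """Return the action the daemon on ``node`` takes in one reconcile pass.

       ``applied`` and ``desired`` map node names to configurations.
       Entries for other nodes are never inspected by this daemon.
       """
       want = desired.get(node)
       # Nothing requested for this node, or nothing changed: ignore.
       if want is None or want == applied.get(node):
           return "ignore"
       # A change for this node was detected: verify it before acting.
       if not can_be_met(want):
           return "deny"
       return "update"

Calling ``decide`` for a node whose desired configuration equals the applied
one returns ``"ignore"``, while a changed configuration returns ``"update"``
or ``"deny"`` depending on the verification callback.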
If the condition can be met, the daemon runs the appropriate actions to bring
the node to the desired condition (i.e. a DDP or FW update):

* It tries to download the packages from the specified address on the HTTP
  server.
* It elects a leader to act as a controller.
* It cordons off and drains the node.
* It proceeds with the updates.
* It reboots the node, uncordons it and releases the leadership.

Once any update to the configuration is done, the daemon updates the node
configuration status. Once the update is finished, the user is able to get the
status of the update and the status of the node.

Flow Configuration
------------------

To allow the Flow Configuration feature to compose the flow rules for the
network card's traffic, the deployment must use a trusted virtual function
(VF) from each physical function (PF). Usually it is the first VF (VF0) of
each PF that has trust mode enabled and is then bound to the vfio-pci driver.
This VF pool must be created by the user and be allocatable as a Kubernetes
resource.

Rules can be written in the style of rte_flow and allow deep matching of
packet type flows to interfaces associated with pods on a cluster. Rules can
be written for a cluster and for a pod. During pod scheduling they are
instantiated on a node to configure the flow offload hardware on the interface
to target a pod attached via a particular VF.

Alternatives
============

It is possible to connect to each node, then untar and install the firmware
and device profiles. Similarly, flow offloads could be configured individually
on each node.

Data model impact
=================

IEO introduces the following CRDs on the cluster:

- EthernetClusterConfig
- FlowConfigNodeAgentDeployment
- NodeFlowConfig
- ClusterFlowConfig
- EthernetNodeConfig (NICs configuration status, not created by user)

EthernetClusterConfig
=====================

.. code-block:: yaml

   apiVersion: ethernet.intel.com/v1
   kind: EthernetClusterConfig
   metadata:
     name: config
   spec:
     nodeSelectors:
       kubernetes.io/hostname:
     deviceSelector:
       pciAddress: ""
     deviceConfig:
       fwURL: ""
       fwChecksum: ""
       ddpURL: ""
       ddpChecksum: ""

Parameters
----------

* ``name``: Name of the specific config
* ``kubernetes.io/hostname``: Hostname of the node containing the cards to be
  updated
* ``fwURL``: Accessible URL for the firmware file. A proxy may be needed
* ``fwChecksum``: Expected checksum of the firmware file
* ``ddpURL``: Accessible URL for the DDP file. A proxy may be needed
* ``ddpChecksum``: Expected checksum of the DDP file

FlowConfigNodeAgentDeployment
=============================

.. code-block:: yaml

   apiVersion: flowconfig.intel.com/v1
   kind: FlowConfigNodeAgentDeployment
   metadata:
     labels:
       control-plane: flowconfig-daemon
     name: flowconfig-daemon-deployment
     namespace: intel-ethernet-operator
   spec:
     DCFVfPoolName: openshift.io/cvl_uft_admin
     NADAnnotation: sriov-cvl-dcf

Parameters
----------

* ``name``: Name of the FlowConfigNodeAgentDeployment
* ``DCFVfPoolName``: Name of the SriovNetworkNodePolicy in use
* ``NADAnnotation``: Name of the SriovNetwork in use

NodeFlowConfig
===============

.. code-block:: yaml

   apiVersion: flowconfig.intel.com/v1
   kind: NodeFlowConfig
   metadata:
     name: worker-01
   spec:
     rules:
       - pattern:
           - type: RTE_FLOW_ITEM_TYPE_ETH
           - type: RTE_FLOW_ITEM_TYPE_IPV4
             spec:
               hdr:
                 src_addr: 10.56.217.9
             mask:
               hdr:
                 src_addr: 255.255.255.255
           - type: RTE_FLOW_ITEM_TYPE_END
         action:
           - type: RTE_FLOW_ACTION_TYPE_DROP
           - type: RTE_FLOW_ACTION_TYPE_END
         portId: 0
         attr:

Parameters
----------

* ``name``: Name of the config - needs to match the node name
* ``pattern: type``: Header part to match on
* ``pattern: spec & mask``: Addresses to match for the rules
* ``action``: Alters the fate of matching traffic, its contents or properties
* ``attr``: Flow rule attributes, such as the priority level
* ``portId``: Information to identify the port on a node

ClusterFlowConfig
=================

.. code-block:: yaml

   apiVersion: flowconfig.intel.com/v1
   kind: ClusterFlowConfig
   metadata:
     name: pppoes-sample
   spec:
     rules:
       - pattern:
           - type: RTE_FLOW_ITEM_TYPE_ETH
           - type: RTE_FLOW_ITEM_TYPE_IPV4
             spec:
               hdr:
                 src_addr: 10.56.217.9
             mask:
               hdr:
                 src_addr: 255.255.255.255
           - type: RTE_FLOW_ITEM_TYPE_END
         action:
           - type: to-pod-interface
             conf:
               podInterface: net1
         attr:
           ingress: 1
           priority: 0
     podSelector:
       matchLabels:
         app: vagf
         role: controlplane

Parameters
----------

* ``name``: Name of the config
* ``pattern: type``: Header part to match on
* ``pattern: spec & mask``: Addresses to match for the rules
* ``action``: Alters the fate of matching traffic, its contents or properties
* ``attr``: Flow rule attributes, such as direction and priority level
* ``podSelector``: Labels associated with the particular pod

NOTE: Most of the object parameter names are consistent with the names given
in the official DPDK rte_flow documentation. For the full description of the
Generic flow API, see https://doc.dpdk.org/guides/prog_guide/rte_flow.html.

During the course of execution, ClusterFlowConfig rules are broken down into
NodeFlowConfig rules. NodeFlowConfig rules can also be written manually.

REST API impact
---------------

Standard extension of the K8s APIs based on the introduction of the above
CRDs.

Security impact
---------------

The existing K8s authentication and authorization apply to the standard
extension of the K8s APIs based on the introduction of the IEO CRDs.

Other end user impact
---------------------

The end user will have the capability to:

- control firmware and DDP packages
- configure flow rules
- display the configuration status of Intel Ethernet devices.

Performance Impact
------------------

With the Intel Ethernet Operator in use, service pods run on master and worker
nodes all the time. They consume some CPU and memory resources from cluster
housekeeping, an amount which we believe to be negligible.
For periodic reconciliation, communication between the controller-manager and
the node daemons may consume network resources as well; this is also assumed
to be negligible.

Other deployer impact
---------------------

None.

Developer impact
----------------

In StarlingX 8.0 and future releases, the ``/lib/firmware`` directory is
read-only. This creates a problem for any customer who wants to use a DDP
profile other than the one that comes preinstalled. The Intel ice driver looks
for a DDP package named ``intel/ice/ddp/stx-ice.pkg`` in the default firmware
search paths, which are ``/lib/firmware`` and ``/lib/firmware/updates``. Both
of these paths are immutable, so currently there is no way to change the DDP
package in use.

The solution is the alternate firmware search path that is already available
in the kernel:
https://docs.kernel.org/driver-api/firmware/fw_search_path.html. This feature
can be enabled by adding a suitable boot parameter. A contribution that adds
it to StarlingX has already been made (in the stx-puppet repository).

Upgrade impact
--------------

None. This is an optional operator.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Rafal Lal

Other contributors:
  Kevin Clarke

Repos Impacted
--------------

A new system-application repo will be created for the definition and building
of the intel-ethernet-operator application.

Work Items
----------

* Create the intel-ethernet-operator application package.
* Integrate the intel-ethernet-operator application with FluxCD.
* Add application upload/apply/remove/delete commands.
* Update docs.starlingx.io with a how-to on using intel-ethernet-operator to
  configure Ethernet cards.

Building images
---------------

The Intel Ethernet Operator team would like to move the build of the UFT
container image to StarlingX. The source code of the image is publicly
available, and we would provide build scripts. Images of the other components
would be built by Intel and made ready to pull.

Dependencies
============

None specific.

Testing
=======

Testing will be done on a multi-node cluster configuration.

* Testing across several revisions of the packages.
* Validating firmware installs and DDP package installs.
* Testing that traffic flow is instantiated to the correct pods.
* Verifying that the CRDs for a particular functionality effect the change on
  the cluster.
* Manually deleting or changing the configuration to validate that the
  controllers apply the changes.
* Rebooting nodes to validate that the new configuration persists.
* Reloading drivers to validate that the new configuration persists.

Documentation Impact
====================

docs.starlingx.io will be updated for:

* How to use the intel-ethernet-operator application
* How to perform enhanced configuration of Ethernet devices with the CRDs
  supplied by the Ethernet Operator.

References
==========

Intel® Ethernet Operator - Overview Solution Brief
https://networkbuilders.intel.com/solutionslibrary/intel-ethernet-operator-overview-solution-brief

Intel Ethernet Operator
https://github.com/intel/intel-ethernet-operator

Unified Flow Tool (UFT)
https://github.com/intel/UFT/tree/main

Intel Ethernet 810 series features
https://www.intel.com/content/www/us/en/products/details/ethernet/800-controllers/e810-controllers/docs.html

Node Feature Discovery
https://github.com/kubernetes-sigs/node-feature-discovery

SR-IOV Network Operator
https://github.com/k8snetworkplumbingwg/sriov-network-operator

SR-IOV Network Device Plugin for Kubernetes
https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - 02-Feb-2023
     - Introducing Ethernet operator
   * - 02-Feb-2023
     - Updated with comments from StarlingX Sub-Project Meeting
   * - 03-Mar-2023
     - Submission
   * - 29-Mar-2023
     - Updated with comments from StarlingX Sub-Project Meeting
   * - 22-Jun-2023
     - Updated with comments from code reviews