Merge "Add Redfish support to Maintenance"
commit
3bb3080439

==================================
Add Redfish support to Maintenance
==================================

Storyboard: https://storyboard.openstack.org/#!/story/2005861

This story adds ``Redfish Platform Management`` support to Starling-X
Maintenance as a prioritized alternative to the existing, less secure
IPMI support for the following board management functions:

* Reset and Power On/Off Control
* Network Boot Override
* Sensor Monitoring

Problem description
===================

Starling-X Maintenance currently uses ``ipmitool`` to invoke board management
functions. However, IPMI is an aging standard that is not evolving with the
server market.

``Redfish`` is an emerging, well-defined Platform Management Application
Programming Interface (API) standard that leverages modern software, is more
secure, and is easier to use and understand than IPMI.

The Redfish API uses HTTP over a TCP/IP network with either JSON or XML data
schemas. It leverages common Internet and web services standards and modern
tool chains to add new board management services for modern host servers,
meeting today's system administrator demands.

Redfish offers a single root endpoint that expands to reveal a well-structured
hierarchy of service, system, chassis and management endpoints. These are
accessed in user sessions and/or single-shot command operations to manage and
monitor the hardware in polled and event-driven models.

Use Cases
---------

System developers, testers, operators, administrators and auto-provisioning
tools need the ability to power on, power off and reset hosts, as well as
force hosts to boot from the network during installation activities.

High availability products such as Starling-X also need the ability to monitor
the health of their host server pool so that they can notify system
administrators or system orchestrators of pending or immediate
service-affecting hardware failures for proactive action and service
migrations.

Proposed change
===============

Maintenance shall continue with the existing centralized power/reset control
and sensor monitoring model.

Integrate the BSD-licensed Redfishtool into the load and use it in the same
way that ipmitool is used today: Maintenance launches a thread that runs
``ipmitool`` as a system command with hidden credentials and reports execution
status to the main process as a JSON string.

Maintain the existing ipmitool solution for hosts that do not support Redfish.

A common Redfish root query will be implemented and called upon BMC
provisioning notification to Maintenance (mtcAgent) and the Hardware
Monitor (hwmond).

If that query indicates support for ``Redfish``, then all BMC access to that
host will be done using the new Redfish tool and managed by the associated
content added by this feature. Otherwise, the current ipmitool method will be
used. This way, Redfish management takes priority over IPMI.
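
The selection rule above can be sketched as follows. This is only an
illustration, not the mtcAgent/hwmond implementation: the function name
``select_bmc_protocol`` is hypothetical, and the check assumes the root query
returns the JSON body of the Redfish service root (``/redfish/v1/``), which
carries a ``RedfishVersion`` property.

```python
import json
from typing import Optional

# Service root path defined by the Redfish specification (DSP0266).
REDFISH_ROOT_PATH = "/redfish/v1/"


def select_bmc_protocol(root_response: Optional[str]) -> str:
    """Hypothetical helper: pick the BMC access method for one host.

    root_response is the raw body returned by the root query, or None
    if the query failed or timed out.
    """
    if not root_response:
        return "ipmi"
    try:
        root = json.loads(root_response)
    except ValueError:
        # Not JSON at all, so the BMC is not treated as a Redfish service.
        return "ipmi"
    # Assumption: a usable service root is a JSON object that reports
    # its protocol version in the RedfishVersion property.
    if isinstance(root, dict) and "RedfishVersion" in root:
        return "redfish"
    return "ipmi"
```

Any failure of the root query falls through to ``ipmi``, which mirrors the
intent that hosts without Redfish support keep working unchanged.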

Aside from the work to integrate the Redfish tool into the load, all changes
for this feature are restricted to two maintenance daemons: ``mtcAgent`` and
``hwmond``.

The implementation model for this Redfish support follows what is currently
done for ipmitool. For each request, a tool thread is launched to run the
system command that makes the Redfish request. The thread then interprets the
response and passes pertinent data back to the main process in a formatted
JSON string.
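
A minimal sketch of this thread-per-request model is shown here in Python for
brevity; the actual daemons are written in C++. The helper names
(``run_tool``, ``request``) and the ``status``/``data`` keys of the JSON
string are assumptions made for illustration, and a harmless stand-in command
takes the place of a real tool invocation.

```python
import json
import subprocess
import sys
import threading
from typing import List


def run_tool(command: List[str], result: dict, timeout_secs: int = 30) -> None:
    """Thread body: run the tool command and store a JSON status string."""
    try:
        done = subprocess.run(command, capture_output=True, text=True,
                              timeout=timeout_secs)
        status = {"status": done.returncode, "data": done.stdout.strip()}
    except (OSError, subprocess.TimeoutExpired) as err:
        # Launch failure or timeout is reported the same way as a tool error.
        status = {"status": -1, "data": str(err)}
    result["json"] = json.dumps(status)


def request(command: List[str]) -> str:
    """Launch the tool in a worker thread and return its JSON status string."""
    result: dict = {}
    worker = threading.Thread(target=run_tool, args=(command, result))
    worker.start()
    # A daemon main loop would poll for completion instead of blocking here.
    worker.join()
    return result["json"]


# Stand-in for a tool invocation: a child process that prints a power state.
reply = request([sys.executable, "-c", "print('On')"])
```

Returning a JSON string keeps the main process independent of which tool ran,
so the same consumer code can handle both ipmitool and Redfish replies.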

There is very little change to the main mtcAgent and hwmond processes.
There are no changes to Starling-X System Inventory (sysinv).
There are no changes to BMC provisioning.

Alternatives
------------

An alternative to using the open source Redfishtool is to implement an HTTP
agent that conforms to the DMTF Redfish Scalable Platforms Management API
Specification (DSP0266), with the ability to initiate requests and handle
success and failure responses for System Reset, System setBootOverride, and
the Chassis Power and Thermal targets used for sensor monitoring.

Such an agent would require a back-end interface that the Starling-X
Maintenance and Hardware Monitor processes could bind into for orchestration
purposes.

The work involved in implementing this alternative is extensive and could
require ongoing updates as the Redfish API evolves.

Data model impact
-----------------

If a host represents its sensors differently in name or type between its
IPMI and Redfish services, then the sensor model for that host may have to
be relearned.

Fortunately, the Hardware Monitor already supports a sensor model relearn
function. It exists in support of BMC and SDR firmware upgrades and serves
feature patch cases as well.

The sensor model relearn is:

* automatic over a ``hwmond`` process restart if the detected model differs
  from the model stored in system inventory.
* manual using the ``system host-sensorgroup-relearn`` CLI command or by
  pressing the relearn button on the Host's Sensor tab in Horizon.
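
The automatic case depends on comparing the detected sensor model with the
stored one. A rough sketch of such a comparison follows. It assumes each model
can be reduced to a mapping of sensor name to sensor type; that mapping, the
function name ``model_differences`` and the sample sensor names are all
hypothetical and do not describe how ``hwmond`` stores its model.

```python
from typing import Dict, List


def model_differences(stored: Dict[str, str],
                      detected: Dict[str, str]) -> List[str]:
    """Return the sensor names whose presence or type differs.

    An empty list means the detected model matches the stored model,
    so no relearn is needed.
    """
    names = set(stored) | set(detected)
    return sorted(name for name in names
                  if stored.get(name) != detected.get(name))


# Hypothetical example: the same fan is named differently by two services.
stored_model = {"Fan1": "fan", "Temp_CPU0": "temperature"}
detected_model = {"Fan 1": "fan", "Temp_CPU0": "temperature"}
changed = model_differences(stored_model, detected_model)
```

A non-empty result would correspond to the "detected model differs" condition
that triggers a relearn.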

REST API impact
---------------

None. This story does not change any existing REST APIs.

Security impact
---------------

A primary design goal in the development of Redfish was to offer improved
platform management security compared to existing solutions such as IPMI.

The Redfish API supports two authentication methods:

* Basic Authentication
* Token Authentication

This feature makes its sparse and infrequent requests using Basic
authentication. Token authentication would add complexity without a
justifying benefit for this usage.
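
For reference, HTTP Basic authentication sends the credentials in a single
request header, as sketched below. This only illustrates the scheme itself
(RFC 7617). In this feature the header is built by the Redfish tool, not by
Maintenance, and the function name ``basic_auth_header`` is hypothetical.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build the Authorization header for HTTP Basic authentication."""
    # The credentials are joined with a colon and base64 encoded.
    # This is an encoding, not encryption, so the request must be
    # carried over TLS (HTTPS).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}
```

Because each request carries its own credentials, no session has to be
created or torn down, which suits sparse, single-shot requests.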

Security features built into Redfish are described in the Redfish Scalable
Platforms Management API Specification:
https://www.dmtf.org/sites/default/files/standards/documents/DSP0266_1.6.0.pdf

The U.S. Department of Homeland Security warns of the security vulnerabilities
of IPMI: https://www.us-cert.gov/ncas/alerts/TA13-207A

Other end user impact
---------------------

None.

Performance Impact
------------------

Any performance impact from the introduction of this feature is negligible
for the following reasons:

* The current method uses ipmitool, while this feature uses redfishtool in a
  very similar way.
* Both methods invoke the tool in a thread to avoid blocking the main process.
* Maintenance actions are rare, on-demand only, and performed while the host
  is locked.
* Sensor monitoring is periodic, with a cadence in minutes rather than seconds.
* The only impact would come from the difference between the two open source
  tools, and prototype testing demonstrated comparable performance.
* Execution of both ipmitool and redfishtool commands was measured with
  ``time`` and found to be comparable.

Other deployer impact
---------------------

This feature introduces a new RPM: redfishtool.
If this feature were to be patched back to an earlier release, then that
redfishtool RPM would also have to be patched back.

If this feature is patched back to an earlier release or patched into a
current release, then:

* the mtcAgent process would have to be restarted.
* the hwmond process would have to be restarted.

Developer impact
----------------

This feature has no impact on other developers working on StarlingX.

Upgrade impact
--------------

None currently, as this is the initial implementation of Redfish support.

Newer versions of Redfishtool can be introduced if integration testing of the
newer version verifies that the currently used command line options and the
underlying behavior relied upon pass the test cases listed in the ``Testing``
section below.

If a newer version of redfishtool is required and has functionally impacting
changes, then maintenance will have to query the redfishtool version and
behave as required by the detected version. ``redfishtool -V`` prints the
Redfish tool version.
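
Version-dependent behavior could be keyed off a parsed version tuple, as in
the sketch below. The exact text printed by ``redfishtool -V`` is not
specified here, so the parser only assumes that the output contains a dotted
numeric version somewhere. The function name ``parse_tool_version`` and the
sample output string are hypothetical.

```python
import re
from typing import Optional, Tuple


def parse_tool_version(output: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted numeric version found in the tool output.

    Returns a tuple such as (1, 1, 0), or None if no version is found.
    """
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Hypothetical sample of what a version query might print.
detected = parse_tool_version("redfishtool: Version: 1.1.0")
```

Tuples compare element by element, so a check such as
``detected >= (1, 1, 0)`` can select the behavior required by the detected
version.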


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Eric MacDonald

Other contributors:
  Zhipeng Liu

Repos Impacted
--------------

* stx-integ - adding redfishtool
* stx-metal - updating maintenance with Redfish support

Work Items
----------

redfish - stx-integ/bmc/Redfishtool

* create a patched RPM package and include it on controllers.
* create a patch that adds the unimplemented cfgFile support for hiding
  credentials.
* push cfgFile support upstream.
* create a patch that makes redfishtool support Python 2, to be removed once
  Starling-X supports Python 3.

Maintenance Common - stx-metal/mtce-common/src/common

* create a common redfishUtil.cpp/.h, similar in purpose and function to the
  existing ipmiUtil.cpp/h, for use with both hwmond and mtcAgent.

Maintenance - stx-metal/mtce/src/maintenance - mtcAgent process

* create mtcRedFishUtil.cpp/h, similar in purpose and function to the existing
  mtcIpmiUtil.cpp/h, for sending and receiving RedFishTool requests for
  maintenance power and reset control, power status and hw/fw version query.
* enhance mtcThread.cpp/h with mtcThread_redfishtool request support, similar
  to the existing mtcThread_ipmitool thread, to handle Redfish tool requests
  and responses as a thread.

Hardware Monitor - stx-metal/mtce/src/hwmon - hwmond process

* create hwmonRedFish.cpp/h, similar in purpose and function to the existing
  hwmonIpmi.cpp/h, for parsing sensor query responses into a common format
  for the hardware monitor sensor manager engine.
* enhance hwmonThreads.cpp/h with new hwmonThread_redfishtool request support
  similar to the existing mtcThread_ipmitool pthread.

Dependencies
============

This specification depends upon the open source Redfishtool.

https://github.com/DMTF/Redfishtool

Testing
=======

This feature can be tested in a fully provisioned duplex Starling-X system
with Redfish-capable hosts that have their BMC provisioned through system
inventory. In the test cases below, UUT refers to the unit under test.

* With a host's BMC provisioned, verify that the mtcAgent and hwmond processes
  on the active controller each report a log stating that the UUT host is
  being managed by Redfish rather than IPMI.
* With the UUT host locked, perform a Reset action and verify the host
  experiences a graceful shutdown followed by a reboot.
* With the UUT host locked and online, perform a Power-Off action and verify
  the host experiences a graceful shutdown followed by a power-off.
* With the UUT host locked and powered off, perform a Power-On action and
  verify the host powers on and starts to boot.
* With the UUT host locked and powered off with a bootable image on disk,
  perform a ReInstall action and verify that the host gets powered on and
  reinstalls a new image from the controller.
* With the UUT host, verify sensor monitoring by viewing the sensor groups
  and sensors list from Horizon and with CLI commands.

Documentation Impact
====================

This feature change has no customer-visible impact.
This feature change requires no customer documentation update.

References
==========

Redfish was developed by the DMTF (Distributed Management Task Force), led by
a diverse board of directors and contributors from many of the major
technology companies, such as Intel, Dell, HP, Hitachi, Lenovo and VMware.

The Redfish Platform Management Application Programming Interface (API)
standard and supporting specifications can be found at the following URL.

https://www.dmtf.org/standards/redfish


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - 2019.11
     - Introduced