..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License. http://creativecommons.org/licenses/by/3.0/legalcode

============================================
StarlingX: Performance measurement framework
============================================

Storyboard: https://storyboard.openstack.org/#!/story/2006406

The measurement of performance metrics in edge cloud systems is a key factor
in quality assurance. Having a common understanding and a practical framework
to run the performance tests is important and necessary for users and
developers in the community. This specification describes the scope of the
performance framework, its use cases, and how it can scale through new tests
developed by the community or imported from existing performance test
frameworks.

Problem description
===================

StarlingX can be deployed on a wide variety of hardware and network
environments, and these differences can result in differing performance
metrics. As of today, the community does not have a consolidated performance
framework to measure performance metrics with specific and well-defined test
cases.

Use Cases
---------

Developers who make significant changes to the codebase want to ensure that
no performance regression is introduced by those changes. Developers who work
on performance improvements need to measure the results with a consolidated
framework that is valid across the community. Some community members want to
measure StarlingX performance and promote the StarlingX advantages that are
relevant to an edge solution. Potential users want to evaluate StarlingX as
one of their edge solution candidates, so offering such a performance
framework will be a positive factor in the evaluation process.

Proposed change
===============

The proposed change is to create a set of scripts and documentation under the
StarlingX testing repository (https://opendev.org/starlingx/test) that anyone
can use to measure the metrics they need in their own hardware and software
environment.

Alternatives
------------

The community can use existing performance test frameworks. Some of those
are:

* Rally:

  An OpenStack project dedicated to the performance analysis and benchmarking
  of individual OpenStack components.

* Kubernetes perf-tests:

  An open source project dedicated to Kubernetes-related performance test
  tools.

* Yardstick by OPNFV:

  The Yardstick concept decomposes typical virtual network function workload
  performance metrics into several characteristics/performance vectors, each
  of which can be represented by distinct test cases.

The disadvantage of all these frameworks is that they live in separate
projects. Users need to find the test cases scattered across the internet,
which is not a centralized and scalable solution. The proposed StarlingX
performance measurement framework will allow existing test cases to be
imported and re-used in a centralized system that is easy for the community
to use.

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Other end-user impact
---------------------

End users will interact with this framework from the command line on the
controller node through the performance-test-runner.py script, which is the
entry point to select and configure the test case. An example of launching a
test case might be:

::

    ./performance-test-runner.py --test failed_vm_detection --compute=0

This will generate the results of multiple runs in CSV format, which is easy
to post-process.

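As an illustration of such post-processing, a minimal sketch (assuming a
hypothetical CSV layout with ``run`` and ``latency_ms`` columns; the actual
column names are defined by each test case) could be:

::

    #!/usr/bin/env python3
    """Summarize latency results from a performance-test-runner CSV file."""
    import csv
    import statistics
    import sys

    def summarize(path):
        """Return (mean, max) latency in milliseconds from a results CSV."""
        with open(path, newline="") as handle:
            values = [float(row["latency_ms"])
                      for row in csv.DictReader(handle)]
        return statistics.mean(values), max(values)

    if __name__ == "__main__":
        mean_ms, max_ms = summarize(sys.argv[1])
        print("mean={:.2f} ms max={:.2f} ms".format(mean_ms, max_ms))
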
Performance Impact
------------------

Running this performance framework or its test cases should not introduce a
performance impact or obvious overhead on the system.

Other deployer impact
---------------------

None. There will be no impact on how StarlingX is deployed and configured.

Developer impact
----------------

There will be a positive impact on developers, since they will have a
consolidated and well-aligned framework to measure the performance impact of
their code and configuration changes.

Developers can't improve something if they don't know it needs improvement.
That's where this performance testing framework comes in. It gives developers
the tools to detect how fast, stable, and scalable their StarlingX deployment
is, so they can decide whether it needs improvement.

Upgrade impact
--------------

None

Implementation
==============

The test framework will have the following components:

* Common scripts:

This directory will contain scripts to capture key timestamps or collect
other key data, along with some use cases, so that developers can measure
time (latency) or other performance data (such as throughput, system CPU
utilization, and so on). Any common script that other test cases could use
should live in this directory.

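As a hedged illustration of a common script, a hypothetical
``common/statistics.py`` helper that any test case could reuse to summarize a
series of latency samples might look like:

::

    """Sketch of a common helper to summarize a series of measurements."""
    import math
    import statistics

    def summarize(samples_ms):
        """Return summary statistics for a list of samples in milliseconds."""
        ordered = sorted(samples_ms)
        return {
            "min": ordered[0],
            "max": ordered[-1],
            "mean": statistics.mean(ordered),
            "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
            # 95th percentile by nearest-rank on the sorted samples.
            "p95": ordered[math.ceil(0.95 * len(ordered)) - 1],
        }
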
Despite the existence of a directory for common scripts, each test case is
responsible for its own metrics. One example is the time measurement of test
cases that run on more than one node. For these kinds of test cases it will
be necessary to use synchronization protocols like the Network Time Protocol
(NTP). NTP is a networking protocol for clock synchronization between
computer systems, intended to synchronize all participating computers to
within a few milliseconds of Coordinated Universal Time (UTC). In some other
cases, it might not be necessary to have a protocol for clock
synchronization.

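For example, a common helper (such as the ntp_metrics.py shown in the layout
below) could verify that the local clock is synchronized closely enough
before a multi-node test records cross-node timestamps. This is only a
sketch, assuming ``ntpq`` is available on the node and parsing the offset
column of its peer listing:

::

    """Sketch for a hypothetical ntp_metrics.py common script."""
    import subprocess

    def clock_offset_ms():
        """Return the offset (ms) to the currently selected NTP peer."""
        output = subprocess.run(["ntpq", "-pn"], capture_output=True,
                                text=True, check=True).stdout
        for line in output.splitlines():
            if line.startswith("*"):  # '*' marks the selected peer
                return float(line.split()[8])  # offset column, in ms
        raise RuntimeError("no NTP peer currently selected")

    def assert_clock_synchronized(max_offset_ms=10.0):
        """Fail fast when the clock is too far from the selected peer."""
        offset = clock_offset_ms()
        if abs(offset) > max_offset_ms:
            raise RuntimeError("clock offset %.2f ms too large" % offset)
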
Most of the details of whether to re-use a common script from the common
directory or to handle the measurement within the test itself will be left to
the test implementation.

* Performance-test-runner.py:

The main command-line interface to execute the test cases on the system under
test.

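A minimal sketch of the runner's command-line handling (assuming the
``--test`` and ``--compute`` options from the earlier example, and that each
test case module exposes a hypothetical ``run()`` entry point) could be:

::

    #!/usr/bin/env python3
    """Sketch of performance-test-runner.py: dispatch to a test case module."""
    import argparse
    import importlib

    def main():
        parser = argparse.ArgumentParser(
            description="Run a StarlingX performance test case")
        parser.add_argument("--test", required=True,
                            help="test case name, e.g. failed_vm_detection")
        parser.add_argument("--compute", type=int, default=0,
                            help="index of the compute node under test")
        args = parser.parse_args()

        # Test cases live in tests_cases/<name>.py and expose run().
        module = importlib.import_module("tests_cases." + args.test)
        module.run(compute=args.compute)

    if __name__ == "__main__":
        main()
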
* Test cases directory:

This directory can contain wrapper scripts for existing upstream performance
test cases as well as new test cases specific to StarlingX.

A view of the directory layout will look like:

::

    performance/
    ├── common
    │   ├── latency_metrics.py
    │   ├── network_generator.py
    │   ├── ntp_metrics.py
    │   └── statistics.py
    ├── tests_cases
    │   ├── failed_compute_detection.py
    │   ├── failed_control_detection.py
    │   ├── failed_network_detection.py
    │   ├── failed_vm_detection.py
    │   ├── neutron_test_case_1.py
    │   ├── opnfv_test_case_1.py
    │   ├── opnfv_test_case_2.py
    │   ├── rally_test_case_1.py
    │   └── rally_test_case_2.py
    └── performance-test-runner.py

The goal is that anyone in the StarlingX community can either define
performance test cases or re-use existing ones from other projects, and
create scripts to measure performance in a scalable and repeatable framework.

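To make that concrete, a new test case could follow a small, uniform shape. A
hedged sketch (assuming the hypothetical ``run()`` entry point used by the
runner sketch above, and placeholder helpers for the platform-specific steps)
might be:

::

    """Sketch of a test case module, e.g. tests_cases/failed_vm_detection.py.

    The failure-injection and detection steps are placeholders; a real test
    case would call platform-specific tooling there.
    """
    import csv
    import time

    def inject_vm_failure(compute):
        """Placeholder: force a VM failure on the selected compute node."""

    def wait_for_detection():
        """Placeholder: poll until the platform reports the failed VM."""

    def run(compute=0, runs=10, output="failed_vm_detection.csv"):
        """Measure detection latency over several runs and write a CSV."""
        with open(output, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["run", "latency_ms"])
            for index in range(runs):
                start = time.monotonic()
                inject_vm_failure(compute)
                wait_for_detection()
                writer.writerow([index, (time.monotonic() - start) * 1000.0])
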
In the future, it might be possible to evaluate how to connect some basic
performance test cases to Zuul to detect regressions in commits to be merged.
This will require setting up a robust infrastructure to run the sanity
performance test cases for each commit to be merged.

Assignee(s)
-----------

Wang Hai Tao
Elio Martinez
Juan Carlos Alonzo
Victor Rodriguez

Repos Impacted
--------------

https://opendev.org/starlingx/test

Work Items
----------

* Develop automated scripts for core test cases, such as:

  * Detection of a failed VM
  * Detection of a failed compute node
  * Controller/node failure detection and recovery
  * Detection of a network link failure
  * DPDK live migration latency
  * Average/maximum host/guest latency (Cyclictest)
  * Swact time

* Develop performance-test-runner.py to call each test case script
* Integrate performance-test-runner.py with the pytest framework
* Implement the cloning of OPNFV test cases
* Implement the execution of OPNFV test cases
* Implement a monitoring pipeline to automatically detect changes in upstream
  test cases
* Automate a weekly mail with the list of upstream performance test cases
  that changed
* Implement automatic test case documentation based on pydoc
* Document the framework on the StarlingX wiki
* Send regular status to the mailing list and demos on the community call for
  feedback

Dependencies
============

None

Testing
=======

None

Documentation Impact
====================

The performance test cases will be documented with pydoc
(https://docs.python.org/3.0/library/pydoc.html). The pydoc module
automatically generates documentation from Python modules. The documentation
can be presented as pages of text on the console, served to a web browser, or
saved to HTML files.

The goal is to automatically generate the documentation for the end-user
testing guide. This methodology will catch it when the community adds or
modifies a performance test case. At the same time, if a standard config
option changes or is deprecated, the test framework documentation will be
updated to reflect the change.

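For example, the HTML documentation for a test case module could be generated
from within Python (assuming the hypothetical module path from the layout
above, which must be importable):

::

    # Write tests_cases.failed_vm_detection.html from the module docstrings.
    import pydoc

    pydoc.writedoc("tests_cases.failed_vm_detection")
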
References
==========


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Stein
     - Introduced