Commit Graph

17 Commits

Author SHA1 Message Date
Davi Frossard 6d4e2681a0 Merge sysinv_fpga_agent with sysinv_agent
Merge the sysinv-fpga-agent service into sysinv-agent
in order to reduce overall OS overhead.

Replace the calls to "wait_for_n3000_reset()" and "wait_for_host_uuid()"
in the previous fpga-agent-manager with checks that ensure FPGA devices
are reset and host_uuid is available in agent-manager. Also, the content
of the "fpga_pci_update()" and "report_fpga_inventory()" methods is
inserted directly into the body of the "agent_audit()" method.

Test Plan:

On AIO-DX env (CentOS):
<sysinv-fpga-agent tests>
PASS: Check FPGA pod and its resources.
PASS: Check FPGA pod and its resources after lock/unlock.
PASS: Check FPGA pod and its resources after the system reboot.
PASS: Verify image upload with non-functional image with
retimer-included
PASS: Verify retimer_a_version and retimer_b_version after applying
BMC image with re-timer and bmc
PASS: Verify firmware update for BMC and retimer image with
retimer-include=False
PASS: Verify apply BMC image without re-timer first and then BMC
image with re-timer, only latest image is kept in
device-image-state-list
PASS: Test accelerator configuration is persistent after lock/unlock.
PASS: Test to verify that the accelerator configuration is persistent
after a graceful reboot.

<sysinv-agent tests>
PASS: Verify alarms raised by PTP feature
PASS: Verify the configuration and run of single ptp-instance
PASS: Verify the configuration and run of single phc2sys
PASS: Verify PTP CLI commands

On AIO-SX env (Debian):
PASS: Check FPGA pod and its resources.
PASS: Check FPGA pod and its resources after lock/unlock.
PASS: Check FPGA pod and its resources after system reboot.
PASS: Check if FPGA device can be detected, configured.
PASS: Test accelerator configuration is persistent after lock/unlock.
PASS: Test to verify that the accelerator configuration is persistent
after graceful reboot.

Story: 2010087
Task: 45628

Signed-off-by: Davi Frossard <dbarrosf@windriver.com>
Change-Id: I83edd261898498344001ca90bb53a5f65e66728c
2022-10-03 14:12:28 -04:00
Sabeel Ansari 0fa425fd94 Add cert-alarm service
This new service raises alarms for expiring certificates.
The service will run on all active controllers.

The initial commit is just the skeleton structure. Tested by installing
an ISO image and confirming that the new service comes up under
'smc service-list'.

Story: 2008946
Task: 42852

Depends-on: https://review.opendev.org/c/starlingx/ha/+/801118
Depends-on: https://review.opendev.org/c/starlingx/stx-puppet/+/801117

Signed-off-by: Sabeel Ansari <Sabeel.Ansari@windriver.com>
Change-Id: I8ee12db6903bd2d29230e4adf9f1823b3d4655ee
2021-07-22 08:29:23 -04:00
Bin Qian 6acd2e3564 Single puppet manifest for AIO controllers
Create a single puppet manifest for AIO controllers.
This change includes:
1. remove workerconfig from an AIO controller deployment
2. run puppet based on the subfunctions of the nodes

Depends-on: https://review.opendev.org/c/starlingx/stx-puppet/+/780600
Partial-Bug: 1918139
Signed-off-by: Bin Qian <bin.qian@windriver.com>
Change-Id: Ie3693219e3c19460ac5b617cc216cbc809ec2403
2021-04-14 22:05:55 -04:00
Bin Qian 8df382b256 Add cert-mon service
Add a new certificate monitoring service.
This service monitors the certificates of the
admin endpoint,
admin endpoint subcloud intermediate CA, and
admin endpoint DC root CA.
The certificates are managed and renewed by cert-manager.
This change includes monitoring the admin endpoint certificate and
applying the new certificate (crt+key) to be used by haproxy for
admin endpoint https.
Renewing the admin endpoint certificate also replaces the private
key. The implementation is a workaround that deletes the secret
so that cert-manager regenerates the certificate with a new private
key. Currently cert-manager has a bug preventing rekey when
renewing a certificate.

Monitoring of intermediate CA and DC root CA will be coming soon.

Passed TCs:
1. provisioned cert-mon service on system controller and subcloud
   controller, successfully swact

2. simulate endpoint certificate renewal by shortening the endpoint
   certificate expiry time.
   observed the certificate (/etc/ssl/private/admin-ep-cert.pem)
   updated.
   verify admin endpoints accessible (local or remotely)
   verify admin endpoints accessible after haproxy restart

3. simulate an action to fail (hardcoded) and observe the action
   being reattempted the configured number of times before giving up.
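
The retry behaviour exercised by test case 3 can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
function name run_with_retries and its parameters are hypothetical and
are not taken from the cert-mon code, and the real service may also
wait between attempts.

```python
from typing import Callable


def run_with_retries(action: Callable[[], None], max_attempts: int) -> int:
    """Run `action` until it succeeds or `max_attempts` is reached.

    Hypothetical helper for illustration only. Returns the number of
    attempts used on success and re-raises the last failure once the
    configured number of attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            action()
            return attempt
        except Exception:
            # Give up only after the configured number of attempts.
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")
```

A hardcoded failing action, as in the test case, would be attempted
max_attempts times before the error is propagated to the caller.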

Story: 2007347
Task: 40168

Depends-on: https://review.opendev.org/#/c/739890
Depends-on: https://review.opendev.org/#/c/741511
Depends-on: https://review.opendev.org/#/c/741993
Change-Id: Ie341e2e4896c291b7485e95c89c5c3f370ffea00
2020-07-20 14:06:31 -04:00
Chris Friesen 152604297d sysinv FPGA agent initial commit
This creates a new sysinv FPGA agent.  On startup it will perform
an initial scan for supported FPGA devices and report the current
hardware status to sysinv-conductor via RPC.

It also provides basic support for flashing the specified device
images to the FPGA device using Intel-supplied tools running
in a Docker container.

Initially only the Intel N3000 FPGA is supported.

Story: 2006740
Task:  39927
Change-Id: Id8a6510a2d8cd072737a98c5d909f94dbf10a763
Depends-On: I63cfa7698285a1a43f1e9e4b98e9a536fc3dc682
2020-06-07 23:54:01 -06:00
Scott Little 7b58c19c5a Config file changes for packages relocated to repo 'platform-armada-app'
List of relocated subdirectories:

kubernetes/platform/stx-platform/stx-platform-helm
kubernetes/helm-charts/ceph-pools-audit
kubernetes/helm-charts/rbd-provisioner

Story: 2006166
Task: 35687
Depends-On: I00f54876e7872cf0d3e4f5e8f986cb7e3b23c86f
Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3
Change-Id: Ibca91cc733e27cd9fb4926b7151cfa8a7976a59d
Signed-off-by: Scott Little <scott.little@windriver.com>
2019-09-05 11:52:28 -04:00
Scott Little 228d8917b8 Config file changes for packages relocated to repo 'stx-puppet'
List of relocated subdirectories:

puppet-manifests
puppet-modules-wrs/puppet-dcdbsync
puppet-modules-wrs/puppet-dcmanager
puppet-modules-wrs/puppet-dcorch
puppet-modules-wrs/puppet-fm
puppet-modules-wrs/puppet-mtce
puppet-modules-wrs/puppet-nfv
puppet-modules-wrs/puppet-patching
puppet-modules-wrs/puppet-smapi
puppet-modules-wrs/puppet-sshd
puppet-modules-wrs/puppet-sysinv

Story: 2006166
Task: 35687
Depends-On: I6c62895f8dda5b8dc4ff56680c73c49f3f3d7935
Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3
Change-Id: I00f54876e7872cf0d3e4f5e8f986cb7e3b23c86f
Signed-off-by: Scott Little <scott.little@windriver.com>
2019-09-05 11:52:28 -04:00
Scott Little 217484d4f1 Config file changes to add 'tsconfig' after relocation from 'update'
Story: 2006166
Task: 35687
Depends-On: Ie6fc7b2a185168424cb6158e817b6e240af89d5e
Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3
Change-Id: I6c62895f8dda5b8dc4ff56680c73c49f3f3d7935
Signed-off-by: Scott Little <scott.little@windriver.com>
Depends-On: Ib9f912a6c776512983ce187fdf594b6274d13ce0
2019-09-05 11:51:05 -04:00
Scott Little e68e6b1199 Config file changes for packages relocated to repo 'utilities'
List of relocated subdirectories:

pm-qos-mgr
worker-utils

Story: 2006166
Task: 35687
Depends-On: I520d1d7f890f298d59998cb15613efa2233e329a
Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3
Change-Id: Ie6fc7b2a185168424cb6158e817b6e240af89d5e
Signed-off-by: Scott Little <scott.little@windriver.com>
2019-09-05 10:43:21 -04:00
Tee Ngo eb47ee4585 Remove playbookconfig from StarlingX config repo
This commit is part of a multi-commit move of Ansible
playbooks to the new repo that hosts playbooks and
artifacts related to the deployment of StarlingX.

Tests:
  Install and run bootstrap in simplex and standard systems.

Story: 2004695
Task: 33567
Depends-On: Iddbe2cb8105ede96d29e2a1d4bb29031a36f327f
Change-Id: I60b9bce3f3d23a2316b3a24c48006b71ff3ecd52
Signed-off-by: Tee Ngo <tee.ngo@windriver.com>
2019-06-14 13:38:50 -04:00
Jim Gauld 76b1a7a16f Introduce PM QoS cpu latency manager for kubelet
This creates a daemon 'pm-qos-mgr' that monitors kubelet cpu-manager
static cpu assignments and modifies PM QoS CPU wakeup latency to
govern the C-states per CPU.

Guaranteed pods with exclusive CPUs get "low" cpu wakeup latency
policy (the c-state is capped at C1).

Remaining pods (i.e., with default CPUs) get "high" cpu wakeup latency
policy, so the cpu may go into higher c-state if idle.
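
The policy split described above can be sketched as a simple mapping.
This is a minimal illustration, not the pm-qos-mgr implementation: the
function name latency_policy and its arguments are hypothetical, and
the sketch neither reads the kubelet cpu-manager state nor writes any
PM QoS setting.

```python
from typing import Dict, Iterable

LOW = "low"    # wakeup latency policy for exclusive CPUs (C-state capped at C1)
HIGH = "high"  # wakeup latency policy for the remaining (default) CPUs


def latency_policy(all_cpus: Iterable[int],
                   exclusive_cpus: Iterable[int]) -> Dict[int, str]:
    """Map each CPU to a wakeup latency policy.

    Hypothetical helper: CPUs exclusively assigned to guaranteed pods
    get LOW, every other CPU gets HIGH.
    """
    exclusive = set(exclusive_cpus)
    return {cpu: (LOW if cpu in exclusive else HIGH)
            for cpu in sorted(set(all_cpus))}
```

For example, with four CPUs of which 2 and 3 are exclusively assigned,
CPUs 0 and 1 would be mapped to "high" and CPUs 2 and 3 to "low".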

Change-Id: I8470217dc53b6a7912b5023d8c0b04d966357222
Closes-Bug: 1830545
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
2019-05-28 22:53:16 -04:00
Robert Church c937123901 Add stx-platform-helm RPM
Provide a new helm RPM that will be included with and installed by the
ISO.

This RPM will package an application tarball that is to be managed by
the system without user intervention but supports all the existing
mechanisms that the user installed applications use.

The intent of this application is to provide any charts required for
platform integration and install any of those charts through an armada
manifest.

The armada manifest will reference a new helm repository that will be
enabled with a future commit. This repository is intended for platform
specific charts installed by RPMs. The existing helm repository is
kept for optional runtime user installed applications.

Change-Id: Ia34d3a1efa1006b184cc35e17afb231064cf36f9
Story: 2005424
Task: 30452
Signed-off-by: Robert Church <robert.church@windriver.com>
2019-04-29 13:35:29 -04:00
Tee Ngo 56275fb5b0 Ansible Bootstrap Deployment
This commit is the initial submission of the bootstrap playbook, which
enables the bootstrap of the initial controller. The playbook
defaults are meant for configuring the localhost in a vbox
development environment. A custom hosts file and user overrides
are required for configuring multiple hosts and lab-specific setups.
A secret file and SSH keys are required for production test environments.

Tests performed:
 - installation
 - config_controller complete to ensure the current method of
   configuring the first controller is intact
 - localhost bootstrap with default hosts file
 - multiple remote hosts bootstrap with custom hosts file
 - reconfigurations with user overrides
 - stx-application applied in AIOSX and AIODX
 - Failure & skip play cases (invalid config inputs, incorrect load,
   connection failure, no changes replay, etc...)

TODO:
 - Support for standard & storage configurations
 - Docker proxy/custom registry related tests
 - Package bootstrap playbook in SDK
 - Config_controller cleanup

Change-Id: If553f1eeed32606bacc690ef277e60606e9d93ea
Story: 200476
Task: 29686
Task: 29687
Co-Authored-By: Ovidiu Poncea <ovidiu.poncea@windriver.com>
Signed-off-by: Tee Ngo <tee.ngo@windriver.com>
2019-04-11 08:40:34 -04:00
Kristine Bujold a1e2d1e183 Remove wrs-configutilities SDK Module
Remove configutilities and move what is being used in other components
to controllerconfig.

Tested with a clean install on AIO-DX and running config_controller.

With the StarlingX move to supporting pure upstream OpenStack, the
majority of the SDK Modules are related to functionality no longer
supported. The remaining SDK Modules will be moved to StarlingX
documentation.

Story: 2005275
Task: 30262

Change-Id: Ie496548dfc6efee677a501c98c227c586df0a7d6
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
2019-04-02 11:50:23 -04:00
Tee Ngo e45b3fb204 Include Ansible related packages in the ISO
This commit includes Ansible related packages in the image
in preparation for the controller bootstrap playbook.

Tests done:
  - Successful installation
  - Successful execution of a sample playbook

Story: 2004695
Task: 29379

Change-Id: I6fa27eedcc5af9e6e9ac94722860927dbdcf3383
Signed-off-by: Tee Ngo <tee.ngo@windriver.com>
2019-02-06 18:34:11 -05:00
Tao Liu 6256b0d106 Change compute node to worker node personality
This update replaces the compute personality & subfunction
with worker, and updates internal and customer-visible
references.

In addition, the compute-huge package has been renamed to
worker-utils as it contains various scripts/services that are
used to affine running tasks or interface IRQs to specific CPUs.
The worker_reserved.conf is now installed to /etc/platform.

The cpu function 'VM' has also been renamed to 'Application'.

Tests Performed:
Non-containerized deployment
AIO-SX: Sanity and Nightly automated test suite
AIO-DX: Sanity and Nightly automated test suite
2+2 System: Sanity and Nightly automated test suite
2+2 System: Horizon Patch Orchestration
Kubernetes deployment:
AIO-SX: Create, delete, reboot and rebuild instances
2+2+2 System: worker nodes are unlocked/enabled with no alarms

Story: 2004022
Task: 27013

Change-Id: I0e0be6b3a6f25f7fb8edf64ea4326854513aa396
Signed-off-by: Tao Liu <tao.liu@windriver.com>
2018-12-13 14:15:55 -05:00
Scott Little 63be112368 Split image.inc across git repos
Currently, compiling a new package and adding it
to the iso still requires a multi-git update because
image.inc is a single centralized file in the root git.

It would be better to allow a single git update to add
a package. To allow this, image.inc must be split across
the git repos and the build tools must be changed to
read/merge those files to arrive at the final package list.

The current scheme is to name the image.inc files using this
schema:

${distro}_${build_target}_image_${build_type}.inc

distro = centos, ...
build_target = iso, guest ...
build_type = std, rt ...

Traditionally build_type=std is omitted from config files,
so we instead use ${distro}_${build_target}_image.inc.
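
As a rough sketch of the naming rule and the merge step described
above (not the actual build tooling): the helper names image_inc_name
and merge_package_lists are hypothetical, and the treatment of blank
lines and '#' comments is an assumption made for the example.

```python
from typing import Iterable, List


def image_inc_name(distro: str, build_target: str,
                   build_type: str = "std") -> str:
    """Return the image.inc file name for a distro/target/type triple."""
    base = f"{distro}_{build_target}_image"
    # build_type "std" is traditionally omitted from the file name.
    if build_type == "std":
        return f"{base}.inc"
    return f"{base}_{build_type}.inc"


def merge_package_lists(lists: Iterable[Iterable[str]]) -> List[str]:
    """Merge per-repo package lists into one list, keeping first-seen order.

    Assumption: blank lines and lines starting with '#' are skipped,
    and a package named in several repos is listed only once.
    """
    seen = set()
    merged = []
    for lines in lists:
        for line in lines:
            name = line.strip()
            if not name or name.startswith("#"):
                continue
            if name not in seen:
                seen.add(name)
                merged.append(name)
    return merged
```

With these helpers, image_inc_name("centos", "iso") gives
centos_iso_image.inc, while image_inc_name("centos", "iso", "rt")
gives centos_iso_image_rt.inc.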

Change-Id: I7c594c48ee74f8aceebe52f804b8670379acdf77
Story: 2003447
Task:  24649
Depends-On: Ib39b8063e7759842ba15330c68503bfe2dea6e20
Signed-off-by: Scott Little <scott.little@windriver.com>
2018-08-16 10:08:08 -04:00