We were missing one line of pmon configuration, the line that keeps
PMON from taking action until worker configuration is complete.
Tested on AIO-DX; the test passes.
The following log can be seen in pmond.log:
pci-irq-affinity-agent monitoring is waiting on
/var/run/.worker_config_complete
Before the openstack application finishes applying, no degrade or
pci-irq-affinity-agent restart is seen.
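The gating described above can be sketched as follows. This is an illustration only, not pmon's implementation: the helper name monitoring_enabled is hypothetical, and only the flag path comes from the log line.

```python
import os

# Flag file named in the pmond.log message above.
WORKER_CONFIG_COMPLETE = "/var/run/.worker_config_complete"


def monitoring_enabled(flag_path=WORKER_CONFIG_COMPLETE):
    """Return True once the readiness flag file exists.

    Hypothetical helper: a monitor would skip restart and degrade
    actions for the process while this returns False.
    """
    return os.path.exists(flag_path)
```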
Closes-bug: 1839525
Change-Id: I3d28ea4afa2f7e65dcc9e48a7a46b4f80c574e3e
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
The filesystem /opt/cgcs is removed and its content moved under
/opt/platform.
Resources related to drbd-cgcs and /opt/cgcs are updated to
drbd-platform and /opt/platform.
Tested in AIO-SX, AIO-DX and Standard hardware labs.
Depends-On: https://review.opendev.org/674360
Partial-Bug: 1830142
Change-Id: I6d0555f00ab269f7d9567fff365180b66adce8b3
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
Ensure that pci-irq-affinity-agent is launched on worker nodes.
This includes AIO and standard configs.
The root cause is in the agent start script, which starts the agent
only if the node type is worker. On AIO the node type is controller,
so the agent never starts. pmon then restarts it again and again,
which eventually degrades the controller.
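A minimal sketch of the corrected check, assuming the host reports a node type plus a comma-separated list of subfunctions. The function and parameter names are illustrative, not taken from the start script.

```python
def has_worker_function(nodetype, subfunctions=""):
    """Return True if the host should run the agent.

    Assumption: an AIO host reports nodetype "controller" but lists
    "worker" among its comma-separated subfunctions.
    """
    funcs = {f.strip() for f in subfunctions.split(",") if f.strip()}
    return nodetype == "worker" or "worker" in funcs
```

Checking only the node type would reject the AIO case; checking the subfunction list as well accepts it.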
The following tests pass for AIO:
1) Pci-irq-affinity-agent started normally before openstack apply.
After openstack apply, related openstack config applied to
agent config file as expected.
2) Verified agent started normally in non-openstack worker node for
both AIO and multi-node.
No degrade in controller node.
Change-Id: I73e9dff0358b7ed86bfaaadac834e19fe227892f
Closes-Bug: #1828877
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
With openstackclients moving to a container, issuing openstack
commands to the application side would become difficult due to
long "kubectl" commands and the random nature of pod names.
This commit introduces a wrapper for the containerized client
that automatically passes the desired command to the pod.
This commit also introduces a wrapper for copying files directly
to the client's container for commands that need filesystem access
(for example, creating images with "openstack image create").
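As a rough illustration of what the command wrapper has to assemble, the sketch below builds a "kubectl exec" argument list. The namespace default and the pod name are placeholders, and pod discovery is left out; the actual wrapper may work differently.

```python
def build_exec_command(pod_name, client_args, namespace="openstack"):
    """Build a `kubectl exec` argument list that runs openstack in a pod.

    Assumptions: the namespace default and pod name are placeholders;
    a real wrapper must first look up the randomly named pod.
    """
    prefix = ["kubectl", "exec", "-n", namespace, pod_name, "--", "openstack"]
    return prefix + list(client_args)
```

For example, build_exec_command("clients-abc12", ["server", "list"]) yields the full command that a user would otherwise type by hand.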
We also alias the default openstack command to the containerized
client. The platform openstack command is aliased to
"platform-openstack".
Change-Id: I7b204bb05381d38f4f25066561e001bb8247943b
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
Story: 2005312
Task: 30603
Depends-on: I58a5d511cf54dacc018bfb88848899b92a774087
Create an agent which runs on each worker node to do pci interrupt
affinity work.
nova-sriov is now installed by this new package instead of the old
nova-utils.
The tests below were run and pass; see the detailed test spec in the
story link.
1) deployment test with/without openstack application
2) Periodic audit pci irq affinity
3) Remove VM without sriov pci port
4) Remove VM with sriov pci port
5) Add VM without sriov pci port
6) Add VM with sriov pci port
7) Add VM without pci_irq_affinity_mask
8) Add VM without cpu policy set
9) VM resize test
10) Remove one pci port for VM
The code framework is as follows:
+------------+        +--------------+        +------------+
|            |        |              |        |            |
|            |        |              |        |            |
|  Agent.py  | -----> | affinity.py  | -----> | driver.py  |
|            |        |              |        |            |
|   Daemon   |        |   Conduct    |        |    Drv     |
|            |        |              |        |            |
+------------+        +--------------+        +------------+
Story: 2004600
Task: 28850
Depends-on: https://review.opendev.org/#/c/640263/
Depends-on: https://review.opendev.org/#/c/654415/
Change-Id: Ie668036efe4d0013fed8cd45805f0321692c76f0
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
Most of the openstack processes are containerized so there is no
need for them to be included in the patch restart scripts, or
the syslog configuration and log rotation files.
Story: 2004764
Task: 30668
Change-Id: Ib342fa7b594cdafa5d7c7575044ea28783daf9d0
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
Flake8 currently ignores the following errors:
H401: docstring should not start with a space
H404: multi line docstring should start without a leading new line
H405: multi line docstring summary not separated with an empty line
Enable them for more consistent formatting of docstrings
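For reference, a docstring that satisfies all three checks might look like this. The function itself is a made-up example.

```python
def link_state(name):
    """Return the state of the named link.

    The summary starts on the opening line with no leading space
    (H401, H404) and is separated from this body by an empty line
    (H405).
    """
    return "up" if name else "unknown"
```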
Change-Id: I385e28e9c6eca3c02a3def51ff64b00b7a63a853
Story: 2004515
Task: 30076
Signed-off-by: Eric Barrett <eric.barrett@windriver.com>
Flake8 currently ignores the following Errors:
E121: continuation line under-indented for hanging indent
E123: closing bracket doesn't match indentation of opening bracket
E124: closing bracket doesn't match visual indentation
E125: continuation line with same indent as next logical line
E126: continuation line over-indented for hanging indent
E127: continuation line over-indented for visual indent
E128: continuation line under-indented for visual indent
Enable them for more consistent formatting of code
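A short illustration of continuation lines that pass these checks. The names are invented for the example.

```python
# Visual indent: the continuation aligns with the opening bracket
# (satisfies E127 and E128).
def total(first, second,
          third=0):
    return first + second + third


# Hanging indent: four spaces, with the closing bracket aligned to the
# start of the line that opened it (satisfies E121, E123 and E126).
values = [
    1, 2,
    3,
]

result = total(
    values[0], values[1],
    third=values[2])
```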
Change-Id: I415d4824a1f335ba3fceb488b0ae60b9861a036a
Story: 2004515
Task: 30076
Signed-off-by: Eric Barrett <eric.barrett@windriver.com>
Flake8 currently ignores a number of whitespace related errors:
E201: whitespace after '['
E202: whitespace before '}'
E203: whitespace before ':'
E211: whitespace before '('
E221: multiple spaces before operator
E222: multiple spaces after operator
E225: missing whitespace around operator
E226: missing whitespace around arithmetic operator
E231: missing whitespace after ','
E251: unexpected spaces around keyword / parameter equals
E261: at least two spaces before inline comment
Enable them for more thorough testing of code
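A small example of code that is clean under these whitespace checks. The names are invented for the example.

```python
def scale(values, factor=2):  # no spaces around "=" in defaults (E251)
    # No whitespace after "[" or before "(" (E201, E211); one space
    # on each side of an operator (E221, E222, E225).
    return [value * factor for value in values]


limits = {"cpu": 90, "memory": 80}  # nothing before ":" or "}" (E202, E203)
scaled = scale([1, 2, 3], factor=3)  # a space follows each "," (E231)
```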
Change-Id: Id03f36070b8f16694a12f4d36858680b6e00d530
Story: 2004515
Task: 30076
Signed-off-by: Eric Barrett <eric.barrett@windriver.com>
With the StarlingX move to supporting pure upstream OpenStack, the
majority of the SDK Modules are related to functionality no longer
supported. The remaining SDK Modules will be moved to StarlingX
documentation.
Story: 2005275
Task: 30173
Change-Id: I82823d26d02f23d39cbc715b78d339b63321e8d3
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
All rmon resource monitoring has been moved to collectd.
This update removes rmon from mtce and the load.
Story: 2002823
Task: 30045
Test Plan:
PASS: Build and install a standard system.
PASS: Inspect mtce rpm list
PASS: Inspect logs
PASS: Check pmon.d
Depends-On: https://review.openstack.org/#/c/643739
Change-Id: I927862895272fdd024d281ab49e0a128465b1b3f
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Remove references to nova and neutron api proxy
Remove the puppet patches that are no longer required
Story: 2004766
Task: 30020
Change-Id: I38daec333dd0a47376be014b4c108d3c92e0b963
Signed-off-by: Tao Liu <tao.liu@windriver.com>
This update introduces interface monitoring for oam,
mgmt and infra networks as a collectd plugin.
The interface plugin runs and queries the new maintenance
Link Monitor daemon for Link Model and Information every
10 seconds.
The plugin then manages port and interface alarms based on the link
model, similar to how rmon did in the past.
Severity: Interface and Port levels
Alarm Level  Minor  Major                  Critical
-----------  -----  ---------------------  ----------------------------
Interface    N/A    One of lag pair is Up  All Interface ports are Down
Port         N/A    Physical Link is Down  N/A
Degrade support for interface monitoring is added to the mtce
degrade notifier. Any link down condition results in a host
degrade condition, as it did with rmon.
Sample Data: represented as % of total links Up for that network interface
100 or 100% percent used - all links of interface are up.
50 or 50% percent used - one of lag pair is Up and the other is Down
0 or 0% percent used - all ports for that network are Down
The plugin documents all of this in its header.
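The sample value and the interface severity levels can be sketched like this. It is an illustration under stated assumptions, not the plugin's code; the function names are hypothetical.

```python
def percent_links_up(link_states):
    """Return the percentage of links that are up, as a float."""
    if not link_states:
        return 0.0
    up_count = sum(1 for up in link_states if up)
    return 100.0 * up_count / len(link_states)


def interface_severity(link_states):
    """Map link states to an interface alarm level.

    Assumption: any mix of up and down ports is treated like the
    "one of lag pair is Up" case (major).
    """
    if all(link_states):
        return None  # every port is up, no alarm
    if any(link_states):
        return "major"
    return "critical"
```

Each entry in link_states is True for a link that is up, so a lag pair with one failed link is [True, False].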
This update also
1. Adds the new lmond process to syslog-ng config file.
2. Adds the new lmond process to the mtce patch script.
3. Modifies the cpu, df and memory threshold settings by -1.
rmon thresholds were precise whereas collectd requires
that the samples cross the thresholds, not just meet them.
So for example, in terms of a 90% usage action the
threshold needs to be 89.
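The off-by-one adjustment can be shown with a small sketch, assuming integer percentage samples and a strict greater-than comparison. The helper names are hypothetical.

```python
def collectd_threshold(action_percent):
    """Return the threshold to configure so action starts at action_percent.

    Assumes integer percentages and a strict "sample > threshold" test.
    """
    return action_percent - 1


def crosses(sample, threshold):
    """Return True when the sample crosses, not merely meets, the threshold."""
    return sample > threshold
```

With a configured value of 90, a sample of exactly 90 would not trigger; with 89 it does.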
Test Plan: (WIP but almost complete)
PASS: Verify interface plugin startup
PASS: Verify interface plugin logging
PASS: Verify interface plugin Link Status Query and response handling
PASS: Verify monitor, sample storage and grafana display
PASS: verify port and interface alarm matches what rmon produced
PASS: Verify lmon port config from manifest configured plugin
PASS: Verify lmon port config from lmon.conf
PASS: Verify single interface failure handling and recovery
PASS: Verify lagged interface failure handling and recovery
PASS: Verify link loss of lagged interface shared between mgmt and oam (hp380)
PASS: Verify network interface failure handling ; single port
PASS: Verify network interface degrade handling ; lagged interface
PEND: Verify network interface degrade handling ; vlan interface
PASS: Verify HTTP request timeout period and handling
PASS: Verify link status query failure handling - invalid uri (timeout)
PASS: Verify link status query failure handling - missing uri (timeout)
PASS: Verify link status query failure handling - status fail
PASS: Verify link status query failure handling - bad json resp
Change-Id: I2e2dfe6ddfa06a46770245540c7153d330bdf196
Story: 2002823
Task: 28635
Depends-On: https://review.openstack.org/#/c/633264
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Use proper absolute paths to import modules.
Stop ignoring 'H301: one import per line'.
Story: 2002909
Task: 24886
Change-Id: I1d72e68ead64492ff0c74f8c1bf1b460b573bc1e
Signed-off-by: Sun Austin <austin.sun@intel.com>
According to the analysis from Saul in task 26455, we can remove the
motd patch for crontabs and then use the RPM instead of the SRPM for it.
We also need to remove usage of --without-progname in utilities/
update-motd/files/motd-update.
Story: 2003765
Task: 28181 & 28182
Depends-on: https://review.openstack.org/#/c/623385/
Change-Id: I4be7d47ee77ac07eb24f5b88cd707c29b595df7a
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
Previous commit 01f5fdd made a required change to filter
infrastructure traffic on the management interface with an 802.1q
protocol in the case of a consolidated interface.
However, this has caused the remote logging tc script to have a
failure. The script tries to install 'ip' protocol filters at the
same priority as the 802.1q filters, which is rejected by the
kernel.
This commit detects a consolidated interface situation and bumps
the priority of the remote logging tc filter on the management
interface, similarly to what is done in the main cgcs_tc_setup
script.
The file has also been cleaned up to pass bashate.
Related-Bug: #1807055
Change-Id: Id11625c0f9bcbf109f574563ff284d4a36bc6377
Signed-off-by: Steven Webster <steven.webster@windriver.com>
On master branch, PLATFORM_RELEASE should always have a value
that lies between that of the last release (18.10), and that of the
next anticipated release (19.06 was the last I heard).
Setting it to 19.01
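The ordering constraint can be checked by comparing the releases as (year, month) integer pairs. The helper names below are hypothetical and only illustrate the rule.

```python
def release_key(release):
    """Split a release such as "18.10" into a comparable integer pair."""
    major, minor = release.split(".")
    return int(major), int(minor)


def lies_between(candidate, last, upcoming):
    """Return True if candidate sorts strictly between last and upcoming."""
    return release_key(last) < release_key(candidate) < release_key(upcoming)
```

lies_between("19.01", "18.10", "19.06") holds, which is why 19.01 is an acceptable value.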
Story: 2004596
Task: 28487
Change-Id: I5e34e1fcdec39f0ce0205ea94c73d8a5d5c73bc9
Signed-off-by: Scott Little <scott.little@windriver.com>
Sometime after kernel 3.10.0-514.16.1.X, tc filter commands no longer
match 802.1q packets when the filter protocol is set to 'ip'.
This poses a problem for a consolidated (eg. infra w/ vlan over
management) interface configuration.
The tc filter will operate properly on the vlan interface, but all
traffic will go to the default qdisc (low priority) when it arrives
with a vlan tag at the sub-interface.
This commit sets the filter protocol to '802.1q' in the case of a
subinterface with a vlan tagged interface on top of it.
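The protocol selection can be sketched as below. The device name, parent handle and priority are placeholder values, the match rules are omitted, and the real script is a shell script that may structure this differently.

```python
def tc_filter_prefix(dev, has_vlan_on_top, prio=1, parent="1:0"):
    """Build the leading arguments of a `tc filter add` command.

    Assumptions: prio and parent are placeholder values; the match
    and flowid arguments that follow are not shown.
    """
    protocol = "802.1q" if has_vlan_on_top else "ip"
    return ["tc", "filter", "add", "dev", dev, "parent", parent,
            "protocol", protocol, "prio", str(prio)]
```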
Some bashate cleanup has also been done on this file.
Closes-Bug: #1807055
Change-Id: I457faa2b56bbd270c104cc0313ffe3cc1bfd4db3
Signed-off-by: Steven Webster <steven.webster@windriver.com>
The log_functions.sh script file was dropped in a
recent edit of the compute-huge rpm.
Some scripts depend on this file for log utilities.
This update moves log_functions.sh out of compute-huge
into platform-util and re-installs it in its previous
location /etc/init.d
Story: 2004043
Task: 28462
Change-Id: I4efb0a63f29bc446e7efd86cea7488f3e2e362df
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
- add barbican logs in syslog
- support no reboot patching for barbican processes
- get information about barbican in collect script
Change-Id: I75557a2d35d3861c2dee3d0a5a0960bebc6d0e48
Story: 2003108
Task: 27700
Depends-On: I6b0b0c90456627bebde2b834b339bc968100b6f9
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>
This update modifies the patching script to make hbsAgent a
pmon rather than SM monitored process.
Story: 2003576
Task: 24907
Depends-On: https://review.openstack.org/#/c/617835
Change-Id: Ifad7d6d67f0334d175330b524a165f92ca1cf489
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Use the memcached-custom package, instead of platform-utils, to
install the service file into the system folder.
The basic deployment test and the service file status check pass.
Story: 2004108
Task: 27517
Depends-on: https://review.openstack.org/#/c/614085/
Change-Id: Ic66f077159be2f21caa6e8e68241aae65b9f2245
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
In Python 3, print is a function.
A print call with multiple strings in particular needs
print_function imported from __future__ to behave the same on Python 2.
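A minimal example of the pattern. The helper and the strings are invented for illustration.

```python
from __future__ import print_function  # no effect on Python 3


def format_status(name, state):
    """Return a short status string for a process."""
    return "{} is {}".format(name, state)


# With the import, the two arguments print space-separated on Python 2
# as well; without it, Python 2 would print them as a tuple.
print("status:", format_status("pmond", "running"))
```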
Story: 2002909
Task: 24560
Signed-off-by: zhangyangyang <zhangyangyang@unionpay.com>
Change-Id: Ie31eb59368af57776eb9785dba494432111cd250
Problem:
Build scripts can supply overriding info to build-info package
by placing data in file $MY_WORKSPACE/BUILD. This was not documented
as an input source in build-info's build_srpm.data. Hence no md5sum
is captured for that file, and no md5sum delta is seen to trigger
a rebuild.
Solution:
Capture the build dependency using OPT_DEP_LIST in build-info's
build_srpm.data.
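The rebuild trigger can be pictured with the sketch below, which hashes a list of input files and skips optional ones that are absent. It only illustrates the idea; the function names are hypothetical and the build tools may compute their md5sums differently.

```python
import hashlib


def inputs_digest(paths):
    """Return one md5 hex digest over the existing input files.

    Optional inputs that do not exist are skipped, so creating or
    editing one changes the digest.
    """
    md5 = hashlib.md5()
    for path in sorted(paths):
        try:
            with open(path, "rb") as handle:
                data = handle.read()
        except IOError:
            continue  # optional dependency is absent
        md5.update(path.encode("utf-8"))
        md5.update(data)
    return md5.hexdigest()


def needs_rebuild(paths, previous_digest):
    """Return True when the inputs no longer match the recorded digest."""
    return inputs_digest(paths) != previous_digest
```

An input that is not listed never reaches the digest, which is the failure described in the problem statement.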
Story: 2002835
Task: 22754
Change-Id: If9d39e28cc3824bfe6a593c4cc7f4153cba5c42d
Signed-off-by: Scott Little <scott.little@windriver.com>
This update adds hooks to the spec files for the following packages
to generate wheels for the python modules:
- ceph-manager
- libvirt-python
- logmgmt
- platform-util
- python-3parclient
- python-cephclient
- python-lefthandclient
- python-ryu
- vm-topology
Change-Id: Ia63291e686818d19d0df52ff26b5f0bb3812b8ce
Story: 2003907
Task: 26787
Signed-off-by: Don Penney <don.penney@windriver.com>
This code change updates the PLATFORM_RELEASE variable from 18.08 to 18.10
Story: 2003085
Task: 26743
Change-Id: Iaf9b4cca984155e6b4d1217bbebaf7f5694b9ae4
Signed-off-by: Paul-Emile Element <Paul-Emile.Element@windriver.com>
Commit 685baaed added text file /etc/motd.head with execute
permission.
Partial-Bug: 1790863
Change-Id: I1104a5d7320f5fc32801c1129df589ce32fdf656
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
The cron job for update-motd will run at the top of the hour and clobber
an original /etc/motd file provided by 'setup' RPM.
Consider that update-motd is intended to replace the motd provided by
setup, and provide that same content in the update-motd package. In
this way, the original content from setup RPM will be displayed until
the cron job for update-motd runs. After that, unless the user has
changed the motd, the same content will be presented.
Partial-Bug: 1790863
Change-Id: I2b66ce7475525920c52e129102fa859fb29e0eed
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>
Keep memcached package intact and customize memcached service file by
overwriting it from platform-util.
Story: 2002826
Task: 24548
Depends-On: https://review.openstack.org/600867
Change-Id: Ic18d7efc1ea5548dc6245c7e9658843bd8d557cf
Signed-off-by: Jack Ding <jack.ding@windriver.com>
Fix the following linter errors:
E010 The "do" should be on same line as for
E010 The "do" should be on same line as while
E011 Then keyword is not on same line as if or elif keyword
E020 Function declaration not in format ^function name {$
Ignore:
E041 Arithmetic expansion using $[ is deprecated for $((
E042 local declaration hides errors
E043 Arithmetic compound has inconsistent return semantics
E044 Use [[ for non-POSIX comparisions
Story: 2003366
Task: 24423
Change-Id: I8b6b72e702d3e89d1813772d6bf16819e28e818c
Signed-off-by: Martin Chen <haochuan.z.chen@intel.com>
Change-Id: I567a0b6c7117680ba0dbe9e4006051ce9c1649d0
Story: 2003085
Task: 23167
Signed-off-by: Jack Ding <jack.ding@windriver.com>
Signed-off-by: Scott Little <scott.little@windriver.com>
This update adds support for no-reboot patching of the influxdb and
collectd processes to the patch-restart-processes script used to manage
restart of specified processes.
Depends-On: https://review.openstack.org/579307
Change-Id: I5b03dbf9242c32ea765840ede127335d440235d9
Signed-off-by: Jack Ding <jack.ding@windriver.com>
Signed-off-by: Scott Little <scott.little@windriver.com>