This change patches the zmq GarbageCollector to use zmq.Context()
from eventlet.green instead of the default zmq.Context().
The sysinv-agent process was found to hang, blocked on the zmq
garbage collector's recv() call. Making the garbage collector use
the green Context solves the issue.
Test Plan:
PASS: Build package with build-pkgs -p pyzmq
PASS: Build ISO
PASS: Install on lab, configure ACC100, backup system
PASS: Reinstall and restore system, then host-unlock
Closes-Bug: 2060867
Change-Id: I229a8a4c70ebb4d7056fa2ff60bfc910bf12b257
Signed-off-by: Alyson Deives Pereira <alyson.deivespereira@windriver.com>
This commit changes how we differentiate Helm charts when uploading new
StarlingX applications. The method previously used was based on
comparing SHA256 digests, which was causing the helm-upload script to
mistakenly report charts with the same implementation as different
after rebuilding them with no changes.
The new implementation uses the diff tool to perform such comparison so
that charts with the same implementation are reported as equal
regardless of whether they were rebuilt.
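As an illustration of why digest comparison misreports rebuilt charts, the
following minimal Python sketch packs the same files twice with different
timestamps and compares the archives both ways. The helper names
(build_chart, chart_contents, same_digest, same_contents) are hypothetical
and are not part of the helm-upload script, which relies on the diff tool;
the sketch only demonstrates the principle under those assumptions.

```python
import hashlib
import io
import tarfile


def build_chart(files, mtime):
    """Pack {name: bytes} into an in-memory .tgz, stamping each entry with mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime  # build time leaks into the archive bytes
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def chart_contents(archive):
    """Return {member name: file bytes} for the regular files in a .tgz."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }


def same_digest(a, b):
    """Digest-based comparison: sensitive to archive metadata."""
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()


def same_contents(a, b):
    """Content-based comparison: ignores timestamps and packaging details."""
    return chart_contents(a) == chart_contents(b)
```

Building the same file set twice with different mtime values yields archives
for which same_digest() is false while same_contents() is true, which is the
situation a rebuild with no changes produces.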
In addition, two new parameters were added to the helm-upload script:
* 'check-only': check if charts are valid without uploading them to
the given repository.
* 'upload-only': upload charts to the given repository bypassing the
preliminary checks.
The new parameters aim for more flexibility when integrating with other
pieces of software such as sysinv.
Test Plan:
PASS: build-pkgs -a && build-image
PASS: AIO-SX fresh install.
PASS: Update platform-integ-apps containing a rebuilt version of
ceph-pools-audit with no changes.
Confirm that the app was successfully updated.
PASS: Update platform-integ-apps containing a rebuilt version of
ceph-pools-audit containing changes to values.yaml but keeping the
same version number.
Confirm that the app update failed.
PASS: Run helm-upload with the 'check-only' parameter and confirm that
no charts were uploaded.
PASS: Run helm-upload with the 'upload-only' parameter and confirm that
charts were correctly uploaded.
PASS: Run helm-upload without the new parameters and confirm that the
original behavior was preserved.
Partial-Bug: 2053074
Change-Id: I45f6482118f5ecf9da1b51f21fbaf0db63eb321c
Signed-off-by: Igor Soares <Igor.PiresSoares@windriver.com>
This change allows ts2phc to be configured to ignore timing updates that
have a large offset spike in order to mitigate the resulting timing
skew.
In some circumstances on realtime systems with high CPU load, the
timestamp consumed by ts2phc can be delayed in reaching it, which
results in the offset calculation attempting to speed the clock up by a
large margin.
This change causes ts2phc to ignore updates that would greatly skew the
clock when ts2phc is already in a synchronized state.
The global configuration option "max_phc_update_skip_cnt" is provided to
allow users to specify how many consecutive offset spike incidents will
be ignored before adjusting the clock. The default value is 120. The
behaviour can be disabled by setting max_phc_update_skip_cnt to 0.
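The skip logic described above can be sketched in Python as follows. This is
not the ported C code: the class name, the should_apply() method and the
spike_threshold_ns parameter are assumptions made for illustration (the
commit message does not state how a "large" offset is detected). Only the
option name max_phc_update_skip_cnt, its default of 120 and the meaning of 0
come from the description.

```python
class OffsetSpikeFilter:
    """Decide whether a measured offset should be applied to the clock."""

    def __init__(self, max_phc_update_skip_cnt=120, spike_threshold_ns=1000):
        self.max_skip = max_phc_update_skip_cnt
        # Hypothetical threshold: offsets beyond it count as a spike.
        self.spike_threshold_ns = spike_threshold_ns
        self.skipped = 0  # consecutive spikes ignored so far

    def should_apply(self, offset_ns, synchronized):
        # Spikes are only filtered once the clock is already synchronized.
        is_spike = synchronized and abs(offset_ns) > self.spike_threshold_ns
        if self.max_skip == 0 or not is_spike:
            # Feature disabled, or a normal update: apply and reset the count.
            self.skipped = 0
            return True
        if self.skipped < self.max_skip:
            # Ignore this spike and remember that we did.
            self.skipped += 1
            return False
        # Too many consecutive spikes: treat the offset as real and adjust.
        self.skipped = 0
        return True
```

With max_phc_update_skip_cnt set to 2, two consecutive spikes are ignored
and the third is applied; with 0, every update is applied.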
This code is ported from a proposed upstream patch found here:
https://sourceforge.net/p/linuxptp/mailman/message/44114092/
Test-plan:
Pass: Verify linuxptp package build
Pass: Deploy ts2phc binary and verify system time sync
Pass: Manually trigger offset spike and verify that ts2phc maintains
stable time sync
Closes-bug: https://bugs.launchpad.net/starlingx/+bug/2059955
Change-Id: I13cd5c3440682ec9256e11449fe62d5fe28f66fa
Signed-off-by: Cole Walker <cole.walker@windriver.com>
Update version of Trident Installer to 24.02.0 to keep compatibility
with version 1.29 of k8s. Supports k8s from 1.24 to 1.29.
Test Plan:
- PASS: Tested Trident 24.02.0 installation and communication with
NetApp simulator.
- PASS: Tested the Trident update from version 23.10.0 to 24.02.0,
upgrading tridentctl client version and rerunning the
ansible-playbook to update the server version.
Story: 2011080
Task: 49784
Change-Id: Iaaf673f00fbc28c50f0bdacdb5a644000626f765
Signed-off-by: Erickson Silva de Oliveira <Erickson.SilvadeOliveira@windriver.com>
This change updates the runc package from 1.1.7 to 1.1.12
and fixes the vulnerability CVE-2024-21626.
https://nvd.nist.gov/vuln/detail/CVE-2024-21626
Test Plan:
PASS: runc package builds successfully
PASS: Build ISO successful with multiple kubernetes versions
PASS: Verify correct runc version on deployed system,
dpkg-query -f '${Version}' -W runc
PASS: Performed the K8s version upgrade from 1.24.4 to 1.28.4
PASS: Verify platform cpu occupancy is normal using
collectd.log and occtop tool
Closes-bug: https://bugs.launchpad.net/starlingx/+bug/2052401
Change-Id: Ia34c4a1bcab777a9af80e2b045960895f2bed976
Signed-off-by: Ramesh Kumar Sivanandam <rameshkumar.sivanandam@windriver.com>
This changes the kubeadm UpgradeManifestTimeout from its 5-minute default
to 3 minutes to reduce unnecessary delay in retries during
kubeadm-upgrade-apply failures.
The typical control-plane upgrade of static pods takes 75 to 85 seconds,
so 3 minutes gives an adequate buffer to complete the operation.
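The arithmetic behind the buffer can be spelled out with a small Python
sketch. The constant and function names are hypothetical; the numbers are
the ones quoted above.

```python
OLD_TIMEOUT_S = 5 * 60          # previous kubeadm default
NEW_TIMEOUT_S = 3 * 60          # value set by this change
TYPICAL_UPGRADE_S = (75, 85)    # observed range for static pod upgrades


def headroom_seconds(timeout_s, slowest_upgrade_s=max(TYPICAL_UPGRADE_S)):
    """Time left over after the slowest typical upgrade completes."""
    return timeout_s - slowest_upgrade_s


def saved_per_timed_out_attempt_s():
    """Delay removed from each attempt that runs into the timeout."""
    return OLD_TIMEOUT_S - NEW_TIMEOUT_S
```

Under these figures the new timeout still leaves 95 seconds of headroom over
the slowest typical upgrade, while each attempt that times out returns 120
seconds sooner.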
TEST PLAN:
PASS: All Kubernetes packages build successfully from 1.24 to 1.28.
PASS: Perform k8s upgrade and verify kubeadm-upgrade-apply.log
shows the UpgradeManifestTimeout value as 3 minutes.
Partial-Bug: 2056326
Change-Id: Ief35c63dacc92af861525f03fa25ceb7b8253622
Signed-off-by: Ramesh Kumar Sivanandam <rameshkumar.sivanandam@windriver.com>
This change ports the following patches to kubernetes 1.29.2. Some
were refactored slightly to allow for upstream changes.
The following patches were applied cleanly:
kubelet-sort-isolcpus-allocation-when-SMT-enabled.patch
kubelet-cpumanager-infra-pods-use-system-reserved-CP.patch
Affinity-of-guaranteed-pod-to-non-isolated-CPUs.patch
kubelet-CFS-quota-throttling-for-non-integer-cpulimit.patch
The following patches were refactored:
kubeadm-create-platform-pods-with-zero-CPU-resources.patch
kubernetes-make-isolcpus-allocation-SMT-aware.patch
kubelet-cpumanager-disable-CFS-quota-throttling.patch
kubelet-cpumanager-keep-normal-containers-off-reserv.patch
kubelet-cpumanager-introduce-concept-of-isolated-CPU.patch
Test Plan:
PASS: Kubernetes package 1.29.2 builds properly.
PASS: Run all Kubelet, kubeadm, kubectl make tests for affected code.
Story: 2011047
Task: 49674
Change-Id: Ib24dc061a7da201650cc6550fd7bbed0aebe390c
Signed-off-by: Boovan Rajendran <boovan.rajendran@windriver.com>
tzdata expires every 6-12 months.
Update to the latest tzdata, valid until Dec 2024.
The new tzdata is supplied by upstream, so we no longer need
to build it ourselves. We just need to be sure it is included
in the iso.
Verification:
- tzdata is no longer built
- build-iso and make sure it contains the new package
- check the package to ensure it contains the
expected leap-seconds.list file
- boot the iso and ensure nothing unusual is observed
regarding the date
- run "export TZ=/usr/share/zoneinfo/EST5EDT" followed
by the date command and ensure that it displays the
correct time for that timezone
Partial-Bug: 2054466
Change-Id: I765dc225f9b9f23799af662cd87fe94703857241
Signed-off-by: Scott Little <scott.little@windriver.com>
This change updates kubernetes package from 1.29.1 to 1.29.2
and it uses golang-1.21.7.
Test Plan:
PASS: kubernetes-1.29.2 package builds successfully
PASS: All packages build successfully
PASS: Build ISO successful with multiple kubernetes versions
PASS: For pkg-versioning, add a dummy commit to subdirectory
of kubernetes-1.29.2. Built package kubernetes-1.29.2
and verified that package version was incremented by 1.
PASS: Install the ISO as AIO-SX and verify the K8s 1.29.2 staged
binaries are present in the path /usr/local/kubernetes/1.29.2
Story: 2011047
Task: 49654
Depends-On: https://review.opendev.org/c/starlingx/compile/+/910697
Change-Id: Ib463753fe82527d64d7b0e5605895b0ed2c48e49
Signed-off-by: Ramesh Kumar Sivanandam <rameshkumar.sivanandam@windriver.com>
This change pulls in an upstream linuxptp fix to initialize the tm_isdst
variable.
An uninitialized tm_isdst variable in ts2phc can result in mktime failing
and cause ts2phc to be unable to sync time with an "invalid master time
stamp" error.
The fault was intermittent, depending on the random value in the
uninitialized variable. If it was read as a positive integer, mktime would
fail and the symptom would occur.
The upstream commit id is:
63fc1ef4fd5e5fc45dd4de3bf27920bb109a4357
Test plan:
Pass: Verify package build
Pass: Deploy updated ts2phc binary and perform repeated service
start/stops. The fault was not reproduced after 20 attempts.
Closes-bug: https://bugs.launchpad.net/starlingx/+bug/2055464
Change-Id: I9fb1722c6ab93f6bb9ec6cdc4fbe902a823b3e2e
Signed-off-by: Cole Walker <cole.walker@windriver.com>
This commit updates the containernetworking-plugins and
bond-cni pkgs to use golang-1.18.
Test Plan:
- PASS: downloader
- PASS: build pkgs
- PASS: build image
- PASS: the plugins are present at /var/opt/cni/bin/
- PASS: test the plugins' functionality
Story: 2010878
Task: 49619
Change-Id: Ie8e0f01502e74cf2fb7a4b3ba88c37b69609c297
Signed-off-by: Mohammad Issa <mohammad.issa@windriver.com>
The scripts in ifupdown-0.8.36 and ifupdown-extra-0.32, as they are
distributed, don't work correctly for detecting duplicate IP addresses
and gateway reachability in the interfaces. Because of this, error
messages are thrown in daemon.log even if error conditions don't exist.
This commit fixes the detection logic and also improves the log logic,
so that messages carry useful and accurate information.
Test plan
Systems: AIO-SX IPv4, AIO-SX IPv6
Scenarios without error/warning conditions
------------------------------------------
For these scenarios, OAM is over a regular ethernet interface, gateway
is reachable and there are no duplicate IP addresses. Log messages
must reflect this.
[PASS] mgmt and cluster-host over same eth port, pxe unassigned
[PASS] mgmt and cluster-host over same bond port, pxe unassigned
[PASS] mgmt and cluster-host over same vlan port, pxe unassigned
[PASS] mgmt and cluster-host over same vlan port, pxe assigned to
base eth
[PASS] mgmt and cluster-host over different vlan ports, pxe assigned
to base bond
Scenarios with error/warning conditions
---------------------------------------
For these scenarios, error/warning messages must appear and reflect
the real conditions.
[PASS] Cable disconnected in ethernet interface
[PASS] Cable disconnected in bonding interface
[PASS] Duplicate address in ethernet interface
[PASS] Duplicate address in vlan interface
[PASS] Duplicate address in bonding interface
[PASS] Missing gateway in ethernet interface
[PASS] Missing gateway in vlan interface
[PASS] Missing gateway in bonding interface
Closes-Bug: #2052534
Change-Id: Ie9152eff51f21bdcb8693f554eb696d63e2bab34
Signed-off-by: Lucas Ratusznei Fonseca <lucas.ratuszneifonseca@windriver.com>
This update added ipsec-server service to systemd preset config
to enable it on controllers.
Test Plan (DX system):
PASS: Install and bootstrap controller-0, verify ipsec-server is
"enabled" and "vendor preset: enabled" after first reboot and
bootstrap.
Story: 2010940
Task: 49583
Depends-On: https://review.opendev.org/c/starlingx/metal/+/907348
Change-Id: I41d4fdb9f9adc857234981e04de1a5a4e8af8721
Signed-off-by: Leonardo Mendes <Leonardo.MendesSantana@windriver.com>
This adds the kubernetes 1.29.1 package for Debian, built
using golang-1.21.6.
The packaging files were taken from the previous version and modified
for 1.29.1.
Test Plan:
PASS: kubernetes-1.29.1 package builds successfully
PASS: All packages build successfully
PASS: Build ISO successful with multiple kubernetes versions
PASS: For pkg-versioning, add a dummy commit to subdirectory
of kubernetes-1.29.1. Built package kubernetes-1.29.1
and verified that package version was incremented by 1.
PASS: Install the ISO as AIO-SX and verify the K8s 1.29.1 staged
binaries are present in the path /usr/local/kubernetes/1.29.1
Story: 2011047
Task: 49591
Depends-On: https://review.opendev.org/c/starlingx/compile/+/909068
Change-Id: I97b4a3a25ca93a2b414a1600f3ba8bd0f16b1e8c
Signed-off-by: Ramesh Kumar Sivanandam <rameshkumar.sivanandam@windriver.com>
This change updates kubernetes patch
kubelet-cpumanager-introduce-concept-of-isolated-CPU.patch
for supported kubernetes versions from 1.24 to 1.28.
Currently, for static CPU allocation, pods are identified
as platform pods using a hard-coded list of namespaces.
The new method identifies a pod as a platform pod using a label
assigned to it or to its namespace.
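The label-based classification can be sketched in Python as below. The real
logic lives in the Go kubelet patch; the function name is hypothetical and
the label key and value shown in PLATFORM_LABEL are an assumption, since the
commit message does not name the label.

```python
# Assumed label; the commit message only says "the platform label".
PLATFORM_LABEL = ("app.starlingx.io/component", "platform")


def is_platform_pod(pod_labels, namespace_labels):
    """A pod is a platform pod if it or its namespace carries the label."""
    key, value = PLATFORM_LABEL
    return pod_labels.get(key) == value or namespace_labels.get(key) == value
```

The three cases exercised in the test plan map directly onto this check: a
labelled pod, an unlabelled pod in a labelled namespace, and an unlabelled
pod in an unlabelled namespace.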
Test Plan:
PASS: All affected versions of kubernetes package build successfully.
PASS: Create a pod with the platform label. Pod is classified as
a platform pod.
PASS: Create a pod without the platform label but in a namespace with
the platform label. Pod is classified as a platform pod.
PASS: Create a pod without the platform label and in a namespace
without the platform label. Pod is not classified as a platform
pod.
Depends-On: https://review.opendev.org/c/starlingx/config/+/907640
Depends-On: https://review.opendev.org/c/starlingx/ansible-playbooks/+/907641
Depends-On: https://review.opendev.org/c/starlingx/integ/+/908340
Depends-On: https://review.opendev.org/c/starlingx/integ/+/908958
Story: 2010612
Task: 47513
Change-Id: I654d466e51522b42a2e1d17a1828288089791b8f
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
This change covers kubernetes version 1.24.4, which was missed
in the following change:
https://review.opendev.org/c/starlingx/integ/+/908340
Test Plan:
PASS: Kubernetes 1.24.4 package builds successfully.
Story: 2010878
Task: 49546
Change-Id: Iff11cd4ee8239bed5875100b4499216e80e27386
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
This update added strongswan IPSec daemon (charon) to systemd
preset config to enable it on all types of systems.
Test Plan (DX system):
PASS: Install and bootstrap controller-0, verify IPSec service is
"enabled" and "vendor preset: enabled" after first reboot and
bootstrap.
PASS: Unlock controller-0, verify IPSec service is enabled and
"vendor preset: enabled" after unlock.
PASS: Install controller-1, verify IPSec service is enabled and
"vendor preset: enabled" after first reboot.
Story: 2010940
Task: 49482
Co-Authored-By: Andy Ning <andy.ning@windriver.com>
Change-Id: I2bc122f080e33b87fd1b6535d1817df2a9cb0b52
Signed-off-by: Leonardo Mendes <Leonardo.MendesSantana@windriver.com>
As we no longer have any users for this feature, we remove the patch
enable-support-for-kubernetes-to-ignore-isolcpus.patch from the repo.
Test Plan:
PASS: Each affected kubernetes version package builds successfully.
Story: 2010878
Task: 49546
Change-Id: Id21fe6cd029d4b3cd6e6bd920628dfcc4703f6b2
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
The following checks and enhancements are made in this commit
to handle the patching scenarios:
- Added a check to move encryption-provider.yaml from the
/etc/kubernetes directory to the luks volume if it is not
already present there.
- If encryption-provider.yaml is already present in the luks
volume and also exists in the /etc/kubernetes directory,
delete the encryption-provider.yaml file from the
/etc/kubernetes directory.
- Remove encryption-provider.yaml from
/opt/platform/config/${sftw_ver}/kubernetes
if it exists.
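The three checks above can be sketched in Python as follows. This is only an
illustration of the decision flow, not the luks-fs-mgr implementation; the
function name and its directory parameters are hypothetical so that the
logic can be exercised against temporary directories.

```python
import shutil
from pathlib import Path

FILENAME = "encryption-provider.yaml"


def relocate_provider_file(luks_dir, etc_dir, platform_dir):
    """Leave the provider file only in the luks volume directory."""
    luks_file = Path(luks_dir) / FILENAME
    etc_file = Path(etc_dir) / FILENAME
    platform_file = Path(platform_dir) / FILENAME

    if etc_file.exists():
        if luks_file.exists():
            # Already in the luks volume: drop the duplicate copy.
            etc_file.unlink()
        else:
            # Not yet in the luks volume: move it there.
            shutil.move(str(etc_file), str(luks_file))

    if platform_file.exists():
        # Remove the copy kept under the platform config directory.
        platform_file.unlink()

    return luks_file.exists()
```

Calling the function repeatedly is safe: once the file exists only in the
luks directory, further calls change nothing.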
Test Plan:
PASSED: build-pkgs -c -p luks-fs-mgr
PASSED: build-image
PASSED: AIO-SX patch testing: Verified that the
encryption-provider.yaml file is present only in
the luks volume. Luks service is up and running.
Story: 2010873
Task: 49533
Change-Id: If0891ed5b93f538953912e22afc940c6e4742800
Signed-off-by: Rahul Roshan Kachchap <rahulroshan.kachchap@windriver.com>
Storage and compute nodes go into the "degraded" state due to high
CPU usage by luks-fs-mgr. Currently the service keeps checking
the luks volume status and exits when the volume is inactive.
This is a redundant activity as the maintenance code already
checks volume status and raises the alarm.
This code change exits the main thread of the service on compute
and storage nodes after unsealing the volume.
Test Plan:
PASS: build-pkgs -c -p luks-fs-mgr
PASS: build-image
PASS: AIO-DX plus: verify if service stops after unsealing luks
volume on compute and storage nodes and there is no high
cpu usage alarm.
PASS: AIO-DX plus: verify if luks service continues running
on controller nodes.
Story: 2010872
Task: 49517
Change-Id: I7cb2cbf6761b429cb06e5b100e0bfdbfce43f94c
Signed-off-by: Jagatguru Prasad Mishra <jagatguruprasad.mishra@windriver.com>
The stx/nfv/mtce-guest service has been deprecated and is no longer
built as part of the nfv git.
https://opendev.org/starlingx/nfv/commit/bfded2ded62263695ec37fb6214eda7b191c1cbc
However, removal of the guestServer and guestAgent systemd presets
was missed.
Therefore, as a final cleanup effort for these deprecated
services, this update removes all references to both the
guestAgent and guestServer from StarlingX systemd-presets.
Test Plan:
PASS: Full clean Debian build
PASS: Debian ISO install Standard system with worker and storage
PASS: Verify guestServer and guestAgent service files are not packaged.
Related-Bug: 2051389
Change-Id: I4b0dfa1739f35b0ceab3b6b98a9b24eb53caa1a9
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
This change updates etcd version to 3.4.27.
The new etcd version does not generate package named 'etcd'.
Etcd server binary (/usr/bin/etcd) is packed in the package
'etcd-server'. So a patch is added to the etcd puppet module
to update the package name. Also, as we do not use /etc/etcd/etcd.yml,
another patch is added to remove its generation. Etcd 3.3.25 would
create a new user 'etcd'. As no processes or files require etcd user
context, it is removed in the new version. Etcd process and config
files are managed by puppet and are owned by the root user.
Depends-On: https://review.opendev.org/c/starlingx/integ/+/897091
Depends-On: https://review.opendev.org/c/starlingx/tools/+/897100
Depends-On: https://review.opendev.org/c/starlingx/stx-puppet/+/897099
Depends-On: https://review.opendev.org/c/starlingx/stx-puppet/+/898851
Test Plan:
PASS: All packages build and build image successful
PASS: AIO-SX, AIO-DX fresh install success with new etcd version.
PASS: CRUD operations on a test pod successful.
PASS: Lock/Unlock reboot succeeds. K8s cluster healthy after each
operation. Test pod persists upon lock/unlock and reboot.
PASS: AIO-SX platform upgrade successful. K8s cluster healthy after
platform upgrade.
Story: 2010878
Task: 48877
Change-Id: Ifb4d7d5c8f4d3dbf754f117db75408bff9181464
Signed-off-by: Kaustubh Dhokte <kaustubh.dhokte@windriver.com>
Revert-use-subpath-for-coredns-only-for-default-repo.patch
is removed because the change that updates the dns
imageRepository is now handled in the ansible-playbooks review
https://review.opendev.org/c/starlingx/ansible-playbooks/+/903499
Test Plan:
PASS: Kubernetes package 1.25.3, 1.26.1 and 1.27.5
builds properly.
PASS: Verify k8s upgrade from 1.24.4 to 1.25.3
Story: 2010878
Task: 49244
Change-Id: Ic5a825f88f625db10610cc7e19770a0a36b6aad4
Signed-off-by: Boovan Rajendran <boovan.rajendran@windriver.com>
The luks service creates a symbolic link to encryption-provider.yaml
at /etc/kubernetes from the luks volume. The symlink must be present
on controller nodes only.
This commit adds the code to create the symlink to
encryption-provider.yaml file based on the personality.
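A minimal Python sketch of personality-based symlink creation is shown below.
It is not the luks-fs-mgr code: the function name, its directory parameters
and the personality string "controller" are assumptions chosen for
illustration.

```python
from pathlib import Path

FILENAME = "encryption-provider.yaml"


def link_provider_file(personality, luks_dir, etc_dir):
    """Create the symlink on controllers only; return True if it was created."""
    if personality != "controller":
        # Compute and storage nodes must not get the symlink.
        return False

    target = Path(luks_dir) / FILENAME
    link = Path(etc_dir) / FILENAME
    if link.is_symlink() or link.exists():
        # Replace any stale entry so the link always points at the luks copy.
        link.unlink()
    link.symlink_to(target)
    return True
```

On a controller the link in the second directory resolves to the file in the
luks directory; for any other personality the function leaves the directory
untouched.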
Test Plan:
PASSED: build-pkgs -c -p luks-fs-mgr
PASSED: bootstrap
PASSED: symlinks are created at /etc/kubernetes/ for
controllers only and not for compute/storage
Story: 2010873
Task: 49438
Change-Id: I048e880ef97a17d745f20dd7d247df71cb53eae8
Signed-off-by: Rahul Roshan Kachchap <rahulroshan.kachchap@windriver.com>
Fixed the port id map in the Port Data Set event handling. The port id
is composed of the port number and node index after the HA implementation.
Code tidying. By definition, the port id and the port number are
different. An existing port number variable was renamed to
prevent misinterpretation.
Code tidying. The HA node state change processing was disabled
when the HA feature is not enabled.
Test plan:
PASS: Verify the phc2sys executable recognizes the port in the port
state change event, when -a configuration option is used
PASS: Verify the events in the HA scenario are being recognized
Story: 2010723
Task: 49405
Change-Id: Iea2b3c4e7d7dcd07ca2ad52bc4042f80282b1a9a
Signed-off-by: Andre Mauricio Zelak <andre.zelak@windriver.com>
The following enhancements and fixes are made in this commit:
- Added code for handling graceful exit of the service.
- Fixed code to remove segfault core-dump.
- Added return value for copyKubeProviderFile() function
so that service is exited in case of failure.
- Used inotifytools package to detect file change and
creation recursively.
- Fixed issue related to removal of luks mount path.
Test Plan:
PASSED: Successfully deployed ISO on AIO-DX
PASSED: Both the controllers are up and running
PASSED: No segfault or luks-fs-mgr service crash
is observed after deployment
PASSED: symlinks are created at /etc/kubernetes/ and
/opt/platform/config/23.09/kubernetes/ folders.
PASSED: All the files/directories created on the
/var/luks/stx/luks_fs/controller/ directory
on active controller are pushed onto the luks volume
on standby controller.
PASSED: Tested Push functionality from active to standby controller
by modifying a file inside a subdirectory on LUKS/controller.
PASSED: Standby controller is able to pull luks/controller
from the active controller. Verified on the Standard setup
using HOST-SWACT command.
PASSED: Removed the copy of encryption-provider.yaml file from
/opt/platform/config/<SW_VERSION>/kubernetes/
(To support patch installation)
PASSED: LUKS service comes up after unmounting and removal of LUKS
mount path.
Depends-On: https://review.opendev.org/c/starlingx/tools/+/904556
Depends-On: https://review.opendev.org/c/starlingx/root/+/904558
Story: 2010873
Task: 49375
Change-Id: I26e7f5c72baf2095bea4df4ef34bec22d0f93aed
Signed-off-by: Harshad sonde <harshad.sonde@windriver.com>
Fixed the behavior when HA is disabled, one interface has been
configured and the '-a' autoconfiguration option is enabled in a
phc2sys instance. The behavior before the HA feature was to ignore
the given interface. To keep compatibility with earlier
configurations, interfaces in the configuration file are
ignored if HA is disabled.
Test Plan: non HA
PASS: Verify behavior when HA is disabled and interface has been
configured.
PASS: Verify behavior when HA is omitted and interface has been
configured.
PASS: Verify behavior when HA is disabled and no interface has
been configured.
Test Plan: HA
PASS: Verify phc2sys exits with an error when HA is enabled and
one interface has been configured.
Test Plan: Build
PASS: Verify patch application and package build
Closes-bug: 2048085
Change-Id: Ia65c157cfd63b637bd3ae3d7e370407e82371305
Signed-off-by: Andre Mauricio Zelak <andre.zelak@windriver.com>