12 Commits

Author SHA1 Message Date
Daian Cardoso Sganderlla 940b39b77e Delay time for boot to detect mpath disks
Give the boot process enough time to detect mpath disks.
In some cases mpath detection was slow during boot,
which caused an error on installation.
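
The change above describes adding a delay. As an illustration only, and not taken from the change itself, a polling variant could wait until the device node appears or a timeout expires. The function name `wait_for_path` and the default timeout are hypothetical.

```shell
# Hypothetical helper (not from the commit): poll until PATH exists
# or TIMEOUT seconds have elapsed.
# Usage: wait_for_path PATH [TIMEOUT_SECONDS]
wait_for_path() {
    local path="$1"
    local timeout="${2:-30}"
    local waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # gave up: path never appeared
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A caller might use it as `wait_for_path /dev/mapper/mpatha 60 || echo "mpatha not detected"`.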

Story: 2010046
Task: 45489

Test Plan:

PASS: AIO-DX Executed on lab with mpath

Signed-off-by: Daian Cardoso Sganderlla <Daian.CardosoSganderlla@windriver.com>
Change-Id: I4b79b00ff2e3f4db309284bcf7251b57ef524a9c
2022-05-27 20:16:59 +00:00
Zuul be95984d5c Merge "kickstarts: add support for mpath device" 2022-05-24 14:23:09 +00:00
Jackie Huang da68897268 kickstarts: add support for mpath device
The device node in /dev/ and the device path in /dev/disk/by-path
cannot be used directly for mpath devices; use /dev/mapper/mpathN
and /dev/disk/by-id/dm-uuid-mpath-<WWID> instead.
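
A minimal sketch of how a boot_device or rootfs_device value might be mapped to a device path, assuming short names starting with "mpath" live under /dev/mapper and other short names live under /dev. The function name `normalize_device` is hypothetical and is not the code in the kickstarts.

```shell
# Hypothetical helper: turn a boot_device/rootfs_device value into a path.
# Accepts "mpatha", "/dev/mapper/mpatha", "sda" or "/dev/sda".
normalize_device() {
    local dev="$1"
    case "$dev" in
        "")      return 1 ;;                   # nothing specified
        /dev/*)  echo "$dev" ;;                # already a full path
        mpath*)  echo "/dev/mapper/$dev" ;;    # multipath short name
        *)       echo "/dev/$dev" ;;           # plain device node name
    esac
}
```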

Test Plan:

* with local disks
PASS: install Standard with default boot menu
PASS: install AIO with default boot menu
PASS: install AIO (lowlatency) with boot_device= rootfs_device=

* with mpath device
PASS: install AIO with boot_device= rootfs_device=
PASS: install AIO with boot_device=mpatha rootfs_device=mpatha
PASS: install Standard with boot_device=mpatha rootfs_device=mpatha
PASS: install AIO (lowlatency) with boot_device=mpatha rootfs_device=mpatha
PASS: install AIO with boot_device=mpathb rootfs_device=mpathb
PASS: install AIO with boot_device=/dev/mapper/mpatha rootfs_device=/dev/mapper/mpatha

Failure Path:

* with local disks
PASS: install AIO with boot_device=mpatha rootfs_device=mpatha

* with mpath device
PASS: install AIO with default boot menu

Story: 2010046
Task: 45419

Signed-off-by: Jackie Huang <jackie.huang@windriver.com>
Signed-off-by: Thiago Miranda <ThiagoOliveira.Miranda@windriver.com>
Change-Id: I60c5b2a7c7ca42d8e5e36ea517327c8a5431dde7
2022-05-21 21:20:02 +00:00
emacdona 1a6fd50e2c Support Redfish subcloud install with single packages fetch
This update modifies the miniboot kickstart so that rpm Packages,
repodata and patches are fetched from the system controller
only once during a Redfish subcloud install.

The existing packages mirror operation done in the %post phase
is moved to the %pre phase. However, since the in-service feed
dir does not exist at that time, the fetched content is
staged to /mnt/install/repo as

    /mnt/install/repo/Packages
    /mnt/install/repo/repodata
    /mnt/install/repo/patches

The %post phase fetch code that was moved is replaced with code
that positions the %pre fetched content to the local filesystem
here:

    # feed directory
    /var/www/pages/feed/rel-xx.xx/Packages
    /var/www/pages/feed/rel-xx.xx/repodata

    # updates directory
    /var/www/pages/updates/rel-xx.xx/Packages
    /var/www/pages/updates/rel-xx.xx/repodata

    # patching directory
    /opt/patching/packages
    /opt/patching/metadata

The miniboot kickstart is updated with a detailed description of
the various staging/prestaging install options along with logging
enhancements that show the source and destination flow of
packages, repodata and patches in the installation process.

Also modified the kickstart to ensure that Prestaged Content takes
precedence over Prestaged ISO, and that both take precedence over
Staged Install.

Moved the Prestaged Content check to the %pre stage so that a failed
check falls through to the next priority install option rather than
causing a belated install failure.
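
The priority order described above can be sketched as a simple cascade. This is an illustration under stated assumptions: the function name `select_install_source` and its "yes"/"no" arguments are hypothetical, standing in for whatever validity checks the kickstart actually performs.

```shell
# Hypothetical sketch of the install-source priority cascade.
# $1: "yes" if valid Prestaged Content was found
# $2: "yes" if a valid Prestaged ISO was found
select_install_source() {
    local have_content="$1"
    local have_iso="$2"
    if [ "$have_content" = "yes" ]; then
        echo "prestaged-content"    # priority 1
    elif [ "$have_iso" = "yes" ]; then
        echo "prestaged-iso"        # priority 2
    else
        echo "staged"               # priority 3: fetch from system controller
    fi
}
```

Running the checks in %pre means a failed higher-priority check simply falls through to the next branch.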

Test Plan

PASS: Verify single packages pull over subcloud controller-0 install
PASS: Verify with both UEFI and BIOS across various servers
PASS: Verify end-to-end subcloud installs (60+)

Packages

PASS: Verify install time is shorter
PASS: Compare feed directory before/after install
PASS: Compare package/repodata data transfer before/after install

Patch Handling

PASS: Compare patched content before/after installed subcloud
PASS: Verify patched updates are mirrored and auto applied
PASS: Verify patches can be applied and removed after install
PASS: Verify patch handling for both prestaged iso committed
      patches as well as non-committed patches.
PASS: Verify patch handling when installing subclouds from a system
      controller that has committed patches for all 3 install cases.

Failure Handling

PASS: Verify staging feed failure leads to installation
      failure; parity behavior
PASS: Verify handling of failed lighttpd during packages download
PASS: Verify report_post_failure change handling

Robustness

PASS: Verify install to wiped disk
PASS: Verify reinstall of same controller
PASS: Verify expected lighttpd traffic logs
PASS: Verify success path logs
PASS: Verify install case logs show the install source/destination
PASS: Verify failure path error handling and messages

Prestaging Cases:

PASS: Verify prestaged ISO end-to-end install (25+)
PASS: Verify Prestaged Content install is priority 1
PASS: Verify Prestaged ISO install is priority 2
PASS: Verify Staged install is priority 3
PASS: Verify cascade of priority handling due to errors
PASS: Verify patches fetch from SC for Prestaged content install
PASS: Verify patches fetch from SC for Prestaged ISO install
PASS: Verify Prestaged Content Install with and without valid
      and invalid prestaged images
PASS: Verify Prestaged ISO install with and without prestaged valid
      and invalid prestaged images
PASS: Verify Prestaged ISO check failure leads to Prestaged Content
      Install
PASS: Verify Prestaged ISO check failure leads to Staged Install
      with incomplete prestaged content
PASS: Verify Prestaged ISO check failure leads to Staged install
      if prestaged repodata or Packages dir is not present or empty

Change-Id: I54ff9f7570d77efbb860904094adc5b6b8a3bf8d
Closes-Bug: 1971157
Signed-off-by: emacdona <eric.macdonald@windriver.com>
2022-05-04 07:42:57 -04:00
Eric MacDonald e4d4fec506 Handle prestage partition create and cleanup when needed
This update ensures the 'Platform Backup' partition label
is assigned to a newly created partition so that it gets
mounted properly.

This update cleans the prestage directory prior to new
prestaging.

Test Plan:

PASS: Verify prestaging partition create if missing
PASS: Verify prestaging cleanup before staging new files
PASS: Verify prestaged failure handling with small disk.
PASS: Verify prestaging image install and login.
PASS: Verify back to back prestaging installs
PASS: Verify reboot recovery after prestaging install

Story: 2009291
Task: 43825
Change-Id: I8f51b9d91ba7fd32e722a37f8336397959ba6ed6
Depends-On: https://review.opendev.org/c/starlingx/metal/+/817779
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-11-12 13:59:04 -05:00
Eric MacDonald 728161d9a9 Add new prestaging kickstart bundle to support Local Install
This update introduces a new kickstart bundle and kickstart
files that are used to prestage a subcloud server for Local
Install.

The new kickstart bundle uses the 'controller' install group.

A new prestaging kickstart file is created to copy and check
prestaging content from the prestaging iso.

A new prestaging packaging kickstart file is created with
content to exclude unnecessary packages to speed up install
time. This drops the installed package count from 1180 to 473.
Further exclusions could be made given more test time.

Login is with sysadmin/sysadmin, with sudo support once logged in.

This update also modifies the Miniboot kickstart to allow
the iso image to be an arbitrary name.

Test Plan:

PASS: Verify prestaging kickstart function and logs
PASS: Verify graphical and console install
PASS: Verify login with sysadmin/sysadmin
PASS: Verify sudo access once logged in
PASS: Verify prestaging install with reduced package set
PASS: Verify prestaging failures can be debugged
PASS: Verify various failure handling and reporting cases
      - no iso install source
      - no prestaging dir in repo
      - no prestaging software version dir in repo
      - failure to mount local prestaging dir
      - iso md5sum check failure
PASS: Verify Miniboot support for arbitrary iso name
PASS: Verify end-to-end prestaging and local subcloud install

Change-Id: I5d3569867a8a1f4b3727076ec89e85bdf0a7a014
Co-authored-by: Scott Little <scott.little@windriver.com>
Story: 2009291
Task: 43802
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2021-11-12 10:43:26 -05:00
M. Vefa Bicakci 46bc928ce6 kickstart functions: Fix empty variable check
This commit fixes two empty variable checks in the get_disk_dev function
in the kickstart shell functions. The "[" helper's "-n" operator needs
an argument to be able to detect whether the argument is an empty string
or not. When $disk is an empty shell variable, then '[ -n $disk ]' will
return zero (i.e., true). The correct way to check for the empty string
is to double-quote the variable; i.e., '[ -n "$disk" ]'.
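
The difference between the two forms can be seen in a short standalone snippet (the variable names `unquoted` and `quoted` are only for illustration):

```shell
disk=""

# Unquoted: the empty variable expands to nothing, so the test becomes
# '[ -n ]'. With a single argument, '[' only checks that the argument
# ("-n") is a non-empty string, so the test succeeds.
if [ -n $disk ]; then unquoted="non-empty"; else unquoted="empty"; fi

# Quoted: '[' receives "-n" and an empty string, so the test fails.
if [ -n "$disk" ]; then quoted="non-empty"; else quoted="empty"; fi

echo "unquoted check says: $unquoted"   # prints "non-empty" (wrong)
echo "quoted check says:   $quoted"     # prints "empty" (correct)
```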

The impact of this bug was observed with the v5.10 kernel, because the
v5.10 kernel uses the device node "sda" for USB storage in some
configurations, whereas the legacy v3.10 kernel always made "sda" the
first non-USB (i.e., internal) storage unit. A USB-based "sda" would
result in an empty $disk value, which would incorrectly be detected as a
non-empty variable due to the bug fixed by this commit. This in turn
would result in an installation failure due to the inability to find the
first non-USB storage unit on the system. For future reference, the
error messages were as follows:

"""
There was an error running the kickstart script at line 429. This is a
fatal error and installation will be aborted. The details of this error
are:
2021-10-09 08:12:45.716 - ISO_DEV=''.
2021-10-09 08:12:45.716 - USB_DEV=''.
readlink: missing operand
Try 'readlink --help' for more information.
readlink: missing operand
Try 'readlink --help' for more information.
2021-10-09 08:12:45.735 - Found rootfs on: ->.
readlink: missing operand
Try 'readlink --help' for more information.
readlink: missing operand
Try 'readlink --help' for more information.
2021-10-09 08:12:45.747 - Found boot on: ->.

Installation failed.

ERROR: Specified installation () or boot () device is invalid.

Pane is dead
"""

Verification:
- Installation of an ISO prepared with this patch and a v5.10 kernel
  works as expected on a system with a USB drive attached.

Closes-Bug: #1950163

Change-Id: Iee6a6bfb79b80df68f888f2eb498ec2cdbae6147
Signed-off-by: M. Vefa Bicakci <vefa.bicakci@windriver.com>
2021-11-08 09:15:32 -05:00
Ovidiu Poncea 7a0a2dac1a Fix issues with controller node Anaconda hang
On some deployments install fails as we keep one FD open
during install. This leads to hangs when Anaconda
'post' stage returns.

On other deployments install fails as udev sometimes creates
multiple links to the same devices in /dev/disk/by-path.
We iterate through this list and, because they are not unique,
we try to run flocks multiple times for the same device.
Locking a device multiple times doesn't work: the second
flock waits for the first lock to be released.

This commit:
 o removes 'exec {stdout}>&1' from ks-functions.sh so it no
   longer opens FDs in 'post' stage. For the pre stage we open
   it only when needed;
 o makes sure that list of storage devices is unique;
 o increases timeout of udevadm settle from its default of 180s
   to 300s, the value used throughout Anaconda. This helps
   with slower hardware.

Closes-Bug: 1889427
Change-Id: I348f10d96a78ea2c1c25fe6cf48462b0bc31fb84
Signed-off-by: Ovidiu Poncea <ovidiu.poncea@windriver.com>
2020-07-30 14:20:33 +00:00
Ovidiu Poncea 9b5148e3b5 Harden kickstarts as udev behavior can lead to random failures
Whenever a dev node that is not in use is opened with open(O_RDWR),
udev triggers a flush in devtmpfs that briefly removes and recreates all
the nodes for partitions on that device. This causes commands that
access dev nodes during the flush to fail. In our case blkid and
lsblk failed.

These failures are hard to reproduce, have a devastating effect on
the partitioning operations, and are not solved by using 'udevadm settle'
because some of the kernel events are asynchronous.

So, mainly, this commit stops udev from interfering with /dev nodes by
opening file descriptors for all storage devices and then taking
locks on them with flock. Holding the locks stops udev from triggering
a kernel partition rescan.

Locks are set at the start of the partitioning operation and
released at the end.
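
A minimal sketch of the locking pattern, assuming the util-linux `flock` command is available. The function name `with_device_lock` and the use of file descriptor 9 are hypothetical choices for illustration; the commit itself keeps the locks open across the whole partitioning operation rather than wrapping a single command.

```shell
# Hypothetical helper: run a command while holding an exclusive flock
# on the given device (or file). The lock is released automatically
# when the subshell exits and FD 9 is closed.
with_device_lock() {
    local dev="$1"
    shift
    (
        flock -x 9 || exit 1    # take the exclusive lock on FD 9
        "$@"                    # run the protected command
    ) 9<"$dev"
}
```

For example, `with_device_lock /dev/sda parted -s /dev/sda print` would hold the lock for the duration of the parted call.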

For more details and similar cases see:
 o 02ba8fb335
 o http://tracker.ceph.com/issues/14080
 o http://tracker.ceph.com/issues/15176

This commit:
 o stops udev from interfering with /dev nodes;
 o aborts install on critical failures;
 o adds retry for critical operations such as LVM cleanup or
   partition removal and creation.

Closes-Bug: 1888938
Change-Id: Iaaaaaae973ee36f2c4bfd42c327e8c6278d59303
Signed-off-by: Ovidiu Poncea <ovidiu.poncea@windriver.com>
2020-07-28 16:14:28 +00:00
Saul Wold f11e52b000 Select disk via kickstart
If boot_device and/or rootfs_device are not set on the kernel
cmdline, detect them early in the kickstart script. This helps
resolve which disk type is to be used.

Remove the older code for disk detection.
Tested with both sda and nvme disk types.
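
Checking whether a parameter was given on the kernel command line could look like the following sketch. The function name `get_cmdline_arg` is hypothetical, and the command line is passed in as a string so the helper can be exercised without reading /proc/cmdline.

```shell
# Hypothetical helper: print the value of KEY=VALUE from a kernel
# command line string; return 1 if KEY is not present.
get_cmdline_arg() {
    local key="$1"
    local cmdline="$2"
    local arg
    for arg in $cmdline; do    # intentional word splitting on whitespace
        case "$arg" in
            "$key"=*)
                echo "${arg#*=}"    # strip everything up to the first '='
                return 0
                ;;
        esac
    done
    return 1
}
```

A kickstart might call it as `boot_device=$(get_cmdline_arg boot_device "$(cat /proc/cmdline)")` and fall back to detection when it returns non-zero.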

Story: 2007486
Task: 39204
Change-Id: I4fa3b44a4e656e280820ceeefafaf127cb048df6
Signed-off-by: Saul Wold <sgw@linux.intel.com>
2020-04-02 17:23:02 -07:00
Tao Liu 5a44b5be49 Configurable Host HTTP/HTTPS Port Binding
Update pxeboot-update script to accept parameter for
installer base URL

Add a common function to parse the port number from
inst.repo
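
As an illustration of what parsing a port from an inst.repo URL involves, here is a minimal sketch. The function name `parse_url_port` and the example URLs are hypothetical; this is not the common function added by the change.

```shell
# Hypothetical helper: print the explicit port in a URL such as
# http://host:8080/path, or an empty line if no port is given.
parse_url_port() {
    local url="$1"
    local hostport port
    hostport="${url#*://}"        # drop the scheme
    hostport="${hostport%%/*}"    # drop the path
    case "$hostport" in
        \[*\]:*) port="${hostport##*\]:}" ;;   # [ipv6]:port
        \[*\])   port="" ;;                     # [ipv6] without a port
        *:*)     port="${hostport##*:}" ;;      # host:port
        *)       port="" ;;                     # host without a port
    esac
    echo "$port"
}
```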

Update pxeboot and kickstart URLs to support a configurable
HTTP port

Story: 2004642
Task: 28593
Depends-On: https://review.openstack.org/#/c/634237/

Change-Id: Ibd66e89e49794ca57b938eb43d227860eda6674a
Signed-off-by: Tao Liu <tao.liu@windriver.com>
2019-02-06 16:04:07 -06:00
Dean Troyer 18922761a6 StarlingX open source release updates
Signed-off-by: Dean Troyer <dtroyer@gmail.com>
2018-05-31 07:36:43 -07:00