Improved Subcloud Deployment / Upgrading Error Reporting (GUI & CLI)

Add new items with example to sections.
Add new example figure.
Fix numbering.
Resize figures to fit the page.
Update log files name.
Add item 4 in include file.
RS> Conditionalize 'dcmanager subcloud list' states for
    various run scenarios
RS> Move step from .. only:: block to .. include::
RS> Pre-process included blocks in non-block context
RS> Remove debug message from script
EG> Added example
EG> Fixed caution box

Story: 2010271
Task: 46946

Signed-off-by: Elisamara Aoki Goncalves <elisamaraaoki.goncalves@windriver.com>
Change-Id: I3791a23fefba8d9d428ae208b3bfcb9274d49278
Elisamara Aoki Goncalves 2022-11-24 09:31:02 -03:00
parent 0cc3212b79
commit 003d5d4439
10 changed files with 366 additions and 109 deletions

dirtyCheck.sh (new executable file, 22 lines)

@ -0,0 +1,22 @@
#!/bin/bash
RED='\033[0;31m'
NC='\033[0m' # No Color
declare -a dirtyFiles
dirtyFiles=( $(git status --porcelain doc/source 2>/dev/null) )
echo "Checking status of doc/source"
if [ ${#dirtyFiles[@]} -ne 0 ]; then
echo -e "${RED}Repo is dirty. Please stash, add or manually delete the following files:${NC}"
for file in ${dirtyFiles[@]};
do
if [[ ${file} == "??" ]]; then continue; fi
echo -e "${RED}$file${NC}"
done
exit 1
else
echo "... OK"
fi
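
The check above splits the ``git status --porcelain`` output on whitespace, so status codes and paths end up as separate array elements. A minimal sketch of a line-wise alternative follows. ``list_dirty_paths`` is a hypothetical helper, not part of this commit. It assumes the porcelain v1 layout of two status characters, one space, then the path, and it reports every entry, including untracked ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): read `git status --porcelain`
# lines from stdin and print only the path portion, one entry per line.
list_dirty_paths() {
    local line
    while IFS= read -r line; do
        # Skip empty lines so a clean tree prints nothing.
        [[ -z $line ]] && continue
        # Assumed porcelain v1 layout: "XY <path>" (two status chars, one space).
        printf '%s\n' "${line:3}"
    done
}
```

A possible use is ``git status --porcelain doc/source | list_dirty_paths``. Note that git may quote paths containing unusual characters; the sketch prints them as git emits them.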


@ -15,3 +15,49 @@
.. begin-syslimit
.. end-syslimit
.. begin-deploying-state
.. end-deploying-state
.. begin-add-the-subcloud-using-dcmanager
#. Add the subcloud using dcmanager.
When calling the :command:`subcloud add` command, specify the install
values, the bootstrap values and the subcloud's sysadmin password.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud add \
--bootstrap-address <oam_ip_address_of_subclouds_controller-0> \
--bootstrap-values /home/sysadmin/subcloud1-bootstrap-values.yaml \
--sysadmin-password <sysadmin_password> \
--install-values /home/sysadmin/install-values.yaml \
--bmc-password <bmc_password>
If the ``--sysadmin-password`` is not specified, you are prompted to
enter it once the full command is invoked. The password is masked
when it is entered.
.. code-block:: none
Enter the sysadmin password for the subcloud:
(Optional) The ``--bmc-password <password>`` is used for subcloud
installation, and is only required if the ``--install-values`` parameter is
specified.
If the ``--bmc-password <password>`` is omitted and the
``--install-values`` option is specified, the system administrator will be
prompted to enter it, following the :command:`dcmanager subcloud add`
command. This option is ignored if the ``--install-values`` option is not
specified. The password is masked when it is entered.
.. code-block:: none
Enter the bmc password for the subcloud:
The :command:`dcmanager subcloud show` or :command:`dcmanager subcloud list`
command can be used to view subcloud add progress.
.. end-add-the-subcloud-using-dcmanager

Binary file not shown (image, 456 KiB).


@ -1,4 +1,5 @@
.. vbb1579292724479
.. _installing-a-subcloud-using-redfish-platform-management-service:
@ -285,18 +286,15 @@ command with the ``install-values.yaml`` file containing the desired
:start-after: begin-subcloud-1
:end-before: end-subcloud-1
.. only:: partner
.. include:: /_includes/installing-a-subcloud-using-redfish-platform-management-service.rest
:start-after: begin-prepare-files-to-copy-deployment-config
:end-before: end-prepare-files-to-copy-deployment-config
.. only:: starlingx
4. Add the subcloud using dcmanager.
#. Add the subcloud using dcmanager.
When calling the :command:`subcloud add` command, specify the install
values, the bootstrap values and the subclouds sysadmin password.
values, the bootstrap values and the subcloud's sysadmin password.
.. code-block:: none
@ -333,42 +331,17 @@ command with the ``install-values.yaml`` file containing the desired
command can be used to view subcloud add progress.
#. At the Central Cloud / System Controller, monitor the progress of the
subcloud install, bootstrapping, and deployment by using the deploy status
field of the :command:`dcmanager subcloud list` command.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud list
+----+-----------+------------+--------------+---------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+---------------+---------+
| 1 | subcloud1 | unmanaged | online | installing | unknown |
+----+-----------+------------+--------------+---------------+---------+
The **deploy status** field has the following values:
**Pre-Install**
This status indicates that the ISO for the subcloud is being updated by
the Central Cloud with the boot menu parameters, and kickstart
configuration as specified in the ``install-values.yaml`` file.
**Installing**
This status indicates that the subcloud's ISO is being installed from
the Central Cloud to the subcloud using the Redfish Virtual Media
service on the subcloud's |BMC|.
**Bootstrapping**
This status indicates that the Ansible bootstrap of |prod-long|
software on the subcloud's controller-0 is in progress.
**Complete**
This status indicates that subcloud deployment is complete.
The subcloud install, bootstrapping and deployment can take up to 30
minutes.
.. include:: /shared/_includes/installing-a-subcloud.rest
:start-after: begin-monitor-progress
:end-before: end-monitor-progress
.. caution::
If there is an installation failure, or a failure during bootstrapping,
you must delete the subcloud before re-adding it, using the
:command:`dcmanager subcloud add` command. For more information on
@ -380,24 +353,40 @@ command with the ``install-values.yaml`` file containing the desired
more information, see :ref:`Managing Subclouds Using the CLI
<managing-subclouds-using-the-cli>`.
#. You can also monitor detailed logging of the subcloud installation,
bootstrapping and deployment by monitoring the following log files on the
active controller in the Central Cloud.
``/var/log/dcmanager/ansible/<subcloud_name>_install.log``
``/var/log/dcmanager/ansible/<subcloud_name>_bootstrap.log``
#. If ``deploy_status`` shows an installation, bootstrap or deployment failure
state, use the ``dcmanager subcloud errors`` command to get more detailed
information about the failure.
For example:
.. code-block:: none
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud1_install.log
TASK [wait_for] ****************************************************************
ok: [subcloud1]
[sysadmin@controller-0 ~(keystone_admin)]$ dcmanager subcloud errors 1
FAILED bootstrapping playbook of (subcloud1).
detail: fatal: [subcloud1]: FAILED! => changed=true
failed_when_result: true
msg: non-zero return code
500 Server Error: Internal Server Error ("manifest unknown: manifest unknown")
Image download failed: admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06 500 Server Error: Internal Server Error ("Get https://admin-2.cumulus .mss.com: 30093/v2/: dial tcp: lookup admin-2.cumulus.mss.com on 10.41.0.1:53: read udp 10.41.1.3:40251->10.41.0.1:53: i/o timeout")
Image download failed: gcd.io/kubebuilder/kube-rdac-proxy:v0.11.0 500 Server Error: Internal Server Error ("Get https://gcd.io/v2/: dial tcp: lookup gcd.io on 10.41.0.1:53: read udp 10.41.1.3:52485->10.41.0.1:53: i/o timeout")
raise Exception("Failed to download images %s" % failed_downloads)
Exception: Failed to download images ["admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06", "gcd.io kubebuilder/kube-rdac-proxy:v0.11.0"]
FAILED TASK: TASK [common/push-docker-images Download images and push to local registry] Wednesday 12 October 2022 12:27:31 +0000 (0:00:00.042)
0:16:34.495
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud1_bootstrap.log
#. You can also monitor detailed logging of the subcloud installation,
bootstrapping and deployment by monitoring the following log files on the
active controller in the Central Cloud.
``/var/log/dcmanager/ansible/<subcloud_name>_playbook.output.log``
For example:
.. code-block:: none
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud1_playbook.output.log
k8s.gcr.io: {password: secret, url: null}
quay.io: {password: secret, url: null}
)


@ -1,5 +1,8 @@
.. pja1558616715987
|hideable|
.. _installing-a-subcloud-without-redfish-platform-management-service:
==============================================================
@ -236,53 +239,30 @@ subcloud, the subcloud installation process has two phases:
~(keystone_admin)]$ system certificate-install -m docker_registry path_to_cert
.. include:: /_includes/installing-a-subcloud-without-redfish-platform-management-service.rest
:start-after: begin-prepare-files-to-copy-deployment-config
:end-before: end-prepare-files-to-copy-deployment-config
9. At the Central Cloud / System Controller, monitor the progress of the
#. At the Central Cloud / System Controller, monitor the progress of the
subcloud bootstrapping and deployment by using the deploy status field of
the :command:`dcmanager subcloud list` command.
.. code-block:: none
.. include:: /shared/_includes/installing-a-subcloud.rest
:start-after: begin-monitor-progress
:end-before: end-monitor-progress
~(keystone_admin)]$ dcmanager subcloud list
+----+-----------+------------+--------------+---------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+---------------+---------+
| 1 | subcloud1 | unmanaged | online | complete | unknown |
+----+-----------+------------+--------------+---------------+---------+
The deploy status field has the following values:
**Bootstrapping**
This status indicates that the Ansible bootstrap of |prod-long|
Platform software on the subcloud's controller-0 is in progress.
**Complete**
This status indicates that subcloud deployment is complete.
The subcloud bootstrapping and deployment can take up to 30 minutes.
.. caution::
If there is a failure during bootstrapping, you must delete the
subcloud before re-adding it, using the :command:`dcmanager subcloud
add` command. For more information on deleting, managing or unmanaging
a subcloud, see :ref:`Managing Subclouds Using the CLI
<managing-subclouds-using-the-cli>`.
10. You can also monitor detailed logging of the subcloud bootstrapping and
#. You can also monitor detailed logging of the subcloud bootstrapping and
deployment by monitoring the following log files on the active controller
in the Central Cloud.
``/var/log/dcmanager/ansible/<subcloud\_name>\_bootstrap.log``
/var/log/dcmanager/ansible/<subcloud\_name>\_playbook.output.log
For example:
.. code-block:: none
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud1_bootstrap.log
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud1_playbook.output.log
k8s.gcr.io: {password: secret, url: null}
quay.io: {password: secret, url: null}
)
@ -321,7 +301,5 @@ subcloud, the subcloud installation process has two phases:
- For more information on bootstrapping and deploying, see the procedure
`Install a subcloud
<https://docs.starlingx.io/deploy_install_guides/r5_release/distributed_cloud/index.html#install-a-subcloud>`__,
<https://docs.starlingx.io/deploy_install_guides/r7_release/distributed_cloud/index.html#install-a-subcloud>`__,
step 4.


@ -166,4 +166,22 @@ fails, delete subclouds, and monitor or change the managed status of subclouds.
You must reinstall a deleted subcloud before re-adding it.
- To show detailed information about subcloud ``install/bootstrap/deploy``
failures, use the :command:`subcloud errors <subcloud-name>` command.
For example:
.. code-block:: none
[sysadmin@controller-0 ~(keystone_admin)]$ dcmanager subcloud errors 1
FAILED bootstrapping playbook of (subcloud1).
detail: fatal: [subcloud1]: FAILED! => changed=true
failed_when_result: true
msg: non-zero return code
500 Server Error: Internal Server Error ("manifest unknown: manifest unknown")
Image download failed: admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06 500 Server Error: Internal Server Error ("Get https://admin-2.cumulus .mss.com: 30093/v2/: dial tcp: lookup admin-2.cumulus.mss.com on 10.41.0.1:53: read udp 10.41.1.3:40251->10.41.0.1:53: i/o timeout")
Image download failed: gcd.io/kubebuilder/kube-rdac-proxy:v0.11.0 500 Server Error: Internal Server Error ("Get https://gcd.io/v2/: dial tcp: lookup gcd.io on 10.41.0.1:53: read udp 10.41.1.3:52485->10.41.0.1:53: i/o timeout")
raise Exception("Failed to download images %s" % failed_downloads)
Exception: Failed to download images ["admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06", "gcd.io kubebuilder/kube-rdac-proxy:v0.11.0"]
FAILED TASK: TASK [common/push-docker-images Download images and push to local registry] Wednesday 12 October 2022 12:27:31 +0000 (0:00:00.042)
0:16:34.495


@ -14,6 +14,7 @@ subclouds from the System Controller.
- To list subclouds, select **Distributed Cloud Admin** \> **Cloud Overview**.
.. image:: figures/cloud-overview.png
:width: 800
You can perform full-text searches or filter by column using the search-bar
@ -30,6 +31,7 @@ subclouds from the System Controller.
Group name.
.. image:: figures/cloud-overview-edit-subcloud.png
:width: 800
- Confirm changes and check the new assignment in the Subcloud summary.
@ -44,3 +46,12 @@ subclouds from the System Controller.
Web interface for that subcloud. To switch back to the System Controller,
use the subcloud or region selection menu at the top left of the Horizon
window.
- To show detailed information about subcloud ``install/bootstrap/deploy``
failures, select **Distributed Cloud Admin** > **Cloud Overview**, then
click the drop-down arrow for the subcloud. The subcloud error information
is shown at the end of the expanded entry.
.. figure:: ./figures/bootrap_failed_regis_horiz.png
:width: 800


@ -0,0 +1,68 @@
.. begin-monitor-progress
For example:
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud list
+----+-----------+------------+--------------+---------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+---------------+---------+
| 1 | subcloud1 | unmanaged | online | complete | unknown |
+----+-----------+------------+--------------+---------------+---------+
If ``deploy_status`` shows an installation, bootstrap or deployment failure
state, use the :command:`dcmanager subcloud errors` command to get more
detailed information about the failure.
For example:
.. code-block:: none
sysadmin@controller-0 ~(keystone_admin)]$ dcmanager subcloud errors 1
FAILED bootstrapping playbook of (subcloud1).
detail: fatal: [subcloud1]: FAILED! => changed=true
failed_when_result: true
msg: non-zero return code
500 Server Error: Internal Server Error ("manifest unknown: manifest unknown")
Image download failed: admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06 500 Server Error: Internal Server Error ("Get https://admin-2.cumulus .mss.com: 30093/v2/: dial tcp: lookup admin-2.cumulus.mss.com on 10.41.0.1:53: read udp 10.41.1.3:40251->10.41.0.1:53: i/o timeout")
Image download failed: gcd.io/kubebuilder/kube-rdac-proxy:v0.11.0 500 Server Error: Internal Server Error ("Get https://gcd.io/v2/: dial tcp: lookup gcd.io on 10.41.0.1:53: read udp 10.41.1.3:52485->10.41.0.1:53: i/o timeout")
raise Exception("Failed to download images %s" % failed_downloads)
Exception: Failed to download images ["admin-2.cumulus.mss.com: 30093/wind-river/cloud-platform-deployment-manager: WRCP_22.06", "gcd.io kubebuilder/kube-rdac-proxy:v0.11.0"]
FAILED TASK: TASK [common/push-docker-images Download images and push to local registry] Wednesday 12 October 2022 12:27:31 +0000 (0:00:00.042)
0:16:34.495
The **deploy status** field has the following values:
.. container:: hideable
``Pre-Install``
This status indicates that the ISO for the subcloud is being updated by
the Central Cloud with the boot menu parameters, and kickstart
configuration as specified in the ``install-values.yaml`` file.
``Installing``
This status indicates that the subcloud's ISO is being installed from
the Central Cloud to the subcloud using the Redfish Virtual Media
service on the subcloud's |BMC|.
.. container::
``Bootstrapping``
This status indicates that the Ansible bootstrap of |prod-long|
software on the subcloud's controller-0 is in progress.
``Complete``
This status indicates that subcloud deployment is complete.
.. include:: /_includes/installing-a-subcloud-using-redfish-platform-management-service.rest
:start-after: begin-deploying-state
:end-before: end-deploying-state
The subcloud bootstrapping and deployment can take up to 30 minutes.
.. end-monitor-progress
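
The monitoring step in this include can also be scripted. The following is a minimal sketch, not part of dcmanager: ``deploy_status_of`` is a hypothetical helper that reads the table printed by :command:`dcmanager subcloud list` from stdin and prints the ``deploy status`` cell for one subcloud. It assumes the table layout shown in the example output (pipe-separated cells with ``name`` and ``deploy status`` header columns) and locates the columns by header text rather than by position.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "deploy status" cell of one subcloud.
# Input: the table from `dcmanager subcloud list` on stdin (layout assumed
# to match the example output in this document).
deploy_status_of() {
    local name=$1
    awk -F'|' -v name="$name" '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Until the header row is seen, look for the two column titles.
        !col_status {
            for (i = 1; i <= NF; i++) {
                if (trim($i) == "name") col_name = i
                if (trim($i) == "deploy status") col_status = i
            }
            next
        }
        # Data rows: print the status of the requested subcloud only.
        col_name && trim($col_name) == name { print trim($col_status) }
    '
}
```

A possible use is ``dcmanager subcloud list | deploy_status_of subcloud1``. A wrapper could then call :command:`dcmanager subcloud errors` when the returned value indicates a failure state; the exact names of those states are not listed here, so any pattern used to detect them is an assumption.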

normalize-includes.sh (new executable file, 122 lines)

@ -0,0 +1,122 @@
#!/usr/bin/env bash
directive='pre-include::'
d_begin=':start-after:'
d_end=':end-before:'
inc_base='doc/source'
RED='\033[0;31m'
NC='\033[0m'
message () { echo -e "$@" 1>&2; }
error () { message $RED$@$NC; }
OIFS=$IFS; IFS=$'\n'
parents=( $(grep -Rl '.. pre-include:: ' --exclude-dir=docs/build --include="*.r*st" --exclude-dir='.*' doc/*) )
IFS=$OIFS
check_file_deps () {
for filereq in $@
do
if [ ! -f "${filereq}" ] && [ ! -L "${filereq}" ]; then error "${filereq} not found. Quitting."; exit 1; fi
done
}
check_util_deps () {
for dep in $@
do
if ! hash $dep 2>/dev/null; then
error >&2 "... $dep dependency not met. Please install."
exit 1
fi
done
}
get_substr () {
local _str=${1//\//\\/}
local _drop=$2
local _regex="$_drop\s+(.*)\s*$"
message "${_str} =~ $_regex"
if [[ "${_str}" =~ $_regex ]]
then
message "Found ${BASH_REMATCH[1]}"
echo "${BASH_REMATCH[1]}"
else
echo ""
fi
}
trimspaces () {
local _s=$1
_s=${_s##*( )}
_s=${_s%%*( )}
echo $_s
}
get_inc_path () {
local _ppath=$1
local _inc=$2
if [[ ${_inc::1} == "/" ]]; then
echo "$inc_base$_inc"
else
echo "$(dirname $_ppath)/$_inc"
fi
}
get_include_content () {
local _inc_file=$1
local _inc_start=$2
local _inc_end=$3
local _content
check_file_deps $_inc_file
if [[ $_inc_start != "" ]] && [[ $_inc_end != "" ]]; then
_content=$(awk "/.. $_inc_start/{flag=1; next} /.. $_inc_end/{flag=0} flag" "$_inc_file" | sed -r "s/^\s*\.\. $_inc_end\s*$//g")
# _content=$(sed -n '/\.\. $_inc_start/,/\.\. $_inc_end/{p;/\.\. $_inc_end/q}' $_inc_file)
elif [[ $_inc_start == "" ]] && [[ $_inc_end == "" ]]; then
_content=$(<$_inc_file)
else
error "Expected both $d_begin and $d_end, or neither, for $_inc_file"
fi
echo "$_content"
}
## Run
for _f in "${parents[@]}"; do
readarray -t _content < "$_f"
for i in "${!_content[@]}"; do
if [[ ${_content[i]} =~ $directive ]]; then
_inc_file=$(trimspaces $(echo ${_content[i]} | sed -r 's|\s*\.\. pre-include::\s+||g'))
message "found ${_content[i]}: $_l\nExtracted $_inc_file"
if [[ ${_content[i+1]} =~ $d_begin ]]; then
_inc_start=$(trimspaces $(echo ${_content[i+1]} | sed -r 's|\s*:start-after:\s+||g'))
_inc_end=$(trimspaces $(echo ${_content[i+2]} | sed -r 's|\s*:end-before:\s+||g'))
if [[ $_inc_end == "" ]]
then
error "start/end parameter mismatch in\n$_f\n Quitting"
exit 1
fi
message "Extracted $_inc_start"
message "Extracted $_inc_end"
_content[i+1]=""
_content[i+2]=""
fi
_inc_file=$(get_inc_path "$_f" "$_inc_file")
_includestring=$(get_include_content $_inc_file $_inc_start $_inc_end)
_content[i]="$_includestring"
fi
# ((_line=$_line+1))
done
# Reset the buffer so content from a previously processed file is not carried over.
out=""
for line in "${_content[@]}"; do out="$out\n$line"; done
echo -e "${out//\\n/$'\n'}" > $_f
done
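
The core of ``get_include_content`` is extracting the lines between a ``:start-after:`` marker and an ``:end-before:`` marker. A minimal standalone sketch of that step follows. ``extract_between`` is a hypothetical name, and unlike the script above it compares whole (trimmed) lines with the marker text instead of using a regular expression, which is a deliberate substitution for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines of FILE that lie strictly between
# the marker comments ".. START" and ".. END".
extract_between() {
    local file=$1 start=$2 end=$3
    awk -v s=".. $start" -v e=".. $end" '
        # Compare on a copy with surrounding whitespace removed.
        { t = $0; gsub(/^[ \t]+|[ \t]+$/, "", t) }
        t == e { inside = 0 }   # stop before the end marker is printed
        inside { print }
        t == s { inside = 1 }   # start printing after the start marker
    ' "$file"
}
```

Exact comparison means a marker such as ``begin-subcloud-1`` does not also match a longer marker that merely starts with the same text, at the cost of requiring the marker to be alone on its line.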


@ -17,13 +17,16 @@ deps =
-r{toxinidir}/doc/requirements.txt
commands =
git clean -dfx doc/source/fault-mgmt/
bash ./dirtyCheck.sh
bash ./get-remote-files.sh -c templates/events.sh -o file -f
python parser.py -l templates/alarms_template.rst -e tmp/events.yaml -s 100,200,300,400,500,700,800,900 -ts = -type Alarm -outputPath doc/source/fault-mgmt/kubernetes/ -sort Yes -product starlingx -replace "|,OR"
python parser.py -l templates/logs_template.rst -e tmp/events.yaml -s 100,200,300,400,500,700,800,900 -ts = -type Log -outputPath doc/source/fault-mgmt/kubernetes/ -sort Yes -product starlingx -replace "|,OR"
python parser.py -l templates/alarms_template.rst -e tmp/events.yaml -s 100,200,300,400,500,700,800,900 -ts = -type Alarm -outputPath doc/source/fault-mgmt/openstack/ -sort Yes -product openstack -replace "|,OR"
python parser.py -l templates/logs_template.rst -e tmp/events.yaml -s 100,200,300,400,500,700,800,900 -ts = -type Log -outputPath doc/source/fault-mgmt/openstack/ -sort Yes -product openstack -replace "|,OR"
bash ./normalize-includes.sh
sphinx-build -a -E -W --keep-going -d doc/build/doctrees -t starlingx -t openstack -b html doc/source doc/build/html {posargs}
git clean -dfx doc/source/fault-mgmt/
git restore doc/source/dist_cloud/kubernetes/*
bash htmlChecks.sh doc/build/html
whitelist_externals = bash
htmlChecks.sh