Resolve AIO-SX shutdown hang with CEPH ordering hooks

In ceph-10.2.6, the ceph init script uses systemd-run to launch
the ceph-mon and ceph-osd services. This generates transient systemd
service files with only basic configuration. On node shutdown, ceph is
shut down while it is still in use by containers and before the RBD
devices are unmapped, causing the libceph kernel module to hang while
trying to communicate with the ceph monitor.

This update patches the ceph init script to generate systemd
override config files for the ceph-mon and ceph-osd services that
provide improved ordering during shutdown. It also adds a script,
run as part of the docker.service shutdown (by packaging a systemd
override), to unmap the RBD devices. This ordering ensures that the
kubelet and docker services are shut down first, then the RBD devices
are cleaned up, followed by the shutdown of the ceph services and
service management (SM). Once kubelet and docker have shut down,
the ceph-preshutdown.sh script is able to cleanly unmount and
unmap the RBD devices and unload the rbd and libceph
kernel modules.
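
The intended shutdown sequence can be illustrated with a small sketch.
systemd stops units in the reverse of their start order, so the Before=
and After= settings in the overrides can be written as "A starts before B"
pairs, sorted with tsort, and reversed with tac. The unit names are
shortened for illustration, and the "docker kubelet" pair is an
assumption (kubelet ordered after docker); it is not defined by this
change.

```shell
#!/bin/sh
# Hypothetical sketch: derive a stop order from start-order pairs.
# Each "A B" line means "A starts before B". Names are shortened, and the
# "docker kubelet" pair is an assumption, not something this commit sets.
stop_order() {
    # tsort prints a valid start order; tac reverses it into the stop order.
    tsort <<'EOF' | tac
sm-shutdown ceph-osd
sm-shutdown ceph-mon
ceph-osd ceph-mon
ceph-osd docker
ceph-mon docker
docker kubelet
EOF
}

stop_order
```

With these pairs the sketch lists kubelet first and sm-shutdown last,
matching the order described above. The RBD cleanup itself is not a unit
in this graph; it runs as a stop hook on docker.service.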

In ceph-11.0.1, the use of systemd-run was replaced with proper
systemd service configuration files. Once ceph is upgraded for
StarlingX, the ordering and cleanup will need to be revisited.

Story: 2004520
Task: 28258
Change-Id: I6f7d7b9e704121c54211afd86b38df015b8d7a63
Signed-off-by: Don Penney <don.penney@windriver.com>
Don Penney 2019-02-05 17:59:09 -05:00
parent dc14b89999
commit a883e82866
5 changed files with 103 additions and 2 deletions


@ -1,6 +1,6 @@
SRC_DIR="$CGCS_BASE/git/ceph"
-COPY_LIST="files/*"
+COPY_LIST="files/* $DISTRO/patches/*"
TIS_BASE_SRCREV=3f07f7ff1a5c7bfa8d0de12c966594d5fb7cf4ec
-TIS_PATCH_VER=GITREVCOUNT
+TIS_PATCH_VER=GITREVCOUNT+1
BUILD_IS_BIG=40
BUILD_IS_SLOW=26


@ -241,6 +241,10 @@ Source9: ceph-rest-api.service
Source10: ceph-radosgw.service
Source11: stx_git_version
Source12: ceph-preshutdown.sh
Source13: starlingx-docker-override.conf
Patch0001: 0001-Add-hooks-for-orderly-shutdown-on-controller.patch
%if 0%{?suse_version}
%if 0%{?is_opensuse}
@ -797,6 +801,7 @@ python-cephfs instead.
#################################################################################
%prep
%setup -q
%patch0001 -p1
# StarlingX: Copy the .git_version file needed by the build
# This commit SHA is from the upstream src rpm which is the base of this repo branch
# TODO: Add a commit hook to update to our latest commit SHA
@ -976,6 +981,8 @@ install -m 700 %{SOURCE7} %{buildroot}/usr/sbin/osd-wait-status
install -m 644 %{SOURCE8} $RPM_BUILD_ROOT/%{_unitdir}/ceph.service
install -m 644 %{SOURCE9} $RPM_BUILD_ROOT/%{_unitdir}/ceph-rest-api.service
install -m 644 %{SOURCE10} $RPM_BUILD_ROOT/%{_unitdir}/ceph-radosgw.service
install -m 700 %{SOURCE12} %{buildroot}%{_sbindir}/ceph-preshutdown.sh
install -D -m 644 %{SOURCE13} $RPM_BUILD_ROOT/%{_sysconfdir}/systemd/system/docker.service.d/starlingx-docker-override.conf
install -m 750 src/init-ceph %{buildroot}/%{_initrddir}/ceph
install -m 750 src/init-radosgw %{buildroot}/%{_initrddir}/ceph-radosgw
@ -1016,6 +1023,8 @@ rm -rf %{buildroot}
%config(noreplace) %{_sysconfdir}/ceph/ceph.conf
%{_sysconfdir}/services.d/*
%{_sbindir}/ceph-manage-journal
%{_sbindir}/ceph-preshutdown.sh
%{_sysconfdir}/systemd/system/docker.service.d/starlingx-docker-override.conf
%endif
%if %{without stx}
%{_unitdir}/ceph-create-keys@.service


@ -0,0 +1,59 @@
From 03340eaf0004e3cc8e3f8991ea96a46757d92830 Mon Sep 17 00:00:00 2001
From: Don Penney <don.penney@windriver.com>
Date: Sat, 26 Jan 2019 13:34:55 -0500
Subject: [PATCH] Add hooks for orderly shutdown on controller

Hook the ceph init script to add systemd overrides to define
an orderly shutdown for StarlingX controllers.

Signed-off-by: Don Penney <don.penney@windriver.com>
---
 src/init-ceph.in | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/src/init-ceph.in b/src/init-ceph.in
index 1fdb4b3..515d818 100644
--- a/src/init-ceph.in
+++ b/src/init-ceph.in
@@ -861,6 +861,38 @@ for name in $what; do
fi
fi
+    . /etc/platform/platform.conf
+    if [ "${nodetype}" = "controller" ]; then
+        # StarlingX: Hook the transient services launched by systemd-run
+        # to allow for proper cleanup and orderly shutdown
+
+        # Set nullglob so wildcards will return empty string if no match
+        shopt -s nullglob
+
+        OSD_SERVICES=$(for svc in /run/systemd/system/ceph-osd*.service; do basename $svc; done | xargs echo)
+        for d in /run/systemd/system/ceph-osd*.d; do
+            cat <<EOF > $d/starlingx-overrides.conf
+[Unit]
+Before=docker.service
+After=sm-shutdown.service
+
+EOF
+        done
+
+        for d in /run/systemd/system/ceph-mon*.d; do
+            cat <<EOF > $d/starlingx-overrides.conf
+[Unit]
+Before=docker.service
+After=sm-shutdown.service ${OSD_SERVICES}
+
+EOF
+        done
+
+        shopt -u nullglob
+
+        systemctl daemon-reload
+    fi
+
[ -n "$post_start" ] && do_cmd "$post_start"
[ -n "$lockfile" ] && [ "$?" -eq 0 ] && touch $lockfile
;;
--
1.8.3.1
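
The override generation in the patch above can be tried without systemd
using a sketch like the following. It is not the patched init script: the
function name write_overrides and its directory argument are hypothetical,
existence checks stand in for nullglob so that it also runs under plain
sh, and the systemctl daemon-reload step is left out. The unit names in
the demo are made up.

```shell
#!/bin/sh
# Sketch only: write_overrides and its directory argument are hypothetical.
# The real patch works on /run/systemd/system and then runs
# "systemctl daemon-reload", which is omitted here.
write_overrides() {
    unit_dir=$1
    osd_services=""

    # Collect the transient OSD unit names as a space-separated list.
    for svc in "$unit_dir"/ceph-osd*.service; do
        [ -e "$svc" ] || continue   # an unmatched glob stays literal
        osd_services="${osd_services:+$osd_services }$(basename "$svc")"
    done

    # OSDs start before docker and after sm-shutdown.
    for d in "$unit_dir"/ceph-osd*.d; do
        [ -d "$d" ] || continue
        printf '[Unit]\nBefore=docker.service\nAfter=sm-shutdown.service\n' \
            > "$d/starlingx-overrides.conf"
    done

    # Monitors are additionally ordered after every OSD unit found above.
    for d in "$unit_dir"/ceph-mon*.d; do
        [ -d "$d" ] || continue
        printf '[Unit]\nBefore=docker.service\nAfter=sm-shutdown.service %s\n' \
            "$osd_services" > "$d/starlingx-overrides.conf"
    done
}

# Demo against a scratch directory; the unit names are invented.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/ceph-osd.0.service.d" "$demo_dir/ceph-mon.a.service.d"
: > "$demo_dir/ceph-osd.0.service"
write_overrides "$demo_dir"
cat "$demo_dir/ceph-mon.a.service.d/starlingx-overrides.conf"
```

Because units stop in the reverse of their start order, these settings
mean docker stops before the ceph units, and the monitors stop before the
OSDs they are ordered after.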


@ -0,0 +1,30 @@
#!/bin/bash
#
# Copyright (c) 2019 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#

script=$(basename $0)

# Set nullglob so wildcards will return empty string if no match
shopt -s nullglob

for dev in /dev/rbd[0-9]*; do
    for mnt in $(mount | awk -v dev=$dev '($1 == dev) {print $3}'); do
        logger -t ${script} "Unmounting $mnt"
        /usr/bin/umount $mnt
    done
    logger -t ${script} "Unmounted $dev"
done

for dev in /dev/rbd[0-9]*; do
    /usr/bin/rbd unmap -o force $dev
    logger -t ${script} "Unmapped $dev"
done

lsmod | grep -q '^rbd\>' && /usr/sbin/modprobe -r rbd
lsmod | grep -q '^libceph\>' && /usr/sbin/modprobe -r libceph

exit 0
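
The mount lookup in the script above can be exercised on its own, without
any RBD devices, using a sketch like this one. The helper name
mounts_for_dev and the sample mount lines are hypothetical; the awk filter
is the same one the script applies to the output of mount.

```shell
#!/bin/sh
# Sketch only: mounts_for_dev is a hypothetical helper, and sample_mounts
# is invented text in the "DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)"
# layout printed by mount.

# Print the mount points of one device, reading mount-style text on stdin.
mounts_for_dev() {
    # Field 1 is the device and field 3 is the mount point.
    awk -v dev="$1" '($1 == dev) {print $3}'
}

sample_mounts='/dev/rbd0 on /var/lib/kubelet/pods/example/volume type ext4 (rw,relatime)
/dev/rbd10 on /mnt/other type ext4 (rw,relatime)
/dev/sda1 on /boot type ext4 (rw,relatime)'

# Only the exact device matches, so /dev/rbd10 is not reported for /dev/rbd0.
printf '%s\n' "$sample_mounts" | mounts_for_dev /dev/rbd0
```

The comparison is an exact string match on the device field, which is why
the script can loop over /dev/rbd[0-9]* without one device picking up
another's mounts. Since awk splits on whitespace, a mount point that
contains spaces would be cut short at the first space.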


@ -0,0 +1,3 @@
[Service]
ExecStopPost=/usr/sbin/ceph-preshutdown.sh