Mitigate memory leak of sessions by disabling sudo for sriov agent

The sriov agent was polling devices via 'sudo ip link show',
and this resulted in a severe memory leak. Each use of 'sudo'
goes through the host 'dbus-daemon', and somewhere on the host
the resulting login sessions are not cleaned up.

Symptoms:
- memory gradually runs out until the system becomes unstable;
  the host spontaneously reboots due to delays or OOM
- huge growth of kernel slab
- thousands of /sys/fs/cgroup/systemd/user.slice/user-0.slice
  session-x*.scope files with empty 'tasks', i.e., sessions
  that should have been deleted
- huge latency seen with ssh and various systemd commands
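
The leaked sessions described above can be counted with a short script.
This is only a sketch, not part of the change: the helper name
find_empty_session_scopes is hypothetical, and it assumes each
session-*.scope entry is a cgroup directory containing a 'tasks' file.

```python
import glob
import os

# Path taken from the symptom list; adjust for other users or cgroup layouts.
USER_SLICE = "/sys/fs/cgroup/systemd/user.slice/user-0.slice"


def find_empty_session_scopes(slice_dir=USER_SLICE):
    """Return session-*.scope directories whose 'tasks' file is empty."""
    empty = []
    pattern = os.path.join(slice_dir, "session-*.scope")
    for scope in sorted(glob.glob(pattern)):
        tasks_file = os.path.join(scope, "tasks")
        try:
            with open(tasks_file) as f:
                # An empty 'tasks' file means no process is left in the scope.
                if not f.read().strip():
                    empty.append(scope)
        except OSError:
            # The scope vanished or is unreadable; skip it.
            continue
    return empty


if __name__ == "__main__":
    stale = find_empty_session_scopes()
    print("%d session scopes with no tasks" % len(stale))
```

A count that keeps growing while the agent is polling would be
consistent with the leak; a stable count after the override is applied
would suggest the mitigation is effective.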

The problem is mitigated by disabling 'sudo' for the sriov agent, using
a helm override that configures [agent]/root_helper=''.
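
For illustration, the override corresponds to an empty root_helper in
the [agent] section of the agent configuration. The sketch below uses
Python's configparser to show that mapping. Only the 'agent' and
'root_helper' names come from this change; the render_ini helper is
hypothetical, and the real helm chart may render its configuration
differently.

```python
import configparser
import io

# Override fragment as in this change: an empty root_helper disables 'sudo'.
overrides = {"agent": {"root_helper": ""}}


def render_ini(conf):
    """Render a nested dict as INI text (illustrative helper only)."""
    parser = configparser.ConfigParser()
    parser.read_dict(conf)
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_ini(overrides))
```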

Testing:
- Verified that a VM with an SR-IOV interface could be launched;
  MAC and VLAN attributes could be set on the VFs.

Closes-Bug: 1815106

Change-Id: I0c57629c01b7407c99cc7f38b409019ab87af859
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
Jim Gauld 2019-02-14 15:42:07 -05:00
parent 4abb322fb0
commit acefd544f0
1 changed files with 8 additions and 0 deletions

@@ -246,6 +246,14 @@ class NeutronHelm(openstack.OpenstackBaseHelm):
'securitygroup': {
'firewall_driver': 'noop',
},
# Mitigate host OS memory leak of cgroup session-*scope files
# and kernel slab resources. The leak is triggered using 'sudo'
# which utilizes the host dbus-daemon. The sriov agent frequently
# polls devices via 'ip link show' using run_as_root=True, but
# does not actually require 'sudo'.
'agent': {
'root_helper': '',
},
'sriov_nic': sriov_nic,
}