integ/ceph
Gabriel de Araújo Cabral 8f6d2eb85a Restart the ceph-mgr daemon every 7 days to control RSS memory growth
The RSS memory of the ceph-mgr daemon grows continuously. Over a
few months, depending on the system, this may amount to more than
1GB of growth. In tests performed on storage and duplex systems,
the average growth is around 10MiB per day on the active controller.
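
As a rough cross-check of these figures, the sketch below estimates how long it takes to accumulate 1GiB of growth. It assumes the growth stays linear at the measured average, which is a simplification (the actual rate depends on the system), and the function name is illustrative only.

```python
# Back-of-the-envelope estimate; assumes linear growth at the measured average.
GROWTH_MIB_PER_DAY = 10    # average observed on the active controller
TARGET_GROWTH_MIB = 1024   # 1 GiB expressed in MiB


def days_to_grow(target_mib: float, rate_mib_per_day: float) -> float:
    """Days needed to accumulate target_mib of RSS growth at a constant rate."""
    if rate_mib_per_day <= 0:
        raise ValueError("rate must be positive")
    return target_mib / rate_mib_per_day


days = days_to_grow(TARGET_GROWTH_MIB, GROWTH_MIB_PER_DAY)
print(f"{days:.1f} days (~{days / 30:.1f} months)")
```

At 10MiB per day this gives roughly 100 days, which is consistent with "a few months".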

Since Ceph is open source, a thorough search was performed on the
Internet and Ceph repo for information about this growth behavior
in memory consumption of ceph-mgr, both in Ceph 14.2.22 (present
on the system) and in later versions. However, nothing that could
help to fix the problem was found. As there were no reports about
this bug, I reported it on the Ceph tracker: https://tracker.ceph.com/issues/61702

As a workaround, ceph-mgr is now restarted automatically every 7
days, so that its memory use returns to the initial state on each
restart and cannot keep growing until memory is exhausted. It was
also verified that the restart has no impact on the running
processes.
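
The interval check behind such a periodic restart can be sketched as follows. This is a minimal illustration, not the code in this change: `restart_due` and `RESTART_INTERVAL` are hypothetical names, and how the daemon's start time is obtained and how the restart is triggered are left out.

```python
from datetime import datetime, timedelta

RESTART_INTERVAL = timedelta(days=7)  # interval stated in this commit


def restart_due(started_at: datetime, now: datetime,
                interval: timedelta = RESTART_INTERVAL) -> bool:
    """Return True once the daemon has been running for at least `interval`."""
    return now - started_at >= interval


# Hypothetical start time, for illustration only.
started = datetime(2023, 6, 1, 12, 0)
print(restart_due(started, datetime(2023, 6, 7, 12, 0)))  # 6 days of uptime
print(restart_due(started, datetime(2023, 6, 8, 12, 0)))  # 7 days of uptime
```

Passing the interval as a parameter makes it easy to shorten it for testing, as was done in the test plan with a one-day interval.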

Test-Plan:
  PASS: Modified the fix on an AIO-DX system to restart ceph-mgr
        every day.
  PASS: After one day, ceph-mgr restarted and its RSS memory use went
        back to the initial state.

Closes-Bug: 2023553

Change-Id: I1c62efaf0ca1d37ba93a24fc99b8db7156973102
Signed-off-by: Gabriel de Araújo Cabral <gabriel.cabral@windriver.com>
2023-06-15 16:29:42 +00:00