This commit fixes an LDAP authentication issue seen on worker nodes
of a subcloud after a rehoming procedure was performed.
The problem has two main parts:
1. Since every host of a subcloud authenticates with the system
controller, we need to reconfigure the LDAP URI across all nodes
of the system when the system controller network changes (upon
rehome). Currently, it is reconfigured only on controller
nodes.
2. Currently, the system uses an SNAT rule to allow worker/storage
nodes to authenticate with the system controller when the admin
network is in use. This is because the admin network only exists
between controller nodes of a distributed cloud. The SNAT rule
is needed to allow traffic from the (private) management network
of the subcloud over the admin network to the system controller
and back again. If the admin network is _not_ being used,
worker/storage nodes of the subcloud can authenticate with the
system controller, but routes must be installed on the
worker/storage nodes to facilitate this. These routes become
difficult to manage in some rehoming and network reconfiguration
scenarios. Management network traffic should instead be treated
the same way as admin network traffic.
This commit addresses the above by:
1. Reconfiguring the ldap_server config across all nodes upon
system controller network changes.
2. Generalizing the current admin network NAT implementation to
handle the management network as well.
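
As a rough illustration of point 2, the sketch below builds the argument
list for a POSTROUTING SNAT rule that could apply to either the admin or
the management network. It is not the code from this change: the function
name, interface name, and all addresses are hypothetical, and the actual
rules are generated by the platform's own firewall configuration.

```python
import ipaddress


def build_snat_rule(src_subnet, dst_subnet, out_iface, to_source):
    """Return an iptables/ip6tables argument list for a POSTROUTING SNAT rule.

    Hypothetical helper for illustration only; none of these names or
    values come from the commit.
    """
    src = ipaddress.ip_network(src_subnet)
    dst = ipaddress.ip_network(dst_subnet)
    addr = ipaddress.ip_address(to_source)
    # IPv4 and IPv6 rules live in separate tables, so families must match.
    if not (src.version == dst.version == addr.version):
        raise ValueError("address families must match")
    binary = "iptables" if src.version == 4 else "ip6tables"
    return [
        binary, "-t", "nat", "-A", "POSTROUTING",
        "-s", str(src),            # private management subnet of the subcloud
        "-d", str(dst),            # system controller subnet
        "-o", out_iface,           # admin or management interface (assumed name)
        "-j", "SNAT", "--to-source", str(addr),
    ]


# Example values are made up; substitute the subcloud's real networks.
rule = build_snat_rule("192.168.204.0/24", "10.10.10.0/24", "vlan100", "10.20.30.2")
print(" ".join(rule))
```

Only the outgoing interface and source address differ between the admin
and management cases, which is why a single rule builder can cover both.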
Test Plan:
IPv4, IPv6 distributed clouds
1. Rehome a subcloud to another system controller and back again
(mgmt network)
2. Update the subcloud to use the admin network (mgmt -> admin)
3. Rehome the subcloud to another system controller and back again
(admin network)
4. Update the subcloud to use the mgmt network (admin -> mgmt)
After each of the numbered steps, the following checks were performed:
a. Ensure the subcloud could become managed, online, and in-sync.
b. Ensure the iptables SNAT rules were installed or updated
appropriately on the subcloud controller nodes.
c. Log into a worker node of the subcloud and ensure sudo commands
could be issued without LDAP timeout.
d. Log into a worker node as LDAP user X via console and verify
that login succeeds.
In general, tcpdump was also used to ensure the SNAT translation was
actually happening.
Partial-Bug: #2056560
Change-Id: Ia675a4ff3a2cba93e4ef62b27dba91802811e097
Signed-off-by: Steven Webster <steven.webster@windriver.com>