9 Commits

Author SHA1 Message Date
Erich Cordoba c8735e882a Remove version from sm folder
The sm component had the 1.0.0 version in the folder name. This
change removes that version and updates the centos_pkg_dirs.

Story: 2006623
Task: 36827

Depends-On: https://review.opendev.org/#/c/685128/
Change-Id: I6725d1f961c2a82275da5fabbff8e89a8dd6f245
Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>
2019-09-26 14:11:31 -05:00
Bin Qian 7f52df37bd Fix IPv6 standby controller boot loop
IPv6 multicast should be sent to the interface that the socket
binds to.

Closes-Bug: 1842949
Change-Id: I14b6c5193c67a0ddd69e31d1044219c4e9fd6b94
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2019-09-09 13:15:59 +00:00
Bin Qian 51f63ee2e6 Ensure a timer is stopped if it is orphaned
When a service timer routine cannot find the service it is associated with,
the timer should not be rearmed (timer routine returns false).

Change-Id: Ia0f352b8c6cbd63176353a1b2054b216babad605
Story: 2005486
Task: 30623
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2019-04-26 10:57:11 -04:00
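
The timer rule described in 51f63ee2e6 can be sketched roughly as below. This is a minimal Python illustration, not the SM code: the names `services`, `on_service_timer`, and `run_timer` are hypothetical. The only detail taken from the commit message is the contract that a timer routine returning false is not rearmed.

```python
# Hypothetical registry of live services, keyed by service name.
services = {"svc-a": {"ticks": 0}}


def on_service_timer(service_name):
    """Timer routine: return True to rearm the timer, False to stop it."""
    service = services.get(service_name)
    if service is None:
        # Orphaned timer: its service no longer exists, so do not rearm.
        return False
    service["ticks"] += 1
    return True


def run_timer(service_name, max_fires=10):
    """Toy timer loop (assumption): fire until the routine declines to rearm."""
    fires = 0
    while fires < max_fires:
        fires += 1
        if not on_service_timer(service_name):
            break
    return fires
```

With this sketch, a timer for a missing service fires once and then stops, while a timer for a known service keeps firing up to the loop limit.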
Bin Qian 0641b4a44e Fix h/w subsystem duplicated initialization
The h/w subsystem was mistakenly initialized twice, which prevented
interface operational state change events from being passed to
the listener. When an interface's operational state changed,
e.g. a cable was pulled, the system could not react to it.

Change-Id: I014d25befda536265c9c588a156ce411d01147cf
Closes-Bug: 1812019
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2019-01-28 13:02:32 -05:00
Bin Qian ad8665a1b7 Use hbs cluster info to determine best survivor
Use hbs cluster info to determine which controller should be the survivor when
communication is lost between the 2 controllers, with the new rules:

1. If a controller is the only controller connected to storage-0,
it is chosen to be the survivor.
2. A controller that can reach more nodes is chosen to be the survivor.
3. A controller is chosen to be failed if it cannot reach any nodes.

Story: 2003577
Task: 27704

Change-Id: I79659e1a788b865536500fc125fd65ae2f34123d
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-12-11 11:12:16 -05:00
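
The three survivor rules in ad8665a1b7 could be expressed along the following lines. This is a hedged Python sketch, not the SM implementation: `ControllerView`, `choose_survivor`, and `has_failed` are hypothetical names, the rules are assumed to apply in the order listed, and returning `None` on a tie is an assumption, since the commit message does not say how ties are resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControllerView:
    """Hypothetical summary of what one controller can reach."""
    name: str
    reaches_storage0: bool
    reachable_nodes: int


def has_failed(view: ControllerView) -> bool:
    # Rule 3: a controller that cannot reach any nodes is chosen to be failed.
    return view.reachable_nodes == 0


def choose_survivor(a: ControllerView, b: ControllerView) -> Optional[str]:
    # Rule 1: the only controller connected to storage-0 survives.
    if a.reaches_storage0 != b.reaches_storage0:
        return a.name if a.reaches_storage0 else b.name
    # Rule 2: the controller that can reach more nodes survives.
    if a.reachable_nodes != b.reachable_nodes:
        return a.name if a.reachable_nodes > b.reachable_nodes else b.name
    # Assumption: the rules above do not decide a tie, so report none.
    return None
```

For example, a controller that reaches only storage-0 would win over a peer that reaches five other nodes but not storage-0, because rule 1 is checked first in this sketch.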
Bin Qian 28e293bda5 Retrieve hbs cluster info
This change:
1. Adds code to receive cluster info updates from hbsAgent.
2. Adds support for on-demand (asynchronous) hbs cluster info queries.

Depends-On: I7d294d40e84469df6b6a6f6dd490cf3c4557b711

Story: 2003577
Task: 27816

Change-Id: Idb65abc58b4afe9649aba442f0798c24d9fffb10
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-11-30 11:59:23 -05:00
Bin Qian edc8a56472 Introduce failover FSM
Introduce failover FSM to handle communication failure between
controllers.
Failover FSM has 4 states:
Normal: the system is running with full redundancy
Fail Pending: a communication failure has occurred
Failed: the controller is determined to have failed. Its peer will
        assume service
Survived: the controller is determined to be the survivor. Its peer has
        failed

The controllers are in one of the below possible state pairs:
normal/normal, fail-pending/fail-pending, failed/survived

A failed controller will not resume responsibility before the
system restores its full redundancy (normal/normal).

A survivor will not fail before the system restores its
full redundancy (normal/normal).

A future implementation may allow an administrator to force
a failed controller to become active, to manually recover
(with the possibility of losing data), should the survivor
no longer be capable of providing service.

Story: 2003577
Task: 26404

Change-Id: I51635e9e60b6fb6bad89e06c9f08d3f28e21db82
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-09-18 08:08:40 -04:00
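
The failover FSM described in edc8a56472 can be modelled roughly as a transition table. The sketch below is illustrative Python, not the SM source: `FailoverState`, `ALLOWED`, `VALID_PAIRS`, `transition`, and `is_valid_pair` are hypothetical names. The four states and the three state pairs come from the commit message; the exact transition set, in particular fail-pending returning to normal, is an assumption.

```python
from enum import Enum


class FailoverState(Enum):
    NORMAL = "normal"
    FAIL_PENDING = "fail-pending"
    FAILED = "failed"
    SURVIVED = "survived"


S = FailoverState

# Assumed transitions. Failed and Survived may only return to Normal,
# matching "will not resume/fail before full redundancy is restored".
ALLOWED = {
    S.NORMAL: {S.FAIL_PENDING},
    S.FAIL_PENDING: {S.NORMAL, S.FAILED, S.SURVIVED},
    S.FAILED: {S.NORMAL},
    S.SURVIVED: {S.NORMAL},
}

# State pairs the two controllers may be in, per the commit message.
VALID_PAIRS = {
    frozenset({S.NORMAL}),             # normal/normal
    frozenset({S.FAIL_PENDING}),       # fail-pending/fail-pending
    frozenset({S.FAILED, S.SURVIVED}), # failed/survived
}


def transition(current: FailoverState, target: FailoverState) -> FailoverState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def is_valid_pair(local: FailoverState, peer: FailoverState) -> bool:
    """Check that the two controllers are in a permitted state pair."""
    return frozenset({local, peer}) in VALID_PAIRS
```

Keeping the transitions in a table makes the two invariants easy to check: neither a failed controller nor a survivor can move anywhere except back to normal.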
Bin Qian 68b5ce3835 SM to monitor infra i/f and swact when needed
Individual services should not fail themselves and trigger a swact when the infra i/f goes down.
SM will collect the overall system health state to schedule the services.

Story: 2003577
Task: 24899

Change-Id: Ifa7453136f34768b99e2bcd741d1065e69ef452e
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2018-09-11 02:28:26 +00:00
Dean Troyer 17c909ec83 StarlingX open source release updates
Signed-off-by: Dean Troyer <dtroyer@gmail.com>
2018-05-31 07:36:26 -07:00