mirror of https://github.com/moby/moby.git synced 2022-11-09 12:21:53 -05:00
Commit graph

10 commits

Author SHA1 Message Date
Derek McGowan
710e0664c4 Update logrus to v1.0.1
Fix case sensitivity issue
Update docker and runc vendors

Signed-off-by: Derek McGowan <derek@mcgstyle.net>
2017-08-07 11:20:47 -07:00
Flavio Crisciani
d6440c9139 optimize the rebroadcast for failure case
Previously, when a node failed, all the other nodes would bump the Lamport time of all their
entries. This means that if a node flaps, there is a storm of updates for all the entries.
Building on the previous logic, this commit guarantees that only the node that joins back
readvertises its own entries; the other nodes do not need to advertise again.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-08-01 14:08:54 -07:00
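
The rebroadcast rule described in d6440c9139 can be illustrated with a small sketch. This is not the networkdb code: the entry and nodeDB types and the readvertiseOwn function are hypothetical names chosen for illustration. It assumes only that each entry records its owner and a Lamport time.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical table entry: it records which node owns it and
// the Lamport time of its last update.
type entry struct {
	owner string
	ltime uint64
}

// nodeDB is a hypothetical, heavily simplified view of one node's state.
type nodeDB struct {
	self    string
	clock   uint64 // local Lamport clock
	entries map[string]*entry
}

// readvertiseOwn is what the rejoining node does in this sketch: it bumps
// the Lamport time of the entries it owns and returns their keys (sorted)
// so they can be rebroadcast. Entries owned by other nodes are untouched.
func (d *nodeDB) readvertiseOwn() []string {
	var keys []string
	for k, e := range d.entries {
		if e.owner != d.self {
			continue
		}
		d.clock++
		e.ltime = d.clock
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	d := &nodeDB{
		self:  "node-a",
		clock: 10,
		entries: map[string]*entry{
			"ep1": {owner: "node-a", ltime: 3},
			"ep2": {owner: "node-b", ltime: 7},
		},
	}
	// Only ep1 is owned by node-a, so only ep1 is readvertised.
	fmt.Println(d.readvertiseOwn())
}
```

Because the other nodes leave their entries alone, a flapping node triggers updates proportional to its own entries rather than to the whole table.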
Madhu Venugopal
59994bbb15 Merge pull request #1775 from sanimej/gossip
Handle single manager reload by having workers reconnect
2017-05-31 14:57:34 -07:00
Santhosh Manohar
ca9a768d80 Handle single manager reload by having workers reconnect
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2017-05-31 14:36:23 -07:00
Flavio Crisciani
f585f33042 Node failure timeout fix
The time to keep a failed node in the failed node list
was originally supposed to be 24h.

If a node leaves explicitly it will be removed from the list of nodes
and put into the leftNodes list. This way the NotifyLeave event won't
insert it into the retry list.
NOTE: if the event is lost, the behavior will instead be the same as for a failed node.

If a node fails, NotifyLeave will insert it into the failedNodes
list with a reapTime of 24h. This means that the node will be checked
for 24h before being completely forgotten. The check currently runs every
1 second and is done by the reconnectNode function.
The failed node list, in contrast, is updated every 2h.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>
2017-05-22 17:19:31 -07:00
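
The timing described in f585f33042 can be sketched as follows. The 24h and 2h values come from the commit message. The failedList type and the markFailed and reapTick names are assumptions made for this illustration, not the networkdb API, and the 1-second reconnect check is left out.

```go
package main

import (
	"fmt"
	"time"
)

const (
	reapInterval = 24 * time.Hour // how long a failed node is kept (from the commit message)
	reapPeriod   = 2 * time.Hour  // how often the failed node list is updated (from the commit message)
)

// failedList is a hypothetical stand-in for the failed node list: it maps a
// node name to the time remaining before the node is forgotten.
type failedList map[string]time.Duration

// markFailed records a failed node with the full reap interval.
func (f failedList) markFailed(name string) {
	f[name] = reapInterval
}

// reapTick models one periodic update of the list: every node loses one
// reapPeriod of remaining time and is dropped once nothing is left.
func (f failedList) reapTick() {
	for name, left := range f {
		left -= reapPeriod
		if left <= 0 {
			delete(f, name)
			continue
		}
		f[name] = left
	}
}

func main() {
	f := failedList{}
	f.markFailed("node-b")

	ticks := 0
	for len(f) > 0 {
		f.reapTick()
		ticks++
	}
	// 12 updates of 2h each add up to the 24h reap interval.
	fmt.Println(ticks)
}
```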
Madhu Venugopal
bb560a1f44 Generating node discovery events to the drivers from networkdb
With the introduction of networkdb, the node discovery events were not
sent to the drivers. This commit generates the node discovery events and
sends them to the drivers interested in them.

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2017-02-01 17:54:51 -08:00
Santhosh Manohar
e98b152bac Reap failed nodes after 24 hours
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-20 11:24:04 -07:00
Jana Radhakrishnan
5f5dad3c02 Recover from transient gossip failures
Currently, if there is a transient gossip failure in any node, the
recovery process depends on other nodes propagating the information
indirectly. If these transient failures affect all the nodes
that this node has in its memberlist, then this node will be permanently
cut off from the gossip channel. Added node state management code in
networkdb to address these problems by trying to rejoin the cluster via
the failed nodes when there is a failure. This also requires
new messages, called node event messages, to differentiate
between node leave and node failure.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-19 15:58:14 -07:00
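
The leave-versus-failure distinction introduced in 5f5dad3c02 can be sketched as below. All names here (nodeState, cluster, onLeaveEvent, onMemberGone, rejoinCandidates) are hypothetical. The sketch assumes only that an explicit leave message may arrive before the membership layer reports the node as gone.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeState is a hypothetical per-node state used only in this sketch.
type nodeState int

const (
	stateActive nodeState = iota
	stateLeft
	stateFailed
)

// cluster tracks the last known state of each node.
type cluster struct {
	nodes map[string]nodeState
}

// onLeaveEvent handles an explicit "node leave" event message.
func (c *cluster) onLeaveEvent(name string) {
	c.nodes[name] = stateLeft
}

// onMemberGone handles the membership layer reporting that a node is gone.
// Without a preceding leave event, the node is treated as failed.
func (c *cluster) onMemberGone(name string) {
	if c.nodes[name] == stateLeft {
		return
	}
	c.nodes[name] = stateFailed
}

// rejoinCandidates lists (sorted) the failed nodes worth retrying; nodes
// that left on purpose are not retried.
func (c *cluster) rejoinCandidates() []string {
	var out []string
	for name, state := range c.nodes {
		if state == stateFailed {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	c := &cluster{nodes: map[string]nodeState{"node-b": stateActive, "node-c": stateActive}}
	c.onLeaveEvent("node-b") // node-b announces that it is leaving
	c.onMemberGone("node-b")
	c.onMemberGone("node-c") // node-c disappears without a leave event
	fmt.Println(c.rejoinCandidates())
}
```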
Jana Radhakrishnan
f5f576ad34 Properly purge node networks when node goes away
When a node goes away purge all the network attachments from the node
and make sure we don't attempt bulk syncing to that node once removed.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-14 12:39:38 -07:00
Jana Radhakrishnan
28f4561e3f Add network scoped gossip database
Network DB is a network scoped gossip database built
on top of hashicorp/memberlist providing an eventually
consistent state store.

It limits the scope of the gossip and periodic bulk syncing
for table entries to only the nodes which participate in the
network to which the gossip belongs. This design makes the
gossip layer scale better and consume resources only for the
network state that the node participates in.

Since the complete state for a network is maintained by all nodes
participating in the network, all nodes will eventually converge
to the same state.

NetworkDB also provides facilities for the users of the package to
watch on any table (or all tables) and get notified if there are
state changes of interest that happened anywhere in the cluster when
that state change eventually finds its way to the watcher's node.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-04-08 12:58:09 -07:00
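
The network scoping described in 28f4561e3f can be illustrated with a minimal sketch. The membership type and its join and gossipTargets methods are hypothetical names, and the real package builds on hashicorp/memberlist rather than a plain map. The point shown is only that gossip for a network is addressed to the nodes participating in that network.

```go
package main

import (
	"fmt"
	"sort"
)

// membership is a hypothetical index from network ID to the set of nodes
// participating in that network.
type membership map[string]map[string]bool

// join records that a node participates in a network.
func (m membership) join(network, node string) {
	if m[network] == nil {
		m[network] = map[string]bool{}
	}
	m[network][node] = true
}

// gossipTargets returns, sorted, the peers that should receive gossip for
// the given network: every participant except the sender itself.
func (m membership) gossipTargets(network, self string) []string {
	var peers []string
	for node := range m[network] {
		if node != self {
			peers = append(peers, node)
		}
	}
	sort.Strings(peers)
	return peers
}

func main() {
	m := membership{}
	m.join("net1", "node-a")
	m.join("net1", "node-b")
	m.join("net2", "node-a")
	m.join("net2", "node-c")

	// node-c does not participate in net1, so it gets no net1 gossip.
	fmt.Println(m.gossipTargets("net1", "node-a"))
}
```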