Commit graph

29 commits

Author SHA1 Message Date
Santhosh Manohar
31dd4362a8 Merge pull request #1542 from allencloud/change-reapNode-interval
update reapNode interval
2016-11-08 11:14:23 -08:00
allencloud
0b4f68390d remove unused mConfig
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-11-08 18:18:55 +08:00
allencloud
99f84ff5a7 update reapNode interval
Signed-off-by: allencloud <allen.sun@daocloud.io>
2016-11-08 15:28:42 +08:00
Santhosh Manohar
e98b152bac Reap failed nodes after 24 hours
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-20 11:24:04 -07:00
Santhosh Manohar
0a2537eea3 Use monotonic clock for reaping networkDB entries
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-10-19 22:30:47 -07:00
Alexander Morozov
03088ace1b networkdb: fix race in access to nodes len
Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-10-04 12:19:25 -07:00
Jana Radhakrishnan
f649d5ae61 Do not hold ack channel in ack table after closing
Once the bulk sync ack channel is closed, remove it from the ack table
right away. There is no reason to keep it in the ack table and delete it
later in the ack waiter; the ack waiter already has a reference to the
channel on which it is waiting.
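
A minimal sketch of the pattern, using hypothetical names rather than the
actual networkdb types (assumes only the standard sync package):

    // Close the ack channel and drop it from the table in one step; the
    // bulk sync waiter already holds its own reference to the channel.
    type ackTable struct {
        sync.Mutex
        acks map[string]chan struct{} // keyed by node name
    }

    func (t *ackTable) ack(node string) {
        t.Lock()
        defer t.Unlock()
        if ch, ok := t.acks[node]; ok {
            close(ch)            // wakes up the waiter
            delete(t.acks, node) // nothing left for the waiter to clean up
        }
    }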

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-10-03 09:50:02 -07:00
Jana Radhakrishnan
22c322dded Avoid returning early on agent join failures
When a gossip join failure happens, do not return early in the call
chain, because a join failure is most likely transient and the retry
logic built into networkdb is going to retry and succeed. Returning
early prevents the initialization of the ingress network/sandbox from
happening, which causes a problem even after the gossip join succeeds
on retry.
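
A hedged sketch of the intent; gossipJoin and setupIngressNetwork are
placeholders, not the real call chain:

    // A transient gossip join error is logged, not returned, so that the
    // rest of the agent initialization (ingress network/sandbox) still runs.
    if err := gossipJoin(remoteAddrs); err != nil {
        logrus.Warnf("gossip join failed, networkdb will retry: %v", err)
    }
    if err := setupIngressNetwork(); err != nil {
        return err // this part must still succeed
    }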

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-27 08:36:10 -07:00
Jana Radhakrishnan
7b905d3c63 Purge stale nodes with same prefix and IP
Since the node name randomization fix, we need to make sure that we
purge the old node with the same prefix and same IP from the nodes
database if it is still present; otherwise it causes unnecessary
reconnect attempts.

Also added a change to avoid unnecessary updates of the local Lamport
time and only do it if we are ready to do a push-pull on a join. Join
should happen only when the node is bootstrapped or when trying to
reconnect with a failed node.
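
A rough sketch of the purge step; the node map and fields are
illustrative, not the exact networkdb bookkeeping:

    // Drop any stale incarnation of a node that shares the same name prefix
    // and IP before accepting the rejoining node, so we stop attempting to
    // reconnect to the dead entry.
    func purgeSameNode(nodes map[string]*node, prefix string, addr net.IP) {
        for name, n := range nodes {
            if strings.HasPrefix(name, prefix) && n.Addr.Equal(addr) {
                delete(nodes, name)
            }
        }
    }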

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-23 14:48:54 -07:00
Madhu Venugopal
d1f6eb1812 Allow the memberlist shutdown even if networkdb leave fails
Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-09-23 05:19:07 -07:00
Jana Radhakrishnan
b0a7084c05 Honor user provided listen address for gossip
If the user provided a non-zero listen address, honor it and bind only
to that address. Right now it is not honored and we always bind to all
IP addresses on the host.
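
A minimal sketch of the fix; BindAddr and BindPort are real
hashicorp/memberlist config fields, but the surrounding names are
illustrative:

    // Bind only to the caller-provided listen address instead of 0.0.0.0.
    conf := memberlist.DefaultLANConfig()
    conf.BindPort = bindPort
    if listenAddr != "" && listenAddr != "0.0.0.0" {
        conf.BindAddr = listenAddr // honor the user's listen address
    }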

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-22 11:41:57 -07:00
Jana Radhakrishnan
5f5dad3c02 Recover from transient gossip failures
Currently, if there is a transient gossip failure in any node, the
recovery process depends on other nodes propagating the information
indirectly. If these transient failures affect all the nodes that this
node has in its memberlist, then this node is permanently cut off from
the gossip channel. Added node state management code in networkdb to
address this by trying to rejoin the cluster via the failed nodes when
there is a failure. This also requires new messages, called node event
messages, to differentiate between node leave and node failure.
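
A heavily simplified sketch of the recovery loop; failedNodes and the
retry interval are placeholders for the real node state bookkeeping:

    // Periodically try to rejoin through nodes marked as failed, so a
    // transient failure cannot permanently cut this node off from gossip.
    func retryFailedNodes(mlist *memberlist.Memberlist, failedNodes func() []string, stop <-chan struct{}) {
        ticker := time.NewTicker(30 * time.Second)
        defer ticker.Stop()
        for {
            select {
            case <-ticker.C:
                if nodes := failedNodes(); len(nodes) > 0 {
                    mlist.Join(nodes) // best effort; failures just mean we retry later
                }
            case <-stop:
                return
            }
        }
    }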

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-09-19 15:58:14 -07:00
Jana Radhakrishnan
2bead02c87 Ignore delete events for non-existent entries
In networkdb we should ignore delete events for entries that do not
exist in the db. This is always safe, because if the entry does not
exist it was removed much earlier and purged after the reap timer, so
the notification is stale.

Also, duplicate delete notifications were being sent to the clients:
one when the actual delete event was received from gossip, and another
later when the entry was reaped. The second notification is unnecessary
and may cause issues for clients that are not coded for idempotency.
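
A hedged sketch of the guard; the table and entry types are illustrative:

    // Ignore delete events for entries we no longer have (they were reaped
    // long ago), and notify clients exactly once, when the delete event
    // arrives rather than again at reap time.
    func handleDeleteEvent(table map[string]*entry, key string, notifyDelete func(string)) {
        e, ok := table[key]
        if !ok {
            return // stale event: nothing to delete, nothing to notify
        }
        e.deleting = true
        notifyDelete(key)
    }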

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-08-18 13:57:24 -07:00
Santhosh Manohar
2bab9b6bdb Cleanup networkdb state when the network is deleted locally
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-08-10 12:44:05 -07:00
Madhu Venugopal
6368406c26 Adding Advertise-addr support
With this change, all address auto-detection is removed from libnetwork
and the caller takes the responsibility to provide a proper
advertise-addr in various scenarios (including an externally facing
public advertise-addr with an internal, private listen-addr).

Signed-off-by: Madhu Venugopal <madhu@docker.com>
2016-07-21 02:44:25 -07:00
Alexander Morozov
af3158ecdb networkdb: do nothing in bulkSync if nodes is empty
This patch allows getting rid of an annoying debug message.
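
The guard is essentially the following (sketch):

    // Nothing to sync with, so skip the bulk sync and the debug log line.
    if len(nodes) == 0 {
        return nil
    }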

Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
2016-07-11 09:11:07 -07:00
Jana Radhakrishnan
8936daab5e Retain deleted entries for longer time
When deleting entries, or when learning about deleted entries, remember
them for a longer time to avoid excessive delete duplicates in the
gossip cluster. Also added code to ignore event messages that
originated from this node, so that they do not get added to the
rebroadcast queue.
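
A fragment-style sketch of both changes; the field and variable names
are illustrative:

    // Tombstone deleted entries instead of dropping them immediately, so
    // repeated deletes from gossip are recognized and not re-propagated.
    e.deleting = true
    e.reapTime = reapInterval // kept around until the reap loop removes it

    // Events that this node originated come back via gossip; never put
    // them into the rebroadcast queue again.
    if ev.NodeName == localNodeName {
        return
    }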

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-30 18:24:13 -07:00
Santhosh Manohar
929921a640 Add debugs for key change events in networkdb
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-14 03:13:48 -07:00
Jana Radhakrishnan
6538faa880 Do not bulk sync state which is getting deleted
Bulk sync should not sync state which is getting deleted

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-18 17:58:51 -07:00
Jana Radhakrishnan
6034058dc3 Fix infinite loop in bulk sync
Due to a slice management logic error, the bulk sync for loop could go
on indefinitely, eventually leading to an OOM error. Fixed the logic so
that an infinite loop never occurs. Also changed the bulk sync wait
timeout to use a timer rather than time.After, as time.After is known
to consume a lot of memory when called in a tight loop.
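
The timer pattern, roughly; a single explicit timer replaces a fresh
time.After allocation on each loop iteration:

    // Wait for an ack or a timeout using an explicit timer. Unlike
    // time.After, the timer can be stopped and is not left pending until
    // it fires, which matters when this runs in a tight loop.
    func waitForAck(ackCh <-chan struct{}, timeout time.Duration) bool {
        t := time.NewTimer(timeout)
        defer t.Stop()
        select {
        case <-ackCh:
            return true // ack received
        case <-t.C:
            return false // timed out
        }
    }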

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-15 23:38:48 -07:00
Jana Radhakrishnan
f5f576ad34 Properly purge node networks when node goes away
When a node goes away, purge all of its network attachments and make
sure we don't attempt bulk syncing to that node once it is removed.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-14 12:39:38 -07:00
Jana Radhakrishnan
3859a7e394 Make sure to notify watchers on node going away
When a node goes away, we purge all the table entries that we learned
from that node, but we don't notify the watchers about it. Made sure we
notify the watchers when this happens.
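
A simplified sketch; the entry type and the notify callback are
illustrative:

    // When purging entries learned from a departed node, emit a delete
    // event for each one so table watchers see the change, not just the
    // local tables.
    func purgeNodeEntries(entries map[string]*entry, nodeName string, notifyDelete func(key string, value []byte)) {
        for key, e := range entries {
            if e.owner != nodeName {
                continue
            }
            delete(entries, key)
            notifyDelete(key, e.value)
        }
    }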

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-13 11:52:15 -07:00
Jana Radhakrishnan
fd72f6e318 Do not wait on ack in bulksync response
The wait in bulkSyncNode was meant for the bulkSync initiator, not for
the responder. Fix the incorrect code in which the responder was also
waiting, unnecessarily, for a response that it would never get and that
would eventually time out.
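
Sketched roughly; the unsolicited flag mirrors the idea rather than the
exact signature:

    // Only the initiator of a bulk sync waits for an ack; the responder to
    // an unsolicited bulk sync will never receive one, so it must not wait.
    func waitBulkSyncAck(unsolicited bool, ackCh <-chan struct{}, timeout time.Duration) error {
        if unsolicited {
            return nil // responder side: no ack is coming
        }
        select {
        case <-ackCh:
            return nil
        case <-time.After(timeout):
            return errors.New("timed out waiting for bulk sync ack")
        }
    }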

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-10 14:46:30 -07:00
Santhosh Manohar
b2b87577d4 Add support for encrypting gossip traffic
Signed-off-by: Santhosh Manohar <santhosh@docker.com>
2016-06-04 03:55:14 -07:00
Jana Radhakrishnan
774399fd66 Fix couple of panics in networkdb
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-06-02 20:29:37 -07:00
Jana Radhakrishnan
77abea9c1e Use protobuf in networkdb core messages
Convert all networkdb core message types from Go message types to
protobuf message types. This facilitates future modification of the
message structure without breaking backward compatibility.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-17 09:18:24 -07:00
Jana Radhakrishnan
0580043718 Add libnetwork agent mode support
libnetwork agent mode is a mode in which libnetwork acts as a local
agent for network and discovery plumbing alone, while state management
is done elsewhere. This completes the support for making libnetwork and
its associated drivers completely independent of a k/v store (if
needed) and able to work purely from the state information passed along
by some external controller or manager. This does not mean that
libnetwork support for decentralized state management via a k/v store
is removed.

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-05-02 18:19:32 -07:00
Jana Radhakrishnan
060aa49a70 Fix gossip network event overwriting self
When a node joins a network, it sends out a gossip event before it
updates its own in-memory state. This can create a race where the node
gets the event back from a remote node before we update the in-memory
state, and we treat that as the latest state. To avoid this race, always
generate the gossip after updating local state.
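
An ordering sketch; joinNetworkState and broadcastJoin are placeholders:

    // Write the in-memory state first, then gossip. If the gossip goes out
    // first, our own event can bounce back from a remote node and be taken
    // as newer than state we have not written yet.
    if err := joinNetworkState(nid); err != nil {
        return err
    }
    broadcastJoin(nid) // safe now: local state already reflects the join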

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-04-25 09:47:36 -07:00
Jana Radhakrishnan
28f4561e3f Add network scoped gossip database
Network DB is a network-scoped gossip database built on top of
hashicorp/memberlist, providing an eventually consistent state store.

It limits the scope of the gossip and the periodic bulk syncing of
table entries to only the nodes that participate in the network to
which the gossip belongs. This design makes the gossip layer scale
better and consume resources only for the network state that the node
participates in.

Since the complete state for a network is maintained by all nodes
participating in the network, all nodes will eventually converge
to the same state.

NetworkDB also provides facilities for users of the package to watch
any table (or all tables) and get notified of state changes of interest
that happened anywhere in the cluster, once that state change
eventually finds its way to the watcher's node.
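
A hedged usage sketch of how a consumer of the package might use it;
the names follow the networkdb API described above, but the exact
signatures are approximated for illustration:

    // Join the gossip cluster, scope gossip to one network, write an entry,
    // and watch a table for changes made anywhere in that network.
    db, err := networkdb.New(&networkdb.Config{NodeName: "node1", BindAddr: "10.0.0.1"})
    if err != nil {
        log.Fatal(err)
    }
    db.Join([]string{"10.0.0.2", "10.0.0.3"}) // gossip peers
    db.JoinNetwork("n1")                      // participate in network "n1"
    db.CreateEntry("endpoint_table", "n1", "ep1", []byte("payload"))

    // Watchers are notified of create/update/delete events for the table,
    // including changes that originated on other nodes in "n1".
    ch, cancel, err := db.Watch("endpoint_table", "n1", "")
    if err != nil {
        log.Fatal(err)
    }
    defer cancel()
    _ = ch // receive events from ch as they arrive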

Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
2016-04-08 12:58:09 -07:00