libnetwork/overlay: fix join sandbox deadlock

Operations performed on overlay network sandboxes are handled by
dispatching operations sent through a channel. This allows
operations to be performed asynchronously; since they are not
called from within another function, they can operate in an
idempotent manner from a known, measurable starting state, from
which an identical series of actions can be performed.

However, when joining a network, it was possible in some cases for
an operation dispatched from this channel to write a message back
to the same channel once a sufficient number of sandboxes were
being operated on.

A goroutine which both reads from and writes to an unbuffered
channel can deadlock if it sends a message to the channel and then
waits for it to be consumed and completed, since the only available
reader is the goroutine itself, which is more or less "talking to
itself". To break this deadlock in the observed race, a goroutine
is now created to send the message to the channel.

Signed-off-by: Martin Dojcak <martin.dojcak@lablabs.io>
Signed-off-by: Ryan Barry <rbarry@mirantis.com>
Martin Dojcak 2021-09-12 18:05:34 +02:00 committed by Ryan Barry
parent d5d5f258df
commit feab0cca9f
1 changed file with 1 addition and 1 deletion

@@ -319,7 +319,7 @@ func (n *network) joinSandbox(s *subnet, restore bool, incJoinCount bool) error
 	defer func() {
 		n.Unlock()
 		if doInitPeerDB {
-			n.driver.initSandboxPeerDB(n.id)
+			go n.driver.initSandboxPeerDB(n.id)
 		}
 	}()