container: protect health monitor channel

While this code was likely called from a single thread before, we have
now seen panics, indicating that it could be called in parallel. This
change adds a mutex to protect opening and closing of the channel. There
may be another root cause behind this panic, such as whatever led to
these methods being called in parallel, as this code is old and we had
not seen this condition until recently.
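
The message does not say which panic was observed. One plausible failure
under unsynchronized access is a double close: two callers both see a
non-nil channel and both call close. The sketch below only demonstrates
that a second close panics; closeTwice is a hypothetical helper, not part
of the patched package.

```go
package main

import "fmt"

// closeTwice is a hypothetical helper used only for illustration. It closes
// ch twice and returns the message of the recovered panic, if any.
func closeTwice(ch chan struct{}) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	close(ch)
	close(ch) // closing an already-closed channel panics at run time
	return "no panic"
}

func main() {
	fmt.Println(closeTwice(make(chan struct{})))
}
```

Holding a mutex around the nil check and the close, as this change does,
rules out two callers reaching close on the same channel.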

This fix is by no means a permanent fix. Typically, bugs like this
indicate misplaced channel ownership. In idiomatic uses, the channel
should have a particular "owner" that coordinates sending and closure.
In this case, the owner of the channel is unclear, so it gets opened
lazily. Synchronizing this access is a decent solution, but a refactor
may yield better results.
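
As a rough sketch of what single ownership could look like (an
assumption about a possible refactor, not what this commit does): the
channel is created when the monitor starts and closed only through one
method. The names monitor, startMonitor, and Stop are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// monitor is a hypothetical type, not part of the patched package. It owns
// its stop channel: the channel is created in startMonitor and closed only
// by Stop, so there is no lazy open to synchronize.
type monitor struct {
	stop chan struct{} // closed by Stop to signal the goroutine
	done chan struct{} // closed by the goroutine when it exits
	once sync.Once     // guarantees stop is closed at most once
}

// startMonitor creates the channels and launches the goroutine that watches them.
func startMonitor() *monitor {
	m := &monitor{stop: make(chan struct{}), done: make(chan struct{})}
	go func() {
		defer close(m.done)
		<-m.stop // a real probe loop would select on m.stop and a timer
	}()
	return m
}

// Stop may be called from several goroutines; only the first call closes
// the channel, and every call waits for the goroutine to exit.
func (m *monitor) Stop() {
	m.once.Do(func() { close(m.stop) })
	<-m.done
}

func main() {
	m := startMonitor()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.Stop()
		}()
	}
	wg.Wait()
	fmt.Println("stopped")
}
```

Because the channel's lifetime is tied to the monitor value, callers
never need to check whether it is nil before opening or closing it.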

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Stephen J Day 2017-11-13 13:31:28 -08:00
parent aea31ab242
commit 5b55747a52
1 changed files with 11 additions and 2 deletions

@@ -1,6 +1,8 @@
 package container
 
 import (
+	"sync"
+
 	"github.com/docker/docker/api/types"
 	"github.com/sirupsen/logrus"
 )
@@ -9,6 +11,7 @@ import (
 type Health struct {
 	types.Health
 	stop chan struct{} // Write struct{} to stop the monitor
+	mu sync.Mutex
 }
 
 // String returns a human-readable description of the health-check state
@@ -26,9 +29,12 @@ func (s *Health) String() string {
 	}
 }
 
-// OpenMonitorChannel creates and returns a new monitor channel. If there already is one,
-// it returns nil.
+// OpenMonitorChannel creates and returns a new monitor channel. If there
+// already is one, it returns nil.
 func (s *Health) OpenMonitorChannel() chan struct{} {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
 	if s.stop == nil {
 		logrus.Debug("OpenMonitorChannel")
 		s.stop = make(chan struct{})
@@ -39,6 +45,9 @@ func (s *Health) OpenMonitorChannel() chan struct{} {
 
 // CloseMonitorChannel closes any existing monitor channel.
 func (s *Health) CloseMonitorChannel() {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
 	if s.stop != nil {
 		logrus.Debug("CloseMonitorChannel: waiting for probe to stop")
 		close(s.stop)