Wait for container exit before forcing handler

The existing code assumes that we missed an exit event when the
container is still marked as running in Docker but an attempt to signal
the process in containerd returns a "process not found" error.

There is a case where the event was not missed; it simply has not been
processed yet.

This change works around that possibility by waiting to see whether the
container is eventually marked as stopped, using the container's
configured stop timeout as the deadline. If the wait fails, the exit
handler is invoked as before.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Brian Goff 2020-08-11 20:13:00 +00:00
parent c997a4995d
commit 7fd23345c9
1 changed file with 13 additions and 1 deletion

@@ -99,7 +99,19 @@ func (daemon *Daemon) killWithSignal(container *containerpkg.Container, sig int)
 		if errdefs.IsNotFound(err) {
 			unpause = false
 			logrus.WithError(err).WithField("container", container.ID).WithField("action", "kill").Debug("container kill failed because of 'container not found' or 'no such process'")
-			go daemon.handleContainerExit(container, nil)
+			go func() {
+				// We need to clean up this container, but it is possible that we get here after the exit event was fired
+				// but before it has been processed.
+				// So let's wait for the container's stop timeout to see if the event is eventually processed.
+				// Doing this has the side effect that if no event was ever going to come, we wait for a longer period of time unnecessarily.
+				// But this prevents race conditions in processing the container.
+				ctx, cancel := context.WithTimeout(context.TODO(), time.Duration(container.StopTimeout())*time.Second)
+				defer cancel()
+				s := <-container.Wait(ctx, containerpkg.WaitConditionNotRunning)
+				if s.Err() != nil {
+					daemon.handleContainerExit(container, nil)
+				}
+			}()
 		} else {
 			return errors.Wrapf(err, "Cannot kill container %s", container.ID)
 		}