Commit Graph

20 Commits

Author SHA1 Message Date
Thomas Leonard b6c7becbfe
Add support for user-defined healthchecks
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).

When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.

Signed-off-by: Thomas Leonard <thomas.leonard@docker.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2016-06-02 23:58:34 +02:00
David Calavera a793564b25 Remove static errors from errors package.
Moving all strings to the errors package wasn't a good idea after all.

Our custom implementation of Go errors predates everything that's nice
and good about working with errors in Go. Take as an example what we
have to do to get an error message:

```go
func GetErrorMessage(err error) string {
	switch err.(type) {
	case errcode.Error:
		e, _ := err.(errcode.Error)
		return e.Message

	case errcode.ErrorCode:
		ec, _ := err.(errcode.ErrorCode)
		return ec.Message()

	default:
		return err.Error()
	}
}
```

This goes against every good practice for Go development. The language already provides a simple, intuitive and standard way to get error messages, that is calling the `Error()` method from an error. Reinventing the error interface is a mistake.

Our custom implementation also makes very hard to reason about errors, another nice thing about Go. I found several (>10) error declarations that we don't use anywhere. This is a clear sign about how little we know about the errors we return. I also found several error usages where the number of arguments was different than the parameters declared in the error, another clear example of how difficult is to reason about errors.

Moreover, our custom implementation didn't really make easier for people to return custom HTTP status code depending on the errors. Again, it's hard to reason about when to set custom codes and how. Take an example what we have to do to extract the message and status code from an error before returning a response from the API:

```go
	switch err.(type) {
	case errcode.ErrorCode:
		daError, _ := err.(errcode.ErrorCode)
		statusCode = daError.Descriptor().HTTPStatusCode
		errMsg = daError.Message()

	case errcode.Error:
		// For reference, if you're looking for a particular error
		// then you can do something like :
		//   import ( derr "github.com/docker/docker/errors" )
		//   if daError.ErrorCode() == derr.ErrorCodeNoSuchContainer { ... }

		daError, _ := err.(errcode.Error)
		statusCode = daError.ErrorCode().Descriptor().HTTPStatusCode
		errMsg = daError.Message

	default:
		// This part of will be removed once we've
		// converted everything over to use the errcode package

		// FIXME: this is brittle and should not be necessary.
		// If we need to differentiate between different possible error types,
		// we should create appropriate error types with clearly defined meaning
		errStr := strings.ToLower(err.Error())
		for keyword, status := range map[string]int{
			"not found":             http.StatusNotFound,
			"no such":               http.StatusNotFound,
			"bad parameter":         http.StatusBadRequest,
			"conflict":              http.StatusConflict,
			"impossible":            http.StatusNotAcceptable,
			"wrong login/password":  http.StatusUnauthorized,
			"hasn't been activated": http.StatusForbidden,
		} {
			if strings.Contains(errStr, keyword) {
				statusCode = status
				break
			}
		}
	}
```

You can notice two things in that code:

1. We have to explain how errors work, because our implementation goes against how easy to use Go errors are.
2. At no moment we arrived to remove that `switch` statement that was the original reason to use our custom implementation.

This change removes all our status errors from the errors package and puts them back in their specific contexts.
IT puts the messages back with their contexts. That way, we know right away when errors used and how to generate their messages.
It uses custom interfaces to reason about errors. Errors that need to response with a custom status code MUST implementent this simple interface:

```go
type errorWithStatus interface {
	HTTPErrorStatusCode() int
}
```

This interface is very straightforward to implement. It also preserves Go errors real behavior, getting the message is as simple as using the `Error()` method.

I included helper functions to generate errors that use custom status code in `errors/errors.go`.

By doing this, we remove the hard dependency we have eeverywhere to our custom errors package. Yes, you can use it as a helper to generate error, but it's still very easy to generate errors without it.

Please, read this fantastic blog post about errors in Go: http://dave.cheney.net/2014/12/24/inspecting-errors

Signed-off-by: David Calavera <david.calavera@gmail.com>
2016-02-26 15:49:09 -05:00
Zhang Wei 894266c1bb Remove redundant error message
Currently some commands including `kill`, `pause`, `restart`, `rm`,
`rmi`, `stop`, `unpause`, `udpate`, `wait` will print a lot of error
message on client side, with a lot of redundant messages, this commit is
trying to remove the unuseful and redundant information for user.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2016-02-03 15:45:20 +08:00
Lei Jitang 6716a3a167 Correct the info message when stop container
Signed-off-by: Lei Jitang <leijitang@huawei.com>
2016-01-27 03:06:45 -05:00
David Calavera d7d512bb92 Rename `Daemon.Get` to `Daemon.GetContainer`.
This is more aligned with `Daemon.GetImage` and less confusing.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-12-11 12:39:28 -05:00
David Calavera 6bb0d1816a Move Container to its own package.
So other packages don't need to import the daemon package when they
want to use this struct.

Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
2015-12-03 17:39:49 +01:00
David Calavera ca5ede2d0a Decouple daemon and container to log events.
Create a supervisor interface to let the container monitor to emit events.

Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-11-04 12:27:48 -05:00
David Calavera 4f2a5ba360 Decouple daemon and container to stop and kill containers.
Signed-off-by: David Calavera <david.calavera@gmail.com>
2015-11-04 12:27:47 -05:00
Tibor Vass b08f071e18 Revert "Merge pull request #16228 from duglin/ContextualizeEvents"
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.

Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).

This reverts commit de41640435, reversing
changes made to 7daeecd42d.

Signed-off-by: Tibor Vass <tibor@docker.com>

Conflicts:
	api/server/container.go
	builder/internals.go
	daemon/container_unix.go
	daemon/create.go
2015-09-29 14:26:51 -04:00
Doug Davis 26b1064967 Add context.RequestID to event stream
This PR adds a "request ID" to each event generated, the 'docker events'
stream now looks like this:

```
2015-09-10T15:02:50.000000000-07:00 [reqid: c01e3534ddca] de7c5d4ca927253cf4e978ee9c4545161e406e9b5a14617efb52c658b249174a: (from ubuntu) create
```
Note the `[reqID: c01e3534ddca]` part, that's new.

Each HTTP request will generate its own unique ID. So, if you do a
`docker build` you'll see a series of events all with the same reqID.
This allow for log processing tools to determine which events are all related
to the same http request.

I didn't propigate the context to all possible funcs in the daemon,
I decided to just do the ones that needed it in order to get the reqID
into the events. I'd like to have people review this direction first, and
if we're ok with it then I'll make sure we're consistent about when
we pass around the context - IOW, make sure that all funcs at the same level
have a context passed in even if they don't call the log funcs - this will
ensure we're consistent w/o passing it around for all calls unnecessarily.

ping @icecrime @calavera @crosbymichael

Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-24 11:56:37 -07:00
Doug Davis a283a30fb0 Move api/errors/ to errors/
Per @calavera's suggestion: https://github.com/docker/docker/pull/16355#issuecomment-141139220

Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-17 11:54:14 -07:00
Doug Davis f7d4b4fe2b Convert some "daemon" static error strings to the new errocode package format
Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-09-16 16:16:42 -07:00
Morgan Bauer abd72d4008
golint fixes for daemon/ package
- some method names were changed to have a 'Locking' suffix, as the
 downcased versions already existed, and the existing functions simply
 had locks around the already downcased version.
 - deleting unused functions
 - package comment
 - magic numbers replaced by golang constants
 - comments all over

Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
2015-08-27 22:07:42 -07:00
Doug Davis 8232312c1e Cleanup container LogEvent calls
Move some calls to container.LogEvent down lower so that there's
less of a chance of them being missed. Also add a few more events
that appear to have been missed.

Added testcases for new events: commit, copy, resize, attach, rename, top

Signed-off-by: Doug Davis <dug@us.ibm.com>
2015-06-01 12:39:28 -07:00
Antonio Murdaca 04cc6c6aa4 Remove job from stop
Signed-off-by: Antonio Murdaca <me@runcom.ninja>
2015-04-12 00:41:16 +02:00
Antonio Murdaca c79b9bab54 Remove engine.Status and replace it with standard go error
Signed-off-by: Antonio Murdaca <me@runcom.ninja>
2015-03-25 22:32:08 +01:00
Andrew C. Bodine d25a65375c Closes #9311 Handles container id/name collisions against daemon functionalities according to #8069
Signed-off-by: Andrew C. Bodine <acbodine@us.ibm.com>
2015-01-21 17:11:31 -08:00
Alexandr Morozov e0339d4b88
Use State as embedded to Container
Signed-off-by: Alexandr Morozov <lk4d4math@gmail.com>
2014-09-03 00:01:11 +04:00
Alexandr Morozov 8d056423f8 Separate events subsystem
* Events subsystem merged from `server/events.go` and
  `utils/jsonmessagepublisher.go` and moved to `events/events.go`
* Only public interface for this subsystem is engine jobs
* There is two new engine jobs - `log_event` and `subscribers_count`
* There is auxiliary function `container.LogEvent` for logging events for
  containers

Docker-DCO-1.1-Signed-off-by: Alexandr Morozov <lk4d4math@gmail.com> (github: LK4D4)
[solomon@docker.com: resolve merge conflicts]
Signed-off-by: Solomon Hykes <solomon@docker.com>
2014-08-06 10:08:19 +00:00
Solomon Hykes 03c07617cc Move "stop" to daemon/stop.go
This is part of an effort to break apart the deprecated server/ package

Docker-DCO-1.1-Signed-off-by: Solomon Hykes <solomon@docker.com> (github: shykes)
2014-08-01 14:16:59 -04:00