When daemon.ContainerLogs() is called with options.follow=true
(as in "docker logs --follow"), the "loggerutils.followLogs()"
function never returns (even then the logs consumer is gone).
As a result, all the resources associated with it (including
an opened file descriptor for the log file being read, two FDs
for a pipe, and two FDs for inotify watch) are never released.
If this is repeated (such as by running "docker logs --follow"
and pressing Ctrl-C a few times), this results in DoS caused by
either hitting the limit of inotify watches, or the limit of
opened files. The only cure is daemon restart.
Apparently, what happens is:
1. logs producer (a container) is gone, calling (*LogWatcher).Close()
for all its readers (daemon/logger/jsonfilelog/jsonfilelog.go:175).
2. WatchClose() is properly handled by a dedicated goroutine in
followLogs(), cancelling the context.
3. Upon receiving the ctx.Done(), the code in followLogs()
(daemon/logger/loggerutils/logfile.go#L626-L638) keeps to
send messages _synchronously_ (which is OK for now).
4. Logs consumer is gone (Ctrl-C is pressed on a terminal running
"docker logs --follow"). Method (*LogWatcher).Close() is properly
called (see daemon/logs.go:114). Since it was called before and
due to to once.Do(), nothing happens (which is kinda good, as
otherwise it will panic on closing a closed channel).
5. A goroutine (see item 3 above) keeps sending log messages
synchronously to the logWatcher.Msg channel. Since the
channel reader is gone, the channel send operation blocks forever,
and resource cleanup set up in defer statements at the beginning
of followLogs() never happens.
Alas, the fix is somewhat complicated:
1. Distinguish between close from logs producer and logs consumer.
To that effect,
- yet another channel is added to LogWatcher();
- {Watch,}Close() are renamed to {Watch,}ProducerGone();
- {Watch,}ConsumerGone() are added;
*NOTE* that ProducerGone()/WatchProducerGone() pair is ONLY needed
in order to stop ConsumerLogs(follow=true) when a container is stopped;
otherwise we're not interested in it. In other words, we're only
using it in followLogs().
2. Code that was doing (logWatcher*).Close() is modified to either call
ProducerGone() or ConsumerGone(), depending on the context.
3. Code that was waiting for WatchClose() is modified to wait for
either ConsumerGone() or ProducerGone(), or both, depending on the
context.
4. followLogs() are modified accordingly:
- context cancellation is happening on WatchProducerGone(),
and once it's received the FileWatcher is closed and waitRead()
returns errDone on EOF (i.e. log rotation handling logic is disabled);
- due to this, code that was writing synchronously to logWatcher.Msg
can be and is removed as the code above it handles this case;
- function returns once ConsumerGone is received, freeing all the
resources -- this is the bugfix itself.
While at it,
1. Let's also remove the ctx usage to simplify the code a bit.
It was introduced by commit a69a59ffc7 ("Decouple removing the
fileWatcher from reading") in order to fix a bug. The bug was actually
a deadlock in fsnotify, and the fix was just a workaround. Since then
the fsnofify bug has been fixed, and a new fsnotify was vendored in.
For more details, please see
https://github.com/moby/moby/pull/27782#issuecomment-416794490
2. Since `(*filePoller).Close()` is fixed to remove all the files
being watched, there is no need to explicitly call
fileWatcher.Remove(name) anymore, so get rid of the extra code.
Should fix https://github.com/moby/moby/issues/37391
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Docker daemon has a 16K buffer for log messages. If a message length
exceeds 16K, it should be split by the logger and merged at the
endpoint.
This change adds `PartialLogMetaData` struct for enhanced partial support
- LastPartial (bool) : indicates if this is the last of all partials.
- ID (string) : unique 32 bit ID. ID is same across all partials.
- Ordinal (int starts at 1) : indicates the position of msg in the series of partials.
Also, the timestamps across partials in the same.
Signed-off-by: Anusha Ragunathan <anusha.ragunathan@docker.com>
Make the `*RotateFileWriter` specifically about writing
`logger.Message`'s, which is what it's used for.
This allows for future changes where the log writer can cache details
about log entries such as (e.g.) the timestamps included in a particular
log file, which can be used to optimize reads.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Log drivers may have an internal buffer size that can be accommodated
by the copier as it is more effective to buffer and send fewer though
larger messages that the log driver can consume.
This eliminates the need for Partial handling for drivers that do not
support the concept (ie: awslogs, which can only have events up to
service limits).
Signed-off-by: Jacob Vallejo <jakeev@amazon.com>
Use strongly typed errors to set HTTP status codes.
Error interfaces are defined in the api/errors package and errors
returned from controllers are checked against these interfaces.
Errors can be wraeped in a pkg/errors.Causer, as long as somewhere in the
line of causes one of the interfaces is implemented. The special error
interfaces take precedence over Causer, meaning if both Causer and one
of the new error interfaces are implemented, the Causer is not
traversed.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Having a map per log entry seemed heavier than necessary. These
attributes end up being sorted and serialized, so storing them in a map
doesn't add anything (there's no random access element). In SwarmKit,
they originate as a slice, so there's an unnecessary conversion to a map
and back.
This also fixes the sort comparator, which used to inefficiently split
the string on each comparison.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Logging plugins use the same HTTP interface as other plugins for basic
command operations meanwhile actual logging operations are handled (on
Unix) via a fifo.
The plugin interface looks like so:
```go
type loggingPlugin interface {
StartLogging(fifoPath string, loggingContext Context) error
StopLogging(fifoPath)
```
This means a plugin must implement `LoggingDriver.StartLogging` and
`LoggingDriver.StopLogging` endpoints and be able to consume the passed
in fifo.
Logs are sent via stream encoder to the fifo encoded with protobuf.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Refactor container logs system to make communicating log messages
internally much simpler. Move responsibility for marshalling log
messages into the REST server. Support TTY logs. Pave the way for fixing
the ambiguous bytestream format. Pave the way for fixing details.
Signed-off-by: Drew Erny <drew.erny@docker.com>
This reduces allocs and bytes used per log entry significantly as well
as some improvement to time per log operation.
Each log driver, however, must put messages back in the pool once they
are finished with the message.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This allows the user to set a logging mode to "blocking" (default), or
"non-blocking", which uses the ring buffer as a proxy to the real log
driver.
This allows a container to never be blocked on stdio at the cost of
dropping log messages.
Introduces 2 new log-opts that works for all drivers, `log-mode` and
`log-size`. `log-mode` takes a value of "blocking", or "non-blocking"
I chose not to implement this as a bool since it is difficult to
determine if the mode was set to false vs just not set... especially
difficult when merging the default daemon config with the container config.
`log-size` takes a size string, e.g. `2MB`, which sets the max size
of the ring buffer. When the max size is reached, it will start
dropping log messages.
```
BenchmarkRingLoggerThroughputNoReceiver-8 2000000000 36.2 ns/op 856.35 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputWithReceiverDelay0-8 300000000 156 ns/op 198.48 MB/s 32 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay1-8 2000000000 36.1 ns/op 857.80 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay10-8 1000000000 36.2 ns/op 856.53 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay50-8 2000000000 34.7 ns/op 894.65 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay100-8 2000000000 35.1 ns/op 883.91 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay300-8 1000000000 35.9 ns/op 863.90 MB/s 0 B/op 0 allocs/op
BenchmarkRingLoggerThroughputConsumeDelay500-8 2000000000 35.8 ns/op 866.88 MB/s 0 B/op 0 allocs/op
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
The `docker logs` command performed a
client-side check if the container's
logging driver was supported.
Now that we allow the client to connect
to both "older" and "newer" daemon versions,
this check is best done daemon-side.
This patch remove the check on the client
side, and leaves validation to the daemon,
which should be the source of truth.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This change updates how we handle long lines of output from the
container. The previous logic used a bufio reader to read entire lines
of output from the container through an intermediate BytesPipe, and that
allowed the container to cause dockerd to consume an unconstrained
amount of memory as it attempted to collect a whole line of output, by
outputting data without newlines.
To avoid that, we replace the bufio reader with our own buffering scheme
that handles log lines up to 16k in length, breaking up anything longer
than that into multiple chunks. If we can dispense with noting this
detail properly at the end of output, we can switch from using
ReadBytes() to using ReadLine() instead. We add a field ("Partial") to
the log message structure to flag when we pass data to the log driver
that did not end with a newline.
The Line member of Message structures that we pass to log drivers is now
a slice into data which can be overwritten between calls to the log
driver's Log() method, so drivers which batch up Messages before
processing them need to take additional care: we add a function
(logger.CopyMessage()) that can be used to create a deep copy of a
Message structure, and modify the awslogs driver to use it.
We update the jsonfile log driver to append a "\n" to the data that it
logs to disk only when the Partial flag is false (it previously did so
unconditionally), to make its "logs" output correctly reproduce the data
as we received it.
Likewise, we modify the journald log driver to add a data field with
value CONTAINER_PARTIAL_MESSAGE=true to entries when the Partial flag is
true, and update its "logs" reader to refrain from appending a "\n" to
the data that it retrieves if it does not see this field/value pair (it
also previously did this unconditionally).
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
Log drivers are instantiated on a per-container basis, and passed the
container ID (along with other information) when they're initialized.
Drivers that care about that value are caching the value that they're
passed when they're initialized and using it in favor of the value
contained in Message structures that are passed to them, so the field in
Messages is unused, so we remove it.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
The jsonlog logger currently allows specifying envs and labels that
should be propagated to the log message, however there has been no way
to read that back.
This adds a new API option to enable inserting these attrs back to the
log reader.
With timestamps, this looks like so:
```
92016-04-08T15:28:09.835913720Z foo=bar,hello=world hello
```
The extra attrs are comma separated before the log message but after
timestamps.
Without timestaps it looks like so:
```
foo=bar,hello=world hello
```
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- Move time json marshaling to the jsonlog package: this is a docker
internal hack that we should not promote as a library.
- Move Timestamp encoding/decoding functions to the API types: This is
only used there. It could be a standalone library but I don't this
it's worth having a separated repo for this. It could introduce more
complexity than it solves.
Signed-off-by: David Calavera <david.calavera@gmail.com>
After tailing a file, if the number of lines requested is > the number
of lines in the file, this would cause a json unmarshalling error to
occur when we later try to go follow the file.
So brute force set it to the end if any tailing occurred.
There is potential that there could be some missing log messages if logs
are being written very quickly, however I was not able to make this
happen even with `while true; do echo hello; done`, so this is probably
acceptable.
While testing this I also found a panic in LogWatcher.Close can be
called twice due to a race. Fix channel close to only close when there
has been no signal to the channel.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
- downcase and privatize exported variables that were unused
- make accurate an error message
- added package comments
- remove unused var ReadLogsNotSupported
- enable linter
- some spelling corrections
Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
- noplog driver pkg for '--log-driver=none' (null object pattern)
- centralized factory for log drivers (instead of case/switch)
- logging drivers registers themselves to factory upon import
(easy plug/unplug of drivers in daemon/logdrivers.go)
- daemon now doesn't start with an invalid log driver
- Name() method of loggers is actually now their cli names (made it useful)
- generalized Read() logic, made it unsupported except json-file (preserves
existing behavior)
Spotted some duplication code around processing of legacy json-file
format, didn't touch that and refactored in both places.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>