The first issue is an ordering problem: the sandbox-attached
version of the endpoint object should be pushed to the watch
database first, so that any other endpoint create which is in
progress can make use of it immediately to update its container's
hosts file. Only after that should the current container retrieve
the service records from the service database and update its own
hosts file. With the previous order there was a small time window
when another endpoint create would find this endpoint without its
sandbox context, while the service record population from the
service database had already happened, so that container would
completely fail to populate the service record of the newly
created endpoint.
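A minimal sketch of the corrected ordering, with invented helper
names (push, readSvcDB, updateHosts) standing in for the actual
libnetwork calls:

```go
package sketch

// Minimal stand-in types; the real libnetwork structures differ.
type endpoint struct{ name string }

type watchDB struct{ eps []*endpoint }

func (db *watchDB) push(ep *endpoint) { db.eps = append(db.eps, ep) }

// joinOrder sketches the corrected sequence: publish the
// sandbox-attached endpoint to the watch database first, so a
// concurrent endpoint create can pick it up immediately; only
// then read the service database and update this container's
// own /etc/hosts file.
func joinOrder(db *watchDB, ep *endpoint, readSvcDB func() []string,
	updateHosts func([]string) error) error {
	db.push(ep) // step 1: visible to concurrent endpoint creates
	recs := readSvcDB()
	return updateHosts(recs) // step 2: populate our own hosts file
}
```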
The second issue is that the /etc/hosts file was being rebuilt
from scratch during endpoint join, and this may sometimes happen
after the service record for another endpoint has already been
added to the container's file. Obviously this rebuild wipes out
the service record which was just added. Removed the rebuilding
of the /etc/hosts file during endpoint join; the initial
population of the /etc/hosts file should only happen at sandbox
creation time. During endpoint join, the backward-compatible self
ip -> hostname entry is now added as just another record.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
It is sufficient to check only whether the network is available
in the store to decide whether to retain the stale sandbox. If
the endpoints are not available, there is no point in retaining
the sandbox anyway. This fixes some extreme corner cases where
the daemon goes down right in the middle of sandbox cleanup.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
At times, when a checkpointed sandbox from the store cannot be
cleaned up properly, we still retain the sandbox both in the
store and in memory. But this stored sandbox may not contain
important configuration information from docker. So when docker
requests a new sandbox, instead of using the stored one as is,
reconcile the sandbox state from the store with the configuration
information provided by docker. To do this, mark the sandbox from
the store as a stub and never reveal it to external searches.
When docker requests a new sandbox, update the stub sandbox and
clear the stub flag.
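A rough sketch of the stub handling, with invented field and
method names (isStub, lookupSandbox, newSandbox); the real
controller code differs:

```go
package sketch

// Sketch only: names are illustrative.
type sandbox struct {
	id     string
	isStub bool // restored from store without docker's config
	config map[string]string
}

var sandboxes = map[string]*sandbox{}

// lookupSandbox hides stub sandboxes from external searches.
func lookupSandbox(id string) *sandbox {
	if sb, ok := sandboxes[id]; ok && !sb.isStub {
		return sb
	}
	return nil
}

// newSandbox reconciles a stored stub, if any, with the fresh
// configuration docker provides, then clears the stub flag.
func newSandbox(id string, cfg map[string]string) *sandbox {
	sb, ok := sandboxes[id]
	if !ok {
		sb = &sandbox{id: id}
		sandboxes[id] = sb
	}
	sb.config = cfg // apply docker-provided configuration
	sb.isStub = false
	return sb
}
```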
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
There is a race in the os sandbox sharing code where two
containers which are sharing the os sandbox each try to recreate
it, which might result in the os sandbox being destroyed and
recreated. Since os sandbox sharing happens only for the default
sandbox, refactored the code to create the os sandbox inside a
`sync.Once` so that creation happens exactly once and the sandbox
gets reused by other containers. Also disabled deletion of this
os sandbox.
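The guard is the standard `sync.Once` pattern; a minimal sketch,
with oslSandbox standing in for the real osl type:

```go
package sketch

import "sync"

type oslSandbox struct{ key string }

var (
	defaultOnce    sync.Once
	defaultSandbox *oslSandbox
)

// getDefaultOSSandbox creates the default os sandbox exactly
// once; every later caller reuses the same instance, and it is
// never deleted.
func getDefaultOSSandbox() *oslSandbox {
	defaultOnce.Do(func() {
		defaultSandbox = &oslSandbox{key: "default"}
	})
	return defaultSandbox
}
```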
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
For ungraceful daemon restarts, libnetwork has sandbox cleanup
logic to remove any stale and dangling resources. But if the
store is down during the daemon restart, the cleanup logic cannot
perform a complete cleanup; previously, the sandbox was removed
anyway in such cases. With this fix, we retain the sandbox if the
store is down and the endpoint couldn't be cleaned. When the
container is later restarted in the docker daemon, we will
perform a sandbox cleanup, which completes the cleanup round.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Introduced a path-level lock to synchronize writes to /etc/hosts.
A path-level cache is maintained so that synchronization happens
only at the file level.
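A sketch of the idea: a cache of per-path mutexes, so writers to
the same hosts file serialize while writes to different files
proceed in parallel. Names here are illustrative:

```go
package sketch

import "sync"

var (
	cacheMu   sync.Mutex                 // guards the cache itself
	pathLocks = map[string]*sync.Mutex{} // one lock per file path
)

func lockForPath(path string) *sync.Mutex {
	cacheMu.Lock()
	defer cacheMu.Unlock()
	mu, ok := pathLocks[path]
	if !ok {
		mu = &sync.Mutex{}
		pathLocks[path] = mu
	}
	return mu
}

// updateFile serializes updates at the file level only.
func updateFile(path string, update func() error) error {
	mu := lockForPath(path)
	mu.Lock()
	defer mu.Unlock()
	return update()
}
```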
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
When the daemon has a lot of containers, the 15 seconds it gives
for stopping all of them is not enough, so the daemon forces a
shutdown when that window expires. Hence, with a lot of
containers, even gracefully bringing down the daemon results in
many containers not being fully brought down.
In addition, the daemon force-killing itself can happen at any
arbitrary point in time, which results in inconsistent
checkpointed state for the sandbox. This makes the cleanup fail
when we come back up, and in many cases this inability to clean
up properly on restart leaves the daemon unable to restart at
all, because we are not able to delete the default network.
This commit ensures that the sandbox state stored on disk is
never inconsistent, so that when we come back up we will always
be able to clean up the sandbox state.
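One common way to keep on-disk state consistent under an abrupt
kill is to write to a temporary file and atomically rename it
into place; a sketch of that pattern (the commit's actual
mechanism may differ):

```go
package sketch

import (
	"os"
	"path/filepath"
)

// writeAtomic writes data so that a crash mid-write never leaves
// a torn state file: readers see either the old contents or the
// new, never a partial mix.
func writeAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".state-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // no-op after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush before the rename
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path) // atomic on POSIX systems
}
```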
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently, when a container has a restart policy and gets
restarted, docker does not release its networking and allocate it
back. Instead, it presents libnetwork with a new sandbox while
all the network resources are still locked in the old sandbox.
This commit moves all the network resources from the old sandbox
to the new one when libnetwork is presented with the new sandbox.
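A rough sketch of the transfer with invented types; the real code
moves the endpoints and their state between sandbox objects:

```go
package sketch

type endpoint struct{ name string }

type sandbox struct {
	id        string
	endpoints []*endpoint
}

// transferResources moves every endpoint held by the old sandbox
// into the new sandbox docker presented, so the restarted
// container keeps the network resources it already owned.
func transferResources(oldSb, newSb *sandbox) {
	newSb.endpoints = append(newSb.endpoints, oldSb.endpoints...)
	oldSb.endpoints = nil // the old sandbox no longer owns them
}
```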
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently, when docker exits ungracefully, it may leave dangling
sandboxes which hold onto precious network resources. Added
checkpoint state for sandboxes, which on bootup is used to clean
up the sandboxes and their network resources.
On bootup, the remaining dangling state in the checkpoint is read
and cleaned up before any new network allocation requests are
accepted.
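A sketch of the bootup sequence, assuming a hypothetical store
interface; the point is the ordering, cleanup before serving any
new requests:

```go
package sketch

type store interface {
	danglingSandboxes() []string // ids checkpointed before the crash
	remove(id string) error
}

// cleanupOnBoot reads the checkpointed sandbox state left behind
// by an ungraceful exit and releases it before the controller
// accepts any new network allocation requests.
func cleanupOnBoot(s store, release func(id string) error) error {
	for _, id := range s.danglingSandboxes() {
		if err := release(id); err != nil {
			return err // keep the checkpoint; a later boot retries
		}
		if err := s.remove(id); err != nil {
			return err
		}
	}
	return nil
}
```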
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Always-on watching of networks and endpoints can affect the
scalability of the cluster beyond a few nodes. Remove proactive
watching and watch only the objects you are interested in.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Exposing the osl package outside libnetwork is not necessary, and
InterfaceStatistics belongs in the types package anyway.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Currently, on every endpoint Join the /etc/hosts file gets
overwritten, which blows away the already existing service
records. Modify the `updateHostsFile` function to build the hosts
file only on the first endpoint join; for subsequent joins, just
update the existing /etc/hosts file with the additional
network-specific service records.
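A sketch of the intended `updateHostsFile` behavior (the flag and
callbacks are invented): build the file once, then only append
the network-specific records:

```go
package sketch

type sandbox struct {
	hostsBuilt bool // set after the first endpoint join
}

// updateHostsFile builds /etc/hosts only on the first endpoint
// join; later joins append their network's service records, so
// records added earlier are never blown away.
func (sb *sandbox) updateHostsFile(build, appendRecords func() error) error {
	if !sb.hostsBuilt {
		if err := build(); err != nil {
			return err
		}
		sb.hostsBuilt = true
		return nil
	}
	return appendRecords()
}
```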
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Currently the endpoint data model consists of multiple interfaces
per endpoint. This seems to be overkill, since there is no real
use case for it. Removing it to take unnecessary complexity out
of the code.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
- Convenience API which detaches the sandbox from all endpoints,
  resets and reapplies the config options, sets up the discovery
  files, and reattaches to the endpoints, with no change to the
  osl sandbox in use (sketched below).
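A rough sketch of such a refresh (method names invented, error
handling elided):

```go
package sketch

type endpoint struct{ name string }

type sandbox struct {
	endpoints []*endpoint
	options   []string
}

func (sb *sandbox) detach(ep *endpoint)   {} // leave ep; osl untouched
func (sb *sandbox) attach(ep *endpoint)   {} // rejoin ep
func (sb *sandbox) setupDiscovery() error { return nil } // hosts, resolv.conf

// refresh detaches from all endpoints, resets and reapplies the
// options, regenerates the discovery files, then reattaches; the
// osl sandbox in use is left as is.
func (sb *sandbox) refresh(opts []string) error {
	eps := sb.endpoints
	for _, ep := range eps {
		sb.detach(ep)
	}
	sb.options = append(sb.options[:0], opts...) // reset + reapply
	if err := sb.setupDiscovery(); err != nil {
		return err
	}
	for _, ep := range eps {
		sb.attach(ep)
	}
	return nil
}
```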
Signed-off-by: Alessandro Boch <aboch@docker.com>
Three issues:
- container resolv.conf was getting regenerated even when no dns
  configs were passed
- updateHosts should be skipped for host networking mode
- incorrect check on dnsOptions
Signed-off-by: Alessandro Boch <aboch@docker.com>
- Maps 1 to 1 with the container's networking stack
- It holds the container-specific nw options which were
  previously incorrectly owned by Endpoint.
- Sandbox creation is no longer coupled with Endpoint Join;
  sandbox and endpoint now have separate lifecycles.
- LeaveAll is naturally replaced by Sandbox.Delete
- Some pkg and file renaming in order to have a clear mapping
  between structure name and entity ("sandbox")
- Revisited hosts and resolv.conf handling
- Removed from the JoinInfo interface the capability of setting
  hosts and resolv.conf paths
- Changed etchosts.Build() to first write the search domains and
  then the nameservers (see the sketch below)
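For the last item, a minimal sketch of a builder that writes the
search line before the nameserver lines; the signature is
illustrative and does not match the real Build():

```go
package sketch

import (
	"fmt"
	"os"
	"strings"
)

// buildNameFile writes the search domains first and the
// nameservers after, matching the ordering change described
// above.
func buildNameFile(path string, searchDomains, nameservers []string) error {
	var b strings.Builder
	if len(searchDomains) > 0 {
		fmt.Fprintf(&b, "search %s\n", strings.Join(searchDomains, " "))
	}
	for _, ns := range nameservers {
		fmt.Fprintf(&b, "nameserver %s\n", ns)
	}
	return os.WriteFile(path, []byte(b.String()), 0644)
}
```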
Signed-off-by: Alessandro Boch <aboch@docker.com>