It is sufficient to check only whether the network is
available in the store to decide whether to retain the
stale sandbox. If the network is gone its endpoints are
not available either, so there is no point in retaining
the sandbox anyway. This fixes some extreme corner cases
where the daemon goes down right in the middle of a
sandbox cleanup.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
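A minimal Go sketch of the retain decision described above; the helper names and signatures are illustrative, not libnetwork's actual API:

```go
package cleanup

// retainStaleSandbox decides whether a stale sandbox checkpoint is
// worth keeping for a later cleanup attempt. networkInStore is a
// hypothetical lookup against the libnetwork store.
func retainStaleSandbox(endpointNetworkIDs []string, networkInStore func(id string) bool) bool {
	for _, nid := range endpointNetworkIDs {
		// A network still present in the store means its endpoints
		// may still exist: retain the sandbox for another pass.
		if networkInStore(nid) {
			return true
		}
	}
	// No network in the store implies no endpoints either, so there
	// is no point in retaining the sandbox.
	return false
}
```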
If the endpoint and its corresponding network are not
persistent, skip adding the endpoint to the sandbox
store.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
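A sketch of that persistence filter under the same caveat: the `Endpoint` fields here are stand-ins for however the real types expose their persistence.

```go
package sandbox

// Endpoint is a pared-down stand-in for libnetwork's endpoint type.
type Endpoint struct {
	persistent        bool // should the endpoint itself be stored?
	networkPersistent bool // should its network be stored?
}

// endpointsToStore keeps only endpoints whose own state and whose
// network's state are meant to be persisted, so that only those are
// checkpointed with the sandbox.
func endpointsToStore(eps []Endpoint) []Endpoint {
	var out []Endpoint
	for _, ep := range eps {
		if !ep.persistent || !ep.networkPersistent {
			continue // nothing to restore for this endpoint
		}
		out = append(out, ep)
	}
	return out
}
```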
At times, when a checkpointed sandbox from the store
cannot be cleaned up properly, we still retain the
sandbox both in the store and in memory. But this stored
sandbox may not contain important configuration
information from docker. So when docker requests a new
sandbox, instead of using the stored one as is, reconcile
the sandbox state from the store with the configuration
information provided by docker. To do this, mark the
sandbox from the store as a stub and never reveal it to
external searches. When docker requests a new sandbox,
update the stub sandbox and clear the stub flag.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
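A condensed sketch of the stub mechanics, with hypothetical types standing in for libnetwork's controller and sandbox:

```go
package sandbox

import "sync"

// Config stands in for the configuration docker supplies with a
// sandbox request (hostname, DNS, port mappings, ...).
type Config struct{ Hostname string }

type Sandbox struct {
	id     string
	isStub bool
	config Config
}

type controller struct {
	mu        sync.Mutex
	sandboxes map[string]*Sandbox
}

func newController() *controller {
	return &controller{sandboxes: make(map[string]*Sandbox)}
}

// getSandbox hides stub sandboxes from external searches.
func (c *controller) getSandbox(id string) *Sandbox {
	c.mu.Lock()
	defer c.mu.Unlock()
	if sb, ok := c.sandboxes[id]; ok && !sb.isStub {
		return sb
	}
	return nil
}

// newSandbox reconciles a fresh request with a leftover stub: the
// checkpointed state is updated with docker's configuration and the
// stub flag is cleared, revealing the sandbox from then on.
func (c *controller) newSandbox(id string, cfg Config) *Sandbox {
	c.mu.Lock()
	defer c.mu.Unlock()
	if sb, ok := c.sandboxes[id]; ok && sb.isStub {
		sb.config = cfg
		sb.isStub = false
		return sb
	}
	sb := &Sandbox{id: id, config: cfg}
	c.sandboxes[id] = sb
	return sb
}
```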
There is a race in the os sandbox sharing code: two
containers sharing the os sandbox may both try to
recreate it, which can end up destroying and recreating
the os sandbox. Since os sandbox sharing happens only for
the default sandbox, refactored the code to create the os
sandbox inside a `sync.Once` so that creation happens
exactly once and the sandbox is reused by the other
containers. Also disabled deleting this os sandbox.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
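The `sync.Once` pattern the message refers to, reduced to a sketch (the netns path and names are illustrative):

```go
package osl

import "sync"

// Namespace is a pared-down stand-in for the os sandbox type.
type Namespace struct{ path string }

var (
	defaultOnce sync.Once
	defaultNs   *Namespace
)

// getDefaultNamespace returns the os sandbox shared by all containers
// in the default (host) sandbox. sync.Once guarantees it is created
// exactly once however many container starts race here, and no
// delete path is exposed for it.
func getDefaultNamespace() *Namespace {
	defaultOnce.Do(func() {
		defaultNs = &Namespace{path: "/var/run/docker/netns/default"}
	})
	return defaultNs
}
```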
Since we share the host sandbox with many containers, we
need to serialize creation of the sandbox. Otherwise
container starts may see the namespace path in an
inconsistent state.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
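A sketch of serialized creation using a mutex; `exists` and `create` are hypothetical helpers, not the real osl API:

```go
package osl

import "sync"

var createMu sync.Mutex

// ensureNamespacePath serializes the check-and-create of a shared
// namespace path. Without the lock, two concurrent container starts
// could interleave exists and create, one of them observing the path
// in a half-initialized state.
func ensureNamespacePath(path string, exists func(string) bool, create func(string) error) error {
	createMu.Lock()
	defer createMu.Unlock()
	if exists(path) {
		return nil
	}
	return create(path)
}
```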
For ungraceful daemon restarts, libnetwork has sandbox
cleanup logic to remove any stale and dangling resources.
But if the store is down during the daemon restart, the
cleanup logic cannot perform a complete cleanup, and in
such cases the sandbox used to be removed anyway. With
this fix, we retain the sandbox if the store is down and
the endpoint couldn't be cleaned. When the container is
later restarted in the docker daemon, we will perform a
sandbox cleanup, and that will complete the cleanup
round.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
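A sketch of the retain-on-failure flow, assuming a hypothetical store-down error and cleanup hooks:

```go
package cleanup

import "errors"

// errStoreDown stands in for the store-unavailable error the real
// cleanup path would see.
var errStoreDown = errors.New("store unavailable")

// cleanupSandbox attempts a full cleanup of a stale sandbox. If the
// endpoints could not be cleaned because the store is down, the
// sandbox is retained; the round completes when the container is
// restarted later.
func cleanupSandbox(id string, cleanEndpoints func(string) error, removeSandbox func(string)) {
	if err := cleanEndpoints(id); errors.Is(err, errStoreDown) {
		// Retain the sandbox so the next restart can finish the job.
		return
	}
	removeSandbox(id)
}
```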
Added IT cases for the external connectivity check on
bridge and overlay networks, both initially and after a
restart.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Reconcile persistent state only after configuring the
driver. Otherwise the networks will not be initialized
properly with respect to certain driver config settings,
such as enabling iptables.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
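A sketch of the ordering this enforces, with a pared-down driver interface that is not libnetwork's actual one:

```go
package config

// Driver is a stand-in for a network driver that accepts
// configuration such as whether to program iptables.
type Driver interface {
	Config(opts map[string]interface{}) error
}

// initNetworking shows the ordering the fix enforces: configure the
// driver first, then reconcile networks from the persistent store,
// so restored networks come up with the right driver settings.
// restoreNetworks is a hypothetical hook.
func initNetworking(d Driver, opts map[string]interface{}, restoreNetworks func() error) error {
	if err := d.Config(opts); err != nil {
		return err
	}
	return restoreNetworks()
}
```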
Added an IT case checking proper /etc/hosts handling in
the overlay network. This also verifies that no stale
entries are left behind in /etc/hosts.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
Clean up the service db for a network when the last
container on that host leaves the network. This is
because we stop watching the network after the last
container leaves, so if we kept the service db around it
would no longer be kept up to date with containers
joining and leaving on other hosts. The service db will
be repopulated properly when a container joins this
network at a later point in time.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
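A sketch of the last-container bookkeeping, with illustrative types:

```go
package service

import "sync"

type network struct {
	mu     sync.Mutex
	joined int               // local containers on this network
	svcDB  map[string]string // name -> IP learned via the watch
}

// leave is called when a local container leaves the network. After
// the last local container is gone we stop watching the network, so
// the service db could only go stale; drop it and let it be rebuilt
// when a container joins again later. stopWatch is a hypothetical
// hook.
func (n *network) leave(stopWatch func()) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.joined--
	if n.joined == 0 {
		stopWatch()
		n.svcDB = nil // repopulated on the next join
	}
}
```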
- Currently, when a sandbox disconnects from a network,
the network's service records are not removed from the
sandbox's /etc/hosts file.
Signed-off-by: Alessandro Boch <aboch@docker.com>
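A sketch of the record removal the fix implies; the `Record` type and map layout are illustrative:

```go
package hosts

// Record is an illustrative /etc/hosts entry.
type Record struct {
	Name string
	IP   string
}

// removeNetworkRecords mirrors the fix: on disconnect from a network,
// every service record belonging to that network is dropped from the
// sandbox's /etc/hosts entries.
func removeNetworkRecords(entries, networkRecords []Record) []Record {
	drop := make(map[Record]bool, len(networkRecords))
	for _, r := range networkRecords {
		drop[r] = true
	}
	out := make([]Record, 0, len(entries))
	for _, r := range entries {
		if !drop[r] {
			out = append(out, r)
		}
	}
	return out
}
```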
The overlay driver allows local containers to communicate
on an overlay network even when serf is not fully
initialized. But when a container leaves an overlay
network while serf is not fully initialized, it gets
stuck waiting on a nil notifyCh.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
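The hang follows from a Go property worth spelling out: a receive from a nil channel blocks forever. A sketch of the guard, with hypothetical names:

```go
package overlay

// waitLeaveDone illustrates the fix: in Go a receive from a nil
// channel blocks forever, so waiting on notifyCh when serf never
// initialized it would hang the leave path. Guard the wait instead.
func waitLeaveDone(notifyCh <-chan struct{}) {
	if notifyCh == nil {
		// serf was never fully initialized; nothing will ever be
		// sent, and <-notifyCh would block forever.
		return
	}
	<-notifyCh
}
```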
Currently the local containers of a global scope network
get their service records updated from both local and
global updates. There is no way to check whether an
endpoint is local when a remote update comes in via the
watch, because we add the endpoint to the local endpoint
list during join, while the remote update happens during
CreateEndpoint. The right thing to do is to update the
local endpoint list and start watching during
CreateEndpoint, and to remove the watch during endpoint
delete. But this might result in the container getting
its own record in its /etc/hosts. So added filtering
logic to filter out self records when updating the
container's /etc/hosts.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
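A sketch of the self-record filter, with an illustrative record type:

```go
package service

// svcRecord is an illustrative service record; comparing IPs stands
// in for however the real code identifies the container's own
// endpoint.
type svcRecord struct {
	Name string
	IP   string
}

// filterSelf drops the container's own record before /etc/hosts is
// updated, so a watch-driven global update cannot write a container's
// entry into its own hosts file.
func filterSelf(records []svcRecord, selfIP string) []svcRecord {
	out := make([]svcRecord, 0, len(records))
	for _, r := range records {
		if r.IP == selfIP {
			continue // self record: skip
		}
		out = append(out, r)
	}
	return out
}
```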