From 156123226119f14070ce0e675466d231fe19189c Mon Sep 17 00:00:00 2001
From: Solomon Hykes
Date: Sat, 4 May 2013 19:47:57 -0700
Subject: [PATCH] First draft of new README. Feedback and contributions
 welcome!

---
 README.md | 91 ++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 73 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index 13ec817e2b..7b1e18f390 100644
--- a/README.md
+++ b/README.md
@@ -1,37 +1,92 @@
-Docker: the Linux container runtime
-===================================
+Docker: the Linux container engine
+==================================
 
-Docker complements LXC with a high-level API which operates at the process level. It runs unix processes with strong guarantees of isolation and repeatability across servers.
+Docker is an open-source engine which automates the deployment of applications as highly portable, self-sufficient containers.
 
-Docker is a great building block for automating distributed systems: large-scale web deployments, database clusters, continuous deployment systems, private PaaS, service-oriented architectures, etc.
+Docker containers are both *hardware-agnostic* and *platform-agnostic*. This means that they can run anywhere, from your
+laptop to the largest EC2 compute instance and everything in between - and they don't require that you use a particular
+language, framework or packaging system. That makes them great building blocks for deploying and scaling web apps, databases
+and backend services without depending on a particular stack or provider.
 
-![Docker L](docs/sources/static_files/lego_docker.jpg "Docker")
+Docker is an open-source implementation of the deployment engine which powers [dotCloud](http://dotcloud.com), a popular Platform-as-a-Service.
+It benefits directly from the experience accumulated over several years of large-scale operation and support of hundreds of thousands
+of applications and databases.
-* *Heterogeneous payloads*: any combination of binaries, libraries, configuration files, scripts, virtualenvs, jars, gems, tarballs, you name it. No more juggling between domain-specific tools. Docker can deploy and run them all.
+## Better than VMs
 
-* *Any server*: docker can run on any x64 machine with a modern linux kernel - whether it's a laptop, a bare metal server or a VM. This makes it perfect for multi-cloud deployments.
+A common method for distributing applications and sandboxing their execution is to use virtual machines, or VMs. Typical VM formats
+are VMware's vmdk, Oracle VirtualBox's vdi, and Amazon EC2's ami. In theory these formats should allow every developer to
+automatically package their application into a "machine" for easy distribution and deployment. In practice, that almost never
+happens, for a few reasons:
 
-* *Isolation*: docker isolates processes from each other and from the underlying host, using lightweight containers.
+  * *Size*: VMs are very large, which makes them impractical to store and transfer.
+  * *Performance*: running VMs consumes significant CPU and memory, which makes them impractical in many scenarios, for example local development of multi-tier applications, and
+    large-scale deployment of cpu- and memory-intensive applications on large numbers of machines.
+  * *Portability*: competing VM environments don't play well with each other. Although conversion tools do exist, they are limited and add even more overhead.
+  * *Hardware-centric*: VMs were designed with machine operators in mind, not software developers. As a result, they offer very limited tooling for what developers need most:
+    building, testing and running their software. For example, VMs offer no facilities for application versioning, monitoring, configuration, logging or service discovery.
 
-* *Repeatability*: because containers are isolated in their own filesystem, they behave the same regardless of where, when, and alongside what they run.
+By contrast, Docker relies on a different sandboxing method known as *containerization*. Unlike traditional virtualization,
+containerization takes place at the kernel level. Most modern operating system kernels now support the primitives necessary
+for containerization, including Linux with [openvz](http://openvz.org), [vserver](http://linux-vserver.org) and more recently [lxc](http://lxc.sourceforge.net),
+Solaris with [zones](http://docs.oracle.com/cd/E26502_01/html/E29024/preface-1.html#scrolltoc) and FreeBSD with [Jails](http://www.freebsd.org/doc/handbook/jails.html).
+
+Docker builds on top of these low-level primitives to offer developers a portable format and runtime environment that solves
+all four problems. Docker containers are small (and their transfer can be optimized with layers), they have basically zero memory and cpu overhead,
+they are completely portable, and they are designed from the ground up to be application-centric.
+
+The best part: because docker operates at the OS level, it can still be run inside a VM!
+
+## Plays well with others
+
+Docker does not require that you buy into a particular programming language, framework, packaging system or configuration language.
+
+Is your application a unix process? Does it use files, tcp connections, environment variables, standard unix streams and command-line
+arguments as inputs and outputs? Then docker can run it.
+
+Can your application's build be expressed as a sequence of such commands? Then docker can build it.
 
-Notable features
------------------
+## Escape dependency hell
 
-* Filesystem isolation: each process container runs in a completely separate root filesystem.
+A common problem for developers is the difficulty of managing all their application's dependencies in a simple and automated way.
 
-* Resource isolation: system resources like cpu and memory can be allocated differently to each process container, using cgroups.
+This is usually difficult for several reasons:
 
+  * *Cross-platform dependencies*. Modern applications often depend on a combination of system libraries and binaries, language-specific packages, framework-specific modules,
+    internal components developed for another project, etc. These dependencies live in different "worlds" and require different tools - these tools typically don't work
+    well with each other, requiring awkward custom integrations.
 
-* Network isolation: each process container runs in its own network namespace, with a virtual interface and IP address of its own.
+  * *Conflicting dependencies*. Different applications may depend on different versions of the same dependency. Packaging tools handle these situations with various degrees of ease -
+    but they all handle them in different and incompatible ways, which again forces the developer to do extra work.
 
-* Copy-on-write: root filesystems are created using copy-on-write, which makes deployment extremely fast, memory-cheap and disk-cheap.
+  * *Custom dependencies*. A developer may need to prepare a custom version of one of their application's dependencies. Some packaging systems can handle custom versions of a dependency,
+    others can't - and all of them handle it differently.
 
-* Logging: the standard streams (stdout/stderr/stdin) of each process container are collected and logged for real-time or batch retrieval.
-* Change management: changes to a container's filesystem can be committed into a new image and re-used to create more containers. No templating or manual configuration required.
+Docker solves dependency hell by giving the developer a simple way to express *all* their application's dependencies in one place,
+and streamlining the process of assembling them. If this makes you think of [XKCD 927](http://xkcd.com/927/), don't worry. Docker doesn't
+*replace* your favorite packaging systems. It simply orchestrates their use in a simple and repeatable way. How does it do that? With layers.
+
+Docker defines a build as running a sequence of unix commands, one after the other, in the same container. The first build command modifies the contents of the container
+(usually by installing new files on the filesystem), the next command modifies it some more, and so on. Since each build command inherits the result of the previous
+commands, the *order* in which the commands are executed expresses *dependencies*.
+
+Here's a typical docker build process:
+
+```bash
+from ubuntu:12.10
+run apt-get update
+run apt-get install python
+run apt-get install python-pip
+run pip install django
+run apt-get install curl
+run curl http://github.com/shykes/helloflask/helloflask/master.tar.gz | tar -zxv
+run cd master && pip install -r requirements.txt
+```
+
+Note that Docker doesn't care *how* dependencies are built - as long as they can be built by running a unix command in a container.
 
-* Interactive shell: docker can allocate a pseudo-tty and attach to the standard input of any container, for example to run a throwaway interactive shell.
 
 Install instructions
 ==================