First draft of new README. Feedback and contributions welcome!

This commit is contained in:
Solomon Hykes 2013-05-04 19:47:57 -07:00
parent e392b7ee9b
commit 1561232261
1 changed file with 73 additions and 18 deletions

Docker: the Linux container engine
==================================

Docker is an open-source engine which automates the deployment of applications as highly portable, self-sufficient containers.

Docker containers are both *hardware-agnostic* and *platform-agnostic*. This means that they can run anywhere, from your
laptop to the largest EC2 compute instance and everything in between - and they don't require that you use a particular
language, framework or packaging system. That makes them great building blocks for deploying and scaling web apps, databases
and backend services without depending on a particular stack or provider.

![Docker L](docs/sources/static_files/lego_docker.jpg "Docker")

Docker is an open-source implementation of the deployment engine which powers [dotCloud](http://dotcloud.com), a popular Platform-as-a-Service.
It benefits directly from the experience accumulated over several years of large-scale operation and support of hundreds of thousands
of applications and databases.

## Better than VMs

A common method for distributing applications and sandboxing their execution is to use virtual machines, or VMs. Typical VM formats
are VMware's vmdk, Oracle VirtualBox's vdi, and Amazon EC2's ami. In theory these formats should allow every developer to
automatically package their application into a "machine" for easy distribution and deployment. In practice, that almost never
happens, for a few reasons:

* *Size*: VMs are very large, which makes them impractical to store and transfer.
* *Performance*: running VMs consumes significant CPU and memory, which makes them impractical in many scenarios - for example, local development of multi-tier applications, and
large-scale deployment of cpu and memory-intensive applications on large numbers of machines.
* *Portability*: competing VM environments don't play well with each other. Although conversion tools do exist, they are limited and add even more overhead.
* *Hardware-centric*: VMs were designed with machine operators in mind, not software developers. As a result, they offer very limited tooling for what developers need most:
building, testing and running their software. For example, VMs offer no facilities for application versioning, monitoring, configuration, logging or service discovery.

By contrast, Docker relies on a different sandboxing method known as *containerization*. Unlike traditional virtualization,
containerization takes place at the kernel level. Most modern operating system kernels now support the primitives necessary
for containerization, including Linux with [openvz](http://openvz.org), [vserver](http://linux-vserver.org) and more recently [lxc](http://lxc.sourceforge.net),
Solaris with [zones](http://docs.oracle.com/cd/E26502_01/html/E29024/preface-1.html#scrolltoc) and FreeBSD with [Jails](http://www.freebsd.org/doc/handbook/jails.html).

Docker builds on top of these low-level primitives to offer developers a portable format and runtime environment that solves
all 4 problems. Docker containers are small (and their transfer can be optimized with layers), they have basically zero memory and cpu overhead,
they are completely portable, and they are designed from the ground up with an application-centric design.

The best part: because docker operates at the OS level, it can still be run inside a VM!

## Plays well with others

Docker does not require that you buy into a particular programming language, framework, packaging system or configuration language.

Is your application a unix process? Does it use files, tcp connections, environment variables, standard unix streams and command-line
arguments as inputs and outputs? Then docker can run it.

Can your application's build be expressed as a sequence of such commands? Then docker can build it.

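As a minimal sketch of what that looks like in practice (the image name and flags here are illustrative, and the exact CLI may differ slightly between early docker versions):

```bash
# Fetch a base image to run processes in (ubuntu is an assumption;
# any image with the binaries your process needs will do)
docker pull ubuntu

# Run a plain unix process inside an isolated container
docker run ubuntu /bin/echo hello world

# Allocate a pseudo-tty and attach to a throwaway interactive shell
docker run -i -t ubuntu /bin/bash
```
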
## Escape dependency hell

A common problem for developers is the difficulty of managing all their application's dependencies in a simple and automated way.

This is usually difficult for several reasons:

* *Cross-platform dependencies*: modern applications often depend on a combination of system libraries and binaries, language-specific packages, framework-specific modules,
internal components developed for another project, etc. These dependencies live in different "worlds" and require different tools - and these tools typically don't work
well with each other, requiring awkward custom integrations.
* *Conflicting dependencies*: different applications may depend on different versions of the same dependency. Packaging tools handle these situations with various degrees of ease -
but they all handle them in different and incompatible ways, which again forces the developer to do extra work.
* *Custom dependencies*: a developer may need to prepare a custom version of their application's dependency. Some packaging systems can handle custom versions of a dependency,
others can't - and all of them handle it differently.

Docker solves dependency hell by giving the developer a simple way to express *all* their application's dependencies in one place,
and to streamline the process of assembling them. If this makes you think of [XKCD 927](http://xkcd.com/927/), don't worry. Docker doesn't
*replace* your favorite packaging systems. It simply orchestrates their use in a simple and repeatable way. How does it do that? With layers.

Docker defines a build as running a sequence of unix commands, one after the other, in the same container. Build commands modify the contents of the container
(usually by installing new files on the filesystem), the next command modifies it some more, etc. Since each build command inherits the result of the previous
commands, the *order* in which the commands are executed expresses *dependencies*.

Here's a typical docker build process:
```bash
from ubuntu:12.10
run apt-get update
run apt-get install python
run apt-get install python-pip
run pip install django
run apt-get install curl
run curl -L http://github.com/shykes/helloflask/helloflask/master.tar.gz | tar -zxv
run cd master && pip install -r requirements.txt
```
Note that Docker doesn't care *how* dependencies are built - as long as they can be built by running a unix command in a container.
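
For example (a hypothetical sketch - the package names and URL below are illustrative, not part of the original example), a dependency that is only distributed as source code fits the same build model as apt or pip packages:

```bash
# Hypothetical build steps: compile a dependency from source with
# ordinary unix commands, just like any other build step.
run apt-get install -y build-essential
run curl -L http://example.com/libfoo-1.0.tar.gz | tar -zxv
run cd libfoo-1.0 && ./configure && make && make install
```
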
Install instructions
====================