---
layout: post
title: "Capistrano Version 3 Release Announcement"
date: 2013-06-01 00:00:00
---

After what seems like years of work, the Capistrano team (that's Tom and I) are pleased to announce the first *major* release of Capistrano in almost 5 years.

The reasons behind the length of time between the last architectural overhaul and this one are numerous, but it can be summarised by saying that Capistrano is a widely used tool, and when you are working with software deployment it's really a question of downtime. If we had changed something significant in Capistrano we could have taken a lot of sites offline, and made a lot of people very unhappy. Until this point we haven't felt that the time was ripe: that the benefits of a slightly rocky upgrade path were worth the risks of downtime.

It also hasn't helped that, historically, we had only just gotten to grips with Ruby 1.9; Bundler's near-ubiquity now means it's trivial to lock a Gem at a specific version. With other tools in the Ruby ecosystem, it has become easier for us to make significant changes to a tool upon which many hundreds of thousands of people rely.

### Design Goals

We had a few goals for this release; in no particular order they were:

* **Get away from our own DSL solution.** Great DSL alternatives (Rake, Sake, Thor, etc.) are already widely used.
* **Better modularisation.** To enable people outside the Rails community to benefit from Capistrano's *best-practice* workflow, and to enable people in the Rails community to pick and choose support for the components they use (database migrations, asset pipeline, etc.).
* **Easier Debugging.** A lot of problems with Capistrano come from weirdness surrounding environmental issues: PTY vs. non-PTY environments, login and non-login shells, not to mention *environment managers* such as rvm, rbenv and nvm.
* **Speed.** We know that in a lot of environments speed of deployment is a huge factor; since Rails introduced the *Asset Pipeline* it's not uncommon for a deploy that formerly took 5 seconds to now take 5 minutes. This is mostly out of our control, but with improved support for parallelism and rolling restarts, we feel confident that things will be quicker, and easier to keep quick.
* **Applicability.** We've always maintained that Capistrano is a terrible tool for system provisioning, and that more often than not servers are better set up with Chef, Puppet or similar. Whilst we still agree with that, the new features in Capistrano really lend themselves to integrating with these kinds of tools.

### What's missing?

Before we get too carried away it's worth shortlisting the things that don't exist in version three, ***yet***.

* **SSH Gateway Support** SSH gateway support hasn't been implemented in version three yet; I hope that this will be done soon. As I have no direct need for it, I don't yet have the means to test it with a view to implementing it.
* **Mercurial, Subversion, and CVS Support** These have been removed, as we've been able to implement the Git SCM in an incredibly neat way that isn't compatible with the others. We wanted to break the cycle of always sticking with the lowest common denominator, so we are **actively** looking for people who are interested in contributing, or sharing expertise on the *best-practice* way of speedily deploying from your respective choice of source control.
* **`HOSTFILTER`, `ROLEFILTER` and friends** These have gone away because we always felt they were indicative of a bad design decision about using environmental variables. They will be coming back as flags passed to `cap` on the CLI, and as options that can be set on the `Capistrano::Application` Ruby class.
* **Shell** The shell has been removed temporarily, pending a neater implementation. We've got something that we are playing with internally, but it needs better readline support, and some more controls around what to do when things go badly on some servers, but not others.
* **Cold Deploy** The `cap deploy:cold` task is a really old legacy component, originally from the days of `script/spinner`, when deploying cold (starting workers that weren't running) and deploying a *warm* system (restarting existing worker pools, which wasn't fun!) were different things. By and large those days have gone away, and it's time `deploy:cold` went away too. In every case we could find it is safe to call setup, seed and other Rake tasks without things blowing up, and that should be the approach we take. Tasks on the server should be idempotent; if something is called twice, let it be.

### What's new?

Each section here really deserves its own sub-heading, as some of the new features are awesome.

#### Rake Integration

We have moved away from our own DSL implementation and implemented Capistrano as a *Rake* application. Rake has always supported being subclassed as a *sub-application*, so to speak; it is, however, poorly documented. By subclassing `Rake::Application` one can specify what the *Rakefile* should look like, where to search for it, and how to load other *Rakefiles*.

The *Rake* DSL is widely used, well known and very powerful. As Rake is essentially a dependency resolution system, it offers a lot of nice ways to, for example, build a tarball as a dependency of uploading and deploying it. This has allowed us to do away with the *copy* strategy altogether, as it can now be implemented from scratch in fewer than ten lines of code. You can check out the [replicating the copy strategy](/screencasts/replicating_the_copy_strategy) screencast and accompanying documentation if you want to explore that further.

The guiding principle is dependency resolution, and interoperability with other tools, for example:

{% prism ruby %}
# Capistrano 3.0.x
task :notify do
  # Capture the most recent tag, and the log of the ten commits leading
  # up to it. (This example assumes the mail gem is available and configured.)
  this_release_tag = `git describe --abbrev=0 --tags`.strip
  last_ten_commits = `git log #{this_release_tag}~10..#{this_release_tag}`
  Mail.deliver do
    to      "team@example.com"
    subject "Releasing #{this_release_tag} Now!"
    body    last_ten_commits
  end
end

namespace :deploy do
  task default: :notify
end
{% endprism %}

The last three lines rely on Rake's additive task declarations, re-opening the `deploy:default` task to add another dependency. Rake will automatically resolve this dependency at runtime, mailing the recent changelog to your team, assuming everything is set up correctly.

#### Built-In Stage Support

In former versions of Capistrano, *stage* support was an afterthought, provided through the `capistrano-ext` Gem and latterly merged into the main codebase; people insisted on still using the `capistrano-ext` version regardless. In Capistrano 3.0.x stage support is built in: at installation time, two stages, *staging* and *production*, will be created by default. It's easy to add more, just add a file to `config/deploy/______.rb` which follows the conventions established in the examples we created for you.
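As a rough sketch, a stage file might contain something like the following; the hostname, user and role names here are purely illustrative placeholders rather than what the generator emits:

{% prism ruby %}
# config/deploy/staging.rb (illustrative sketch)

# Declare the servers which make up this stage, and the roles they play.
# The hostname and deploy user are placeholders for your own infrastructure.
server 'staging.example.com', user: 'deploy', roles: %w{web app db}

# Stage-specific settings can live here too.
set :rails_env, 'staging'
{% endprism %}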
To create different stages at installation time, simply set the `STAGES` environmental variable to a comma-separated list of stages:

{% prism bash %}
$ cap install STAGES=staging,production,ci,qa
{% endprism %}

#### Parallelism

In former versions of Capistrano there was a *parallel* option for running different commands on different groups of servers; it looked something like this:

{% prism ruby %}
# Capistrano 2.0.x
task :restart do
  parallel do |session|
    session.when "in?(:app)", "/u/apps/social/script/restart-mongrel"
    session.when "in?(:web)", "/u/apps/social/script/restart-apache"
    session.else "echo nothing to do"
  end
end
{% endprism %}

This always felt a little unclean, and indeed it's a hack that was originally implemented to facilitate rolling deployments at a large German firm by a couple of freelancers who were consulting with them. (Hint: one of those guys went on to found Travis-CI!)

The equivalent code under Capistrano v3 would look like this:

{% prism ruby %}
# Capistrano 3.0.x
task :restart do
  on :all, in: :parallel do |host|
    if host.roles.include?(:app)
      execute "/u/apps/social/script/restart-mongrel"
    elsif host.roles.include?(:web)
      execute "/u/apps/social/script/restart-web"
    else
      info sprintf("Nothing to do for %s with roles %s", host, host.properties.roles)
    end
  end
end
{% endprism %}

The second block of code, representing the new Rake-derived DSL and demonstrating the parallel execution mode, is a little longer, but I think it's clearer, more idiomatic Ruby which relies less on an intimate knowledge of how the Capistrano DSL happens to work. It also hints at the built-in logging subsystem; keep reading to learn more.

Other modes for parallelism include:

{% prism ruby %}
# Capistrano 3.0.x
on :all, in: :groups, max: 3, wait: 5 do
  # Take all servers, in groups of three which execute in parallel,
  # waiting five seconds between groups of servers.
  # This is perfect for rolling restarts.
end

on :all, in: :sequence, wait: 15 do
  # This takes all servers in sequence and waits 15 seconds between each
  # server; this might be perfect if you are afraid of overloading a
  # shared resource, or want to stagger the asset compilation across your
  # cluster owing to worries about load.
end

on :all, in: :parallel do
  # This will simply try to execute the commands contained within the
  # block in parallel on all servers. This might be perfect for kicking
  # off something like a Git checkout or similar.
end
{% endprism %}

The internal tasks for the standard deploy recipes make use of all of these as appropriate for the normal case; no need to be afraid of scary, slow deploys again!

#### Streaming IO

The new IO streaming model means that results from commands, the commands themselves, and any other arbitrary output are sent as objects to a class with an `IO`-ish interface; that class knows what to do with these things. There's a *progress* formatter which prints dots for each command that is called, as well as a *pretty* formatter which prints the full command, its output on standard output and standard error, and the final return status. It would be trivial to implement HTML formatters, or formatters that report to your IRC room, or to email. I look forward to seeing more of these cropping up in the community.
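To make the idea concrete, here is a purely hypothetical sketch of such a formatter. `IrcClient` and its `say` method are placeholders, and the only assumptions are that the output object needs to respond to `write` (and `<<`), and that it is swapped in via `SSHKit.config.output`:

{% prism ruby %}
# A hypothetical IRC formatter: an IO-ish object which forwards everything
# written to it (commands, their output, log messages) to an IRC channel.
# IrcClient and #say are placeholders, not a real API.
class IrcFormatter
  def initialize(irc)
    @irc = irc
  end

  def write(obj)
    @irc.say(obj.to_s)
  end
  alias :<< :write
end

SSHKit.config.output = IrcFormatter.new(IrcClient.new('#deploys'))
{% endprism %}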
#### Host Definition Access

If you didn't skim over the *Parallelism* section above, you might have noticed we did something clever that wasn't possible in Capistrano v2: we accessed the `host` inside the execution block.

For a lot of reasons this wasn't possible in Capistrano v2; the block was essentially evaluated once and called verbatim on each host. This led to disappointing missing features, such as not being able to pull the host list out of Capistrano and examine the roles in order to do something like controlling chef-solo, or similar. In Capistrano v3 the `host` object is the same object that is created when a server is defined, and it is used internally, for example to pass to an ERB template for rendering a last-deploy message that is dumped onto each server after a successful deployment. The last-deploy log includes everything Capistrano knew about that server during the deployment.

> Users of Capistrano v2 may be familiar with the perennial `cap deploy:cleanup` problem which came to light when servers differed in their old releases list. Imagine a scenario with two servers: one has been your bread-and-butter since you launched, and it has hundreds of old releases from all your wonderful deploys over the months or years. The second server has been in the cluster for about a month; it didn't quite slot in cleanly, so the list of old releases looks a bit weird, you deleted a few by hand, and anyway there might only be ten-or-so releases there.

> Now imagine that you call `cap deploy:cleanup`. The old `capture()` implementation silently ran only on the first server that matched the properties defined, so server one returned a list of ~95 old timestamped release directories. Next, Capistrano v2 would call `rm -rf release1..release95` on **both** servers, causing server two to error out, and leaving server one in an undefined state, as Capistrano would simply hang up both connections.

This cleanup routine can now be better implemented as follows (which is more or less the actual implementation in the new Gem):

{% prism ruby %}
# Capistrano 3.0.x
desc "Cleanup all old releases (keeps #{fetch(:releases_to_keep_on_cleanup)} old releases)"
task :cleanup do
  keep_releases = fetch(:releases_to_keep_on_cleanup)
  releases = capture(:ls, fetch(:releases_directory)).split
  releases_to_delete = releases.sort_by { |r| r.to_i }.slice(0..-(keep_releases + 1))
  releases_to_delete.each do |r|
    execute :rm, '-rf', fetch(:releases_directory).join(r)
  end
end
{% endprism %}

Some handy things to note here: server one and server two in our contrived example will each evaluate this independently, and the `:cleanup` task will only have finished when both servers are done removing their old releases. Also, in Capistrano v3 most path variables are `Pathname` objects, so they natively respond to things like `#basename`, `#expand_path`, `#join` and similar. **Warning:** `#expand_path` probably won't do what you expect: it executes on your *workstation* machine, not on the remote host, so it may well return an error for paths which exist remotely but not locally.

#### Host Properties

As the `host` object is now available to the task blocks, it made sense to make it possible to store arbitrary values against them. Enter `host.properties`. This is a simple [*OpenStruct*](http://www.ruby-doc.org/stdlib-2.0/libdoc/ostruct/rdoc/OpenStruct.html) which can be used to store any additional properties which are important for your application.
An example of its usage might be:

{% prism ruby %}
h = SSHKit::Host.new 'example.com'
h.properties.roles ||= %i{web app}
{% endprism %}

#### More Expressive Command Language

In Capistrano v2, it wasn't uncommon to find commands such as:

{% prism ruby %}
# Capistrano 2.0.x
task :precompile, :roles => lambda { assets_role }, :except => { :no_release => true } do
  run <<-CMD.compact
    cd -- #{latest_release} &&
    RAILS_ENV=#{rails_env.to_s.shellescape} #{asset_env} #{rake} assets:precompile
  CMD
end
{% endprism %}

In Capistrano v3 this looks more like this:

{% prism ruby %}
# Capistrano 3.0.x
task :precompile do
  on :sprockets_asset_host, reject: lambda { |h| h.properties.no_release } do
    within fetch(:latest_release_directory) do
      with rails_env: fetch(:rails_env) do
        execute :rake, 'assets:precompile'
      end
    end
  end
end
{% endprism %}

Again, as with the other examples, this format is a little longer, but much more expressive; all the nightmare of shell escaping is handled internally for you, and environmental variables are capitalised and applied at the correct point (i.e. between the `cd` and `rake` calls in this case). Other options here include `as :a_user`, and friends.

#### Better *magic* Variable Support

In Capistrano v2 there were certain bits of magic whereby calling a variable that would otherwise have raised a `NoMethodError` (for example the `latest_release_directory` variable, which never existed in the global namespace) caused the list of `set()` variables to be consulted as a fall-back. This magic led to times when people did not recognise that magic variables were even being used.

The magic variable system of Capistrano v2 did also include a way to `fetch(:some_variable, 'with a default value')` in case the variable might not be set already, but it wasn't widely used. More often than not people just used things like `latest_release_directory`, never knowing that behind the scenes an exception was raised and then rescued, and that `:latest_release_directory` in the variable map was actually a continuation that was evaluated the first time it was used, with the value then cached until the end of the script.

The system now has 100% less magic. If you set a variable using `set()`, it can be fetched with `fetch()`. If the value you set responds to `#call`, it will be executed in the current context whenever it is used; the values will not be cached, unless your continuation does some explicit caching. *Again, we are favouring clarity over micro-optimisation.*

#### SSHKit

Many of the new features in Capistrano which relate to logging, formatting, SSH, connection management and pooling, parallelism, batch execution and more come from a library that fell out of the Capistrano v3 development process. [*SSHKit*](https://github.com/leehambley/sshkit) is a lower-level toolkit: a level higher than *Net::SSH*, but still lacking the roles, environments, rollbacks and other higher-level features of Capistrano.

SSHKit is ideal if you just need to connect to a machine and run some arbitrary command, for example:

{% prism ruby %}
# Rakefile (even without Capistrano loaded)
require 'sshkit'
require 'sshkit/dsl'

desc "Check the uptime of example.com"
task :uptime do
  # on() and execute() come from the SSHKit DSL
  on 'example.com' do
    execute :uptime
  end
end
{% endprism %}

There is much more that can be done with SSHKit, and we have quite an extensive [list of examples](https://github.com/leehambley/sshkit/blob/master/EXAMPLES.md). For the most part with Capistrano v3, anything that happens inside of an `on()` block is happening in SSHKit, and the documentation from that library is the place to go to find more information.
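To give a flavour of the kind of thing those examples cover, here is a small sketch which runs a command against a couple of ad-hoc hosts in sequence and captures the output. The hostnames are placeholders, and depending on your SSHKit version the DSL may need to be included explicitly:

{% prism ruby %}
require 'sshkit'
require 'sshkit/dsl'
include SSHKit::DSL # harmless if the DSL is already included by the require

# Placeholder hostnames; substitute your own machines here.
hosts = %w{one.example.com two.example.com}.map { |name| SSHKit::Host.new(name) }

on hosts, in: :sequence, wait: 5 do |host|
  # capture() returns the standard output of the command as a string
  puts "#{host}: #{capture(:uptime)}"
end
{% endprism %}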
#### Command Mapping

This is another feature from SSHKit, designed to remove a little ambiguity from proceedings: there is a so-called *command map* for commands. When executing something like:

{% prism ruby %}
# Capistrano 3.0.x
execute "git clone ........ ......."
{% endprism %}

the command is passed through to the remote server *completely unchanged*. This includes the options which might be set, such as user, directory, and environmental variables. **This is by design.** It allows people to write non-trivial commands in [heredocs](https://en.wikipedia.org/wiki/Here_document) when the need arises, for example:

{% prism ruby %}
# Capistrano 3.0.x
execute <<-EOBLOCK
  # All of this block is interpreted as Bash script
  if ! [ -e /tmp/somefile ]; then
    touch /tmp/somefile
  fi
EOBLOCK
{% endprism %}

The idiomatic way to write that command in Capistrano v3 is to use the separated, variadic form of the method to specify the command:

{% prism ruby %}
# Capistrano 3.0.x
execute :git, :clone, "........", "......."
{% endprism %}

... or, for the larger example:

{% prism ruby %}
# Capistrano 3.0.x
file = '/tmp/somefile'
unless test("[ -e #{file} ]")
  execute :touch, file
end
{% endprism %}

In this way the *command map* is consulted; it maps all unknown commands (in this case `git`; the rest of the line are *arguments* to `git`) to `/usr/bin/env ...`. This means that the command would be expanded to `/usr/bin/env git clone ...... .......`, which is what happens anyway when `git` is called without a full path: the `env` program is consulted (perhaps indirectly) to determine which `git` to run.

Commands such as `rake` and `rails` are often better prefixed by `bundle exec`, and in this case could be mapped to:

{% prism ruby %}
SSHKit.config.command_map[:rake]  = "bundle exec rake"
SSHKit.config.command_map[:rails] = "bundle exec rails"
{% endprism %}

The command map can also be replaced wholesale, for example with a `Hash` whose default block generates the mapping on demand, like so:

{% prism ruby %}
SSHKit.config.command_map = Hash.new do |hash, key|
  if %i{rails rake bundle clockwork heroku}.include?(key.to_sym)
    hash[key] = "/usr/bin/env bundle exec #{key}"
  else
    hash[key] = "/usr/bin/env #{key}"
  end
end
{% endprism %}

Between these two options there should be more than enough power to map commands in your environment without having to override internal tasks from Capistrano just because a path is different, or a binary has a different name.

This can also be *slightly* abused in environments where *shim* executables are used, for example `rbenv` *wrappers*:

{% prism ruby %}
SSHKit.config.command_map = Hash.new do |hash, key|
  if %i{rails rake bundle clockwork heroku}.include?(key.to_sym)
    hash[key] = "/usr/bin/env myproject_bundle exec myproject_#{key}"
  else
    hash[key] = "/usr/bin/env #{key}"
  end
end
{% endprism %}

The above assumes that you have done something like `rbenv wrapper default myproject`, which creates wrapper binaries that correctly set up the Ruby environment without requiring an interactive login shell.

#### Testing

The old test suite for Capistrano was purely unit tests, and didn't cover a wide variety of problem cases; specifically, nothing in `deploy.rb` (that is, the actual *deployment* code) was tested at all. Because we had our own DSL implementation, and other slightly odd design points, it was painful to test the actual *recipes*. Testing has been a focus of Capistrano v3.
The *integration* test suite uses Vagrant to boot a machine, configures certain scenarios using portable shell script, and then executes commands against them, deploying common configurations to typical Linux systems. This is slow to execute, but offers stronger guarantees that nothing is broken than we've ever been able to give before.

Capistrano v3 also offers the possibility to swap out backend implementations. This is interesting because, for the purpose of testing your *own* recipes, you can use a *printer* backend and verify that the output matches what you expected, or use a stubbed backend against which you can verify that calls were made, or not made, as expected.

#### Arbitrary Logging

Capistrano exposes the methods `debug()`, `info()`, `warn()`, `error()` and `fatal()` inside `on()` blocks, which can be used to log using the existing logging infrastructure and streaming IO formatters:

{% prism ruby %}
# Capistrano 3.0.x
on hosts do |host|
  f = '/some/file'
  if test("[ -f #{f} ]")
    info "#{f} already exists on #{host}!"
  else
    execute :touch, f
  end
end
{% endprism %}

### Upgrading

The best place to go here is the [upgrading documentation][] to get deeper into the specifics. The simple version is to say that there is *no **direct** upgrade path*; versions two and three are incompatible. This is partly by design: the old DSL was imprecise in places, which would have made doing the right thing tricky in most cases; we opted to invest in more features and better reliability rather than in keeping a backwards-compatible API.

There are a number of *gotchas* listed below, but the main points are the new names of the built-in roles, and the fact that by default Capistrano v3 is platform agnostic: if you need Rails support, for migrations, the asset pipeline and such like, then you are required to `require` the support files.

### Gotchas

#### Rake DSL Is Additive

In Capistrano v2, if you re-defined a task it replaced the original implementation; this has been used by people to replace internal tasks piecemeal with their own implementations.

#### `sudo` Behaviour