Document multiple databases in Rails

This file documents how to use multiple databases, what features are supported, what features are coming soon, and caveats.
2019-06-03 11:24:13 -04:00 · 8574c3e4fa
parent c926ca4628
commit 8574c3e4fa
2 changed files with 274 additions and 0 deletions
--- a/guides/source/active_record_multiple_databases.md
+++ b/guides/source/active_record_multiple_databases.md
@@ -0,0 +1,269 @@
+**DO NOT READ THIS FILE ON GITHUB, GUIDES ARE PUBLISHED ON https://guides.rubyonrails.org.**
+
+Multiple Databases with Active Record
+=====================================
+
+This guide covers using multiple databases with your Rails application.
+
+After reading this guide you will know:
+
+* How to set up your application for multiple databases.
+* How automatic connection switching works.
+* What features are supported and what's still a work in progress.
+
+--------------------------------------------------------------------------------
+
+As an application grows in popularity and usage, you'll need to scale the application
+to support your new users and their data. One way in which your application may need
+to scale is at the database level. Rails now has support for multiple databases
+so you don't have to store all of your data in one place.
+
+At this time the following features are supported:
+
+* Multiple primary databases and a replica for each
+* Automatic connection switching for the model you're working with
+* Automatic swapping between the primary and replica depending on the HTTP verb
+and recent writes
+* Rails tasks for creating, dropping, migrating, and interacting with the multiple
+databases
+
+The following features are not (yet) supported:
+
+* Sharding
+* Joining across clusters
+* Load balancing replicas
+
+## Setting up your application
+
+While Rails tries to do most of the work for you, there are still some steps you'll
+need to take to get your application ready for multiple databases.
+
+Let's say we have an application with a single primary database and we need to add a
+new database for some new tables we're adding. The name of the new database will be
+"animals".
+
+The database.yml looks like this:
+
+```yaml
+production:
+  database: my_primary_database
+  username: root
+  adapter: mysql2
+```
+
+Let's add a replica for the primary, a new writer called animals, and a replica for that
+as well. To do this we need to change our database.yml from a 2-tier to a 3-tier config.
+
+```yaml
+production:
+  primary:
+    database: my_primary_database
+    username: root
+    adapter: mysql2
+  primary_replica:
+    database: my_primary_database
+    username: root_readonly
+    adapter: mysql2
+    replica: true
+  animals:
+    database: my_animals_database
+    username: animals_root
+    adapter: mysql2
+    migrations_paths: db/animals_migrate
+  animals_replica:
+    database: my_animals_database
+    username: animals_readonly
+    adapter: mysql2
+    replica: true
+```
+
+When using multiple databases there are a few important settings.
+
+First, the database name for the primary and replica should be the same because they contain
+the same data. Second, the username for the primary and replica should be different, and the
+replica user's permissions should allow reading but not writing.
+
+When using a replica database you need to add a `replica: true` entry to the replica in the
+`database.yml`. This is because Rails otherwise has no way of knowing which one is a replica
+and which one is the primary.
+
+Lastly, for new primary databases you need to set the `migrations_paths` to the directory
+where you will store migrations for that database. We'll look more at `migrations_paths`
+later on in this guide.
+
+Now that we have a new database, let's set up the model. In order to use the new database we
+need to create a new abstract class and connect to the animals databases.
+
+```ruby
+class AnimalsBase < ApplicationRecord
+  self.abstract_class = true
+
+  connects_to database: { writing: :animals, reading: :animals_replica }
+end
+```
+
+Then we need to update `ApplicationRecord` to be aware of our new replica.
+
+```ruby
+class ApplicationRecord < ActiveRecord::Base
+  self.abstract_class = true
+
+  connects_to database: { writing: :primary, reading: :primary_replica }
+end
+```
+
+By default Rails expects the database roles to be `writing` and `reading` for the primary
+and replica respectively. If you have a legacy system you may already have roles set up that
+you don't want to change. In that case you can set a new role name in your application config.
+
+```ruby
+config.active_record.writing_role = :default
+config.active_record.reading_role = :readonly
+```
+
+Now that we have the database.yml and the new model set up, it's time to create the databases.
+Rails 6.0 ships with all the Rails tasks you need to work with multiple databases.
+
+You can run `rails -T` to see all the commands you're able to run. You should see the following:
+
+```
+$ rails -T
+rails db:create                          # Creates the database from DATABASE_URL or config/database.yml for the ...
+rails db:create:animals                  # Create animals database for current environment
+rails db:create:primary                  # Create primary database for current environment
+rails db:drop                            # Drops the database from DATABASE_URL or config/database.yml for the cu...
+rails db:drop:animals                    # Drop animals database for current environment
+rails db:drop:primary                    # Drop primary database for current environment
+rails db:migrate                         # Migrate the database (options: VERSION=x, VERBOSE=false, SCOPE=blog)
+rails db:migrate:animals                 # Migrate animals database for current environment
+rails db:migrate:primary                 # Migrate primary database for current environment
+rails db:migrate:status                  # Display status of migrations
+rails db:migrate:status:animals          # Display status of migrations for animals database
+rails db:migrate:status:primary          # Display status of migrations for primary database
+```
+
+Running a command like `rails db:create` will create both the primary and animals databases.
+Note that there is no command for creating the users and you'll need to do that manually
+to support the readonly users for your replicas. If you want to create just the animals
+database you can run `rails db:create:animals`.
+
+## Migrations
+
+Migrations for multiple databases should live in their own folders prefixed with the
+name of the database key in the configuration.
+
+You also need to set the `migrations_paths` in the database configurations to tell Rails
+where to find the migrations.
+
+For example, the `animals` database would look in the `db/animals_migrate` directory and
+`primary` would look in `db/migrate`. Rails generators now take a `--database` option
+so that the file is generated in the correct directory. The command can be run like so:
+
+```
+$ rails g migration CreateDogs name:string --database animals
+```
+
+## Activating automatic connection switching
+
+Finally, in order to use the read-only replica in your application you'll need to activate
+the middleware for automatic switching.
+
+Automatic switching allows the application to switch from the primary to replica or replica
+to primary based on the HTTP verb and whether there was a recent write.
+
+If the application is receiving a POST, PUT, DELETE, or PATCH request the application will
+automatically write to the primary. For the specified time after the write the application
+will read from the primary. For a GET or HEAD request the application will read from the
+replica unless there was a recent write.
+
+To activate the automatic connection switching middleware, add or uncomment the following
+lines in your application config.
+
+```ruby
+config.active_record.database_selector = { delay: 2.seconds }
+config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
+config.active_record.database_resolver_context = ActiveRecord::Middleware::DatabaseSelector::Resolver::Session
+```
+
+Rails guarantees "read your own write" and will send your GET or HEAD request to the
+primary if it's within the `delay` window. By default the delay is set to 2 seconds. You
+should change this based on your database infrastructure. Rails doesn't guarantee "read
+a recent write" for other users within the delay window and will send GET and HEAD requests
+to the replicas unless they wrote recently.
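+
+As a rough illustration of the rules above, the decision can be sketched in plain Ruby. This
+is not the Rails implementation: `role_for` and its arguments are hypothetical names used
+only for this example, and the 2 second default simply mirrors the `delay` shown earlier.
+
+```ruby
+# Hypothetical helper, not part of Rails: picks the role that serves a request.
+READ_VERBS = %w[GET HEAD].freeze
+
+def role_for(http_verb, last_write_at, now: Time.now, delay: 2)
+  # Anything that is not a GET or HEAD is treated as a write.
+  return :writing unless READ_VERBS.include?(http_verb)
+
+  # GET/HEAD: stay on the primary while the last write is younger than `delay`.
+  recently_wrote = !last_write_at.nil? && (now - last_write_at) < delay
+  recently_wrote ? :writing : :reading
+end
+
+now = Time.now
+role_for("POST", nil, now: now)     # => :writing
+role_for("GET", now - 1, now: now)  # => :writing (wrote 1 second ago)
+role_for("GET", now - 5, now: now)  # => :reading (wrote 5 seconds ago)
+role_for("HEAD", nil, now: now)     # => :reading (never wrote)
+```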
+
+The automatic connection switching in Rails is relatively primitive and deliberately doesn't
+do a whole lot. The goal was a system that demonstrates how to do automatic connection
+switching and is flexible enough to be customized by app developers.
+
+The setup in Rails allows you to easily change how the switching is done and what
+parameters it's based on. Let's say you want to use a cookie instead of a session to
+decide when to swap connections. You can write your own class:
+
+```ruby
+class MyCookieResolver
+  # code for your cookie class
+end
+```
+
+And then pass it to the middleware:
+
+```ruby
+config.active_record.database_selector = { delay: 2.seconds }
+config.active_record.database_resolver = ActiveRecord::Middleware::DatabaseSelector::Resolver
+config.active_record.database_resolver_context = MyCookieResolver
+```
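+
+What goes inside such a class depends on your Rails version. The following is only a sketch
+under assumptions: it supposes the context is built through a `call(request)` class method
+and exposes `last_write_timestamp` and `update_last_write_timestamp`, in the manner of the
+session-based context. The cookie name, the use of `request.cookie_jar`, and `FakeRequest`
+are all hypothetical and should be checked against your application.
+
+```ruby
+# Sketch only: method names are assumed to mirror the session-based context.
+class MyCookieResolver
+  COOKIE_KEY = "last_write" # hypothetical cookie name
+
+  # Assumed entry point: build a context from the incoming request.
+  def self.call(request)
+    new(request.cookie_jar)
+  end
+
+  def initialize(cookies)
+    @cookies = cookies
+  end
+
+  # Time of the last write, or the epoch if this client has never written.
+  def last_write_timestamp
+    millis = @cookies[COOKIE_KEY]
+    millis ? Time.at(millis.to_i / 1000.0) : Time.at(0)
+  end
+
+  # Store "now" in milliseconds so later reads know a write just happened.
+  def update_last_write_timestamp
+    @cookies[COOKIE_KEY] = (Time.now.to_f * 1000).to_i.to_s
+  end
+end
+
+# Stand-in for a request object, used only to exercise the sketch.
+FakeRequest = Struct.new(:cookie_jar)
+context = MyCookieResolver.call(FakeRequest.new({}))
+context.update_last_write_timestamp
+```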
+
+## Using manual connection switching
+
+There are some cases where you may want your application to connect to a primary or a replica
+and the automatic connection switching isn't adequate. For example, you may know that for a
+particular request you always want to send the request to a replica, even when you are in a
+POST request path.
+
+To do this Rails provides a `connected_to` method that will switch to the connection you
+need.
+
+```ruby
+ActiveRecord::Base.connected_to(role: :reading) do
+  # all code in this block will be connected to the reading role
+end
+```
+
+The "role" in the `connected_to` call looks up the connections that are connected on that
+connection handler (or role). The `reading` connection handler will hold all the connections
+that were connected via `connects_to` with the role name of `reading`.
+
+There also may be a case where you have a database that you don't always want to connect to
+on application boot but may need for a slow query or analytics. After defining that database
+in the database.yml you can connect by passing a database argument to `connected_to`:
+
+```ruby
+ActiveRecord::Base.connected_to(database: { reading_slow: :animals_slow_replica }) do
+  # do something while connected to the slow replica
+end
+```
+
+The `database` argument for `connected_to` will take a symbol or a config hash.
+
+Note that `connected_to` with a role will look up an existing connection and switch
+using the connection specification name. This means that if you pass an unknown role
+like `connected_to(role: :nonexistent)` you will get an error that says
+`ActiveRecord::ConnectionNotEstablished (No connection pool with 'AnimalsBase' found
+for the 'nonexistent' role.)`
+
+## Caveats
+
+As noted at the top, Rails doesn't (yet) support sharding. We had to do a lot of work
+to support multiple databases for Rails 6.0. The lack of support for sharding isn't
+an oversight, but it does require additional work that didn't make it in for 6.0. For now,
+if you need sharding it may be advisable to continue using one of the many gems
+that support this.
+
+Rails also doesn't support automatic load balancing of replicas. This is very
+dependent on your infrastructure. We may implement basic, primitive load balancing
+in the future, but for an application at scale this should be something your application
+handles outside of Rails.
+
+Lastly, you cannot join across databases. Rails 6.1 will support using `has_many`
+relationships and creating 2 queries instead of joining, but Rails 6.0 will require
+you to split the joins into 2 selects manually.
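+
+To illustrate what splitting a join into 2 selects means, here is a plain Ruby sketch that
+uses in-memory arrays as stand-ins for tables in two different databases. The data and
+names are hypothetical. With Active Record the same shape would be a first query that
+plucks the ids from one database, followed by a second query on the other database that
+filters by those ids.
+
+```ruby
+# In-memory stand-ins for rows stored in two separate databases (made-up data).
+PEOPLE = [
+  { id: 1, name: "Ana" },
+  { id: 2, name: "Ben" }
+].freeze # imagine this table lives in the primary database
+
+DOGS = [
+  { id: 10, owner_id: 1, name: "Rex" },
+  { id: 11, owner_id: 2, name: "Bo" },
+  { id: 12, owner_id: 1, name: "Fido" }
+].freeze # imagine this table lives in the animals database
+
+# Select 1: fetch the matching ids from the first database.
+owner_ids = PEOPLE.select { |person| person[:name] == "Ana" }.map { |person| person[:id] }
+
+# Select 2: use those ids to filter rows in the second database.
+dogs = DOGS.select { |dog| owner_ids.include?(dog[:owner_id]) }
+dog_names = dogs.map { |dog| dog[:name] } # => ["Rex", "Fido"]
+```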
--- a/guides/source/documents.yaml
+++ b/guides/source/documents.yaml
@@ -160,6 +160,11 @@
      work_in_progress: true
      url: active_record_postgresql.html
      description: This guide covers PostgreSQL specific usage of Active Record.
+    -
+      name: Multiple Databases with Active Record
+      work_in_progress: true
+      url: active_record_multiple_databases.html
+      description: This guide covers using multiple databases in your application.

 -
  name: Extending Rails