From 6cc03675d30b58e28f585720dad14e947a57ff5b Mon Sep 17 00:00:00 2001
From: schneems
Date: Wed, 1 Jan 2014 17:33:59 -0500
Subject: [PATCH] Ensure Active Record connection consistency

Currently Active Record can be configured via the environment variable
`DATABASE_URL` or by manually injecting a hash of values, which is what
Rails does: it reads in `database.yml` and sets Active Record
appropriately. Active Record expects to be able to use `DATABASE_URL`
without the use of Rails, and we cannot rip out this functionality
without deprecating. This presents a problem, though, when both a config
is set and a `DATABASE_URL` is present. Currently the `DATABASE_URL`
should "win" and none of the values in `database.yml` are used. This is
somewhat unexpected to me: if I were to set values such as `pool` in the
`production:` group of `database.yml`, they would be ignored.

There are many ways that Active Record initiates a connection today:

- Stand Alone (without rails)
  - `rake db:<tasks>`
  - `ActiveRecord::Base.establish_connection`

- With Rails
  - `rake db:<tasks>`
  - `rails <server> | <console>`
  - `rails dbconsole`

We should make all of these behave exactly the same way. The best way to
do this is to put all of this logic in one place so it is guaranteed to
be used.

Here is my proposed matrix of how this behavior should work:

```
No database.yml
No DATABASE_URL
=> Error
```

```
database.yml present
No DATABASE_URL
=> Use database.yml configuration
```

```
No database.yml
DATABASE_URL present
=> use DATABASE_URL configuration
```

```
database.yml present
DATABASE_URL present
=> Merged into `url` sub key. If both specify `url` sub key, the `database.yml` `url`
   sub key "wins". If other parameters `adapter` or `database` are specified in YAML,
   they are discarded as the `url` sub key "wins".
```

### Implementation

Current implementation uses `ActiveRecord::Base.configurations` to
resolve and merge all connection information before returning. This is
achieved through a utility class:
`ActiveRecord::ConnectionHandling::MergeAndResolveDefaultUrlConfig`.

To understand the exact behavior of this class, it is best to review the
behavior in
activerecord/test/cases/connection_adapters/connection_handler_test.rb
though it should match the above proposal.
---
 activerecord/CHANGELOG.md                     |  61 +++++++++
 .../connection_specification.rb               |  15 ++-
 .../lib/active_record/connection_handling.rb  |  60 ++++++++-
 activerecord/lib/active_record/core.rb        |   9 +-
 activerecord/lib/active_record/railtie.rb     |  16 +--
 .../lib/active_record/railties/databases.rake |   2 +-
 .../connection_handler_test.rb                | 127 ++++++++++++++++++
 guides/source/configuring.md                  | 119 +++++++++++++++-
 .../lib/rails/application/configuration.rb    |  16 ++-
 railties/lib/rails/commands/dbconsole.rb      |  15 ++-
 railties/test/commands/dbconsole_test.rb      |   2 +-
 11 files changed, 409 insertions(+), 33 deletions(-)

diff --git a/activerecord/CHANGELOG.md b/activerecord/CHANGELOG.md
index bd7dcb6167..d9f8ee7097 100644
--- a/activerecord/CHANGELOG.md
+++ b/activerecord/CHANGELOG.md
@@ -1,3 +1,64 @@
+*   Currently Active Record can be configured via the environment variable
+    `DATABASE_URL` or by manually injecting a hash of values which is what Rails does,
+    reading in `database.yml` and setting Active Record appropriately. Active Record
+    expects to be able to use `DATABASE_URL` without the use of Rails, and we cannot
+    rip out this functionality without deprecating. This presents a problem though
+    when both config is set, and a `DATABASE_URL` is present. Currently the
+    `DATABASE_URL` should "win" and none of the values in `database.yml` are
+    used. This is somewhat unexpected, if one were to set values such as
+    `pool` in the `production:` group of `database.yml` they are ignored.
+
+    There are many ways that Active Record initiates a connection today:
+
+    - Stand Alone (without rails)
+      - `rake db:<tasks>`
+      - `ActiveRecord::Base.establish_connection`
+
+    - With Rails
+      - `rake db:<tasks>`
+      - `rails <server> | <console>`
+      - `rails dbconsole`
+
+    Now all of these behave exactly the same way. The best way to do
+    this is to put all of this logic in one place so it is guaranteed to be used.
+
+    Here is the matrix of how this behavior works:
+
+    ```
+    No database.yml
+    No DATABASE_URL
+    => Error
+    ```
+
+    ```
+    database.yml present
+    No DATABASE_URL
+    => Use database.yml configuration
+    ```
+
+    ```
+    No database.yml
+    DATABASE_URL present
+    => use DATABASE_URL configuration
+    ```
+
+    ```
+    database.yml present
+    DATABASE_URL present
+    => Merged into `url` sub key. If both specify `url` sub key, the `database.yml` `url`
+       sub key "wins". If other parameters `adapter` or `database` are specified in YAML,
+       they are discarded as the `url` sub key "wins".
+    ```
+
+    Current implementation uses `ActiveRecord::Base.configurations` to resolve and merge
+    all connection information before returning. This is achieved through a utility
+    class: `ActiveRecord::ConnectionHandling::MergeAndResolveDefaultUrlConfig`.
+
+    To understand the exact behavior of this class, it is best to review the
+    behavior in `activerecord/test/cases/connection_adapters/connection_handler_test.rb`
+
+    *Richard Schneeman*
+
 *   Make `change_column_null` revertable. Fixes #13576.
 
     *Yves Senn*, *Nishant Modak*, *Prathamesh Sonpatki*
diff --git a/activerecord/lib/active_record/connection_adapters/connection_specification.rb b/activerecord/lib/active_record/connection_adapters/connection_specification.rb
index 9f210c5f33..3f8b14bf67 100644
--- a/activerecord/lib/active_record/connection_adapters/connection_specification.rb
+++ b/activerecord/lib/active_record/connection_adapters/connection_specification.rb
@@ -123,13 +123,22 @@ module ActiveRecord
         def resolve(config)
           if config
             resolve_connection config
-          elsif defined?(Rails.env)
-            resolve_env_connection Rails.env.to_sym
+          elsif env = ActiveRecord::ConnectionHandling::RAILS_ENV.call
+            resolve_env_connection env.to_sym
           else
             raise AdapterNotSpecified
           end
         end
 
+        # Expands each key in @configurations hash into fully resolved hash
+        def resolve_all
+          config = configurations.dup
+          config.each do |key, value|
+            config[key] = resolve(value) if value
+          end
+          config
+        end
+
         # Returns an instance of ConnectionSpecification for a given adapter.
         # Accepts a hash one layer deep that contains all connection information.
         #
@@ -219,7 +228,7 @@ module ActiveRecord
           elsif spec.is_a?(String)
             resolve_string_connection(spec)
           else
-            raise(AdapterNotSpecified, "#{spec} database is not configured")
+            raise(AdapterNotSpecified, "'#{spec}' database is not configured. Available configuration: #{configurations.inspect}")
           end
         end
 
diff --git a/activerecord/lib/active_record/connection_handling.rb b/activerecord/lib/active_record/connection_handling.rb
index c4afadbd9b..11f6a47158 100644
--- a/activerecord/lib/active_record/connection_handling.rb
+++ b/activerecord/lib/active_record/connection_handling.rb
@@ -1,5 +1,8 @@
 module ActiveRecord
   module ConnectionHandling
+    RAILS_ENV   = -> { Rails.env if defined?(Rails) }
+    DEFAULT_ENV = -> { RAILS_ENV.call || "default_env" }
+
     # Establishes the connection to the database. Accepts a hash as input where
     # the :adapter key must be specified with the name of a database adapter (in lower-case)
     # example for regular databases (MySQL, Postgresql, etc):
@@ -41,9 +44,10 @@ module ActiveRecord
     #
     # The exceptions AdapterNotSpecified, AdapterNotFound and ArgumentError
     # may be returned on an error.
-    def establish_connection(spec = ENV["DATABASE_URL"])
-      resolver = ConnectionAdapters::ConnectionSpecification::Resolver.new configurations
-      spec = resolver.spec(spec)
+    def establish_connection(spec = nil)
+      spec     ||= DEFAULT_ENV.call.to_sym
+      resolver =   ConnectionAdapters::ConnectionSpecification::Resolver.new configurations
+      spec     =   resolver.spec(spec)
 
       unless respond_to?(spec.adapter_method)
         raise AdapterNotFound, "database configuration specifies nonexistent #{spec.config[:adapter]} adapter"
@@ -53,6 +57,56 @@ module ActiveRecord
       connection_handler.establish_connection self, spec
     end
 
+    class MergeAndResolveDefaultUrlConfig # :nodoc:
+      def initialize(raw_configurations, url = ENV['DATABASE_URL'])
+        @raw_config = raw_configurations.dup
+        @url        = url
+      end
+
+      # Returns fully resolved connection hashes.
+      # Merges connection information from `ENV['DATABASE_URL']` if available.
+      def resolve
+        ConnectionAdapters::ConnectionSpecification::Resolver.new(config).resolve_all
+      end
+
+      private
+        def config
+          if @url
+            raw_merged_into_default
+          else
+            @raw_config
+          end
+        end
+
+        def raw_merged_into_default
+          default = default_url_hash
+
+          @raw_config.each do |env, values|
+            default[env] = values || {}
+            default[env].merge!("url" => @url) { |h, v1, v2| v1 || v2 } if default[env].is_a?(Hash)
+          end
+          default
+        end
+
+        # When the raw configuration is not present and ENV['DATABASE_URL']
+        # is available we return a hash with the connection information in
+        # the connection URL. This hash responds to any string key with
+        # resolved connection information.
+        def default_url_hash
+          if @raw_config.blank?
+            Hash.new do |hash, key|
+              hash[key] = if key.is_a? String
+                ActiveRecord::ConnectionAdapters::ConnectionSpecification::ConnectionUrlResolver.new(@url).to_hash
+              else
+                nil
+              end
+            end
+          else
+            {}
+          end
+        end
+    end
+
     # Returns the connection currently associated with the class. This can
     # also be used to "borrow" the connection to do database work unrelated
     # to any of the specific Active Records.
diff --git a/activerecord/lib/active_record/core.rb b/activerecord/lib/active_record/core.rb
index 18ee77f6fe..cd8690d500 100644
--- a/activerecord/lib/active_record/core.rb
+++ b/activerecord/lib/active_record/core.rb
@@ -42,9 +42,16 @@ module ActiveRecord
       #       'database' => 'db/production.sqlite3'
       #     }
       #   }
-      mattr_accessor :configurations, instance_writer: false
+      def self.configurations=(config)
+        @@configurations = ActiveRecord::ConnectionHandling::MergeAndResolveDefaultUrlConfig.new(config).resolve
+      end
       self.configurations = {}
 
+      # Returns fully resolved configurations hash
+      def self.configurations
+        @@configurations
+      end
+
       ##
       # :singleton-method:
       # Determines whether to use Time.utc (using :utc) or Time.local (using :local) when pulling
diff --git a/activerecord/lib/active_record/railtie.rb b/activerecord/lib/active_record/railtie.rb
index ec85b3c843..11b564f8f9 100644
--- a/activerecord/lib/active_record/railtie.rb
+++ b/activerecord/lib/active_record/railtie.rb
@@ -40,19 +40,7 @@ module ActiveRecord
 
       namespace :db do
         task :load_config do
-          configuration = if ENV["DATABASE_URL"]
-            { Rails.env => ENV["DATABASE_URL"] }
-          else
-            Rails.application.config.database_configuration || {}
-          end
-
-          resolver = ActiveRecord::ConnectionAdapters::ConnectionSpecification::Resolver.new(configuration)
-
-          configuration.each do |key, value|
-            configuration[key] = resolver.resolve(value) if value
-          end
-
-          ActiveRecord::Tasks::DatabaseTasks.database_configuration = configuration
+          ActiveRecord::Tasks::DatabaseTasks.database_configuration = Rails.application.config.database_configuration
 
           if defined?(ENGINE_PATH) && engine = Rails::Engine.find(ENGINE_PATH)
             if engine.paths['db/migrate'].existent
@@ -137,7 +125,7 @@ module ActiveRecord
           end
         end
 
-        self.configurations = app.config.database_configuration || {}
+        self.configurations = Rails.application.config.database_configuration
         establish_connection
       end
     end
diff --git a/activerecord/lib/active_record/railties/databases.rake b/activerecord/lib/active_record/railties/databases.rake
index 58dfa2c5a5..561387a179 100644
--- a/activerecord/lib/active_record/railties/databases.rake
+++ b/activerecord/lib/active_record/railties/databases.rake
@@ -2,7 +2,7 @@ require 'active_record'
 
 db_namespace = namespace :db do
   task :load_config do
-    ActiveRecord::Base.configurations = ActiveRecord::Tasks::DatabaseTasks.database_configuration || {}
+    ActiveRecord::Base.configurations       = ActiveRecord::Tasks::DatabaseTasks.database_configuration || {}
     ActiveRecord::Migrator.migrations_paths = ActiveRecord::Tasks::DatabaseTasks.migrations_paths
   end
 
diff --git a/activerecord/test/cases/connection_adapters/connection_handler_test.rb b/activerecord/test/cases/connection_adapters/connection_handler_test.rb
index 3e33b30144..3365ad1294 100644
--- a/activerecord/test/cases/connection_adapters/connection_handler_test.rb
+++ b/activerecord/test/cases/connection_adapters/connection_handler_test.rb
@@ -2,6 +2,133 @@ require "cases/helper"
 
 module ActiveRecord
   module ConnectionAdapters
+
+    class MergeAndResolveDefaultUrlConfigTest < ActiveRecord::TestCase
+
+      def klass
+        ActiveRecord::ConnectionHandling::MergeAndResolveDefaultUrlConfig
+      end
+
+      def setup
+        @previous_database_url = ENV.delete("DATABASE_URL")
+      end
+
+      def teardown
+        ENV["DATABASE_URL"] = @previous_database_url if @previous_database_url
+      end
+
+      def test_string_connection
+        config   = { "production" => "postgres://localhost/foo" }
+        actual   = klass.new(config).resolve
+        expected = { "production" =>
+                     { "adapter"  => "postgresql",
+                       "database" => "foo",
+                       "host"     => "localhost"
+                      }
+                   }
+        assert_equal expected, actual
+      end
+
+      def test_url_sub_key
+        config   = { "production" => { "url" => "postgres://localhost/foo" } }
+        actual   = klass.new(config).resolve
+        expected = { "production" =>
+                     { "adapter"  => "postgresql",
+                       "database" => "foo",
+                       "host"     => "localhost"
+                      }
+                   }
+        assert_equal expected, actual
+      end
+
+      def test_hash
+        config = { "production" => { "adapter" => "postgres", "database" => "foo" } }
+        actual = klass.new(config).resolve
+        assert_equal config, actual
+      end
+
+      def test_blank
+        config = {}
+        actual = klass.new(config).resolve
+        assert_equal config, actual
+      end
+
+      def test_blank_with_database_url
+        ENV['DATABASE_URL'] = "postgres://localhost/foo"
+
+        config   = {}
+        actual   = klass.new(config).resolve
+        expected = { "adapter"  => "postgresql",
+                     "database" => "foo",
+                     "host"     => "localhost" }
+        assert_equal expected, actual["production"]
+        assert_equal expected, actual["development"]
+        assert_equal expected, actual["test"]
+        assert_equal nil,      actual[:production]
+        assert_equal nil,      actual[:development]
+        assert_equal nil,      actual[:test]
+      end
+
+      def test_string_with_database_url
+        ENV['DATABASE_URL'] = "NOT-POSTGRES://localhost/NOT_FOO"
+
+        config   = { "production" => "postgres://localhost/foo" }
+        actual   = klass.new(config).resolve
+
+        expected = { "production" =>
+                     { "adapter"  => "postgresql",
+                       "database" => "foo",
+                       "host"     => "localhost"
+                      }
+                   }
+        assert_equal expected, actual
+      end
+
+      def test_url_sub_key_with_database_url
+        ENV['DATABASE_URL'] = "NOT-POSTGRES://localhost/NOT_FOO"
+
+        config   = { "production" => { "url" => "postgres://localhost/foo" } }
+        actual   = klass.new(config).resolve
+        expected = { "production" =>
+                     { "adapter"  => "postgresql",
+                       "database" => "foo",
+                       "host"     => "localhost"
+                      }
+                   }
+        assert_equal expected, actual
+      end
+
+      def test_merge_no_conflicts_with_database_url
+        ENV['DATABASE_URL'] = "postgres://localhost/foo"
+
+        config   = {"production" => { "pool" => "5" } }
+        actual   = klass.new(config).resolve
+        expected = { "production" =>
+                     { "adapter"  => "postgresql",
+                       "database" => "foo",
+                       "host"     => "localhost",
+                       "pool"     => "5"
+                      }
+                   }
+        assert_equal expected, actual
+      end
+
+      def test_merge_conflicts_with_database_url
+        ENV['DATABASE_URL'] = "postgres://localhost/foo"
+
+        config   = {"production" => { "adapter" => "NOT-POSTGRES", "database" => "NOT-FOO", "pool" => "5" } }
+        actual   = klass.new(config).resolve
+        expected = { "production" =>
+                     { "adapter"  => "postgresql",
+                       "database" => "foo",
+                       "host"     => "localhost",
+                       "pool"     => "5"
+                      }
+                   }
+        assert_equal expected, actual
+      end
+    end
+
     class ConnectionHandlerTest < ActiveRecord::TestCase
       def setup
         @klass = Class.new(Base) { def self.name; 'klass'; end }
diff --git a/guides/source/configuring.md b/guides/source/configuring.md
index 272850d4c5..412aecadd5 100644
--- a/guides/source/configuring.md
+++ b/guides/source/configuring.md
@@ -455,14 +455,131 @@ There are a few configuration options available in Active Support:
 
 ### Configuring a Database
 
-Just about every Rails application will interact with a database. The database to use is specified in a configuration file called `config/database.yml`. If you open this file in a new Rails application, you'll see a default database configured to use SQLite3. The file contains sections for three different environments in which Rails can run by default:
+Just about every Rails application will interact with a database. You can connect to the database by setting an environment variable `ENV['DATABASE_URL']` or by using a configuration file called `config/database.yml`.
+
+Using the `config/database.yml` file you can specify all the information needed to access your database:
+
+```yaml
+development:
+  adapter: postgresql
+  database: blog_development
+  pool: 5
+```
+
+This will connect to the database named `blog_development` using the `postgresql` adapter. This same information can be stored in a URL and provided via an environment variable like this:
+
+```ruby
+> puts ENV['DATABASE_URL']
+postgresql://localhost/blog_development?pool=5
+```
+
+The `config/database.yml` file contains sections for three different environments in which Rails can run by default:
 
 * The `development` environment is used on your development/local computer as you interact manually with the application.
 * The `test` environment is used when running automated tests.
 * The `production` environment is used when you deploy your application for the world to use.
 
+If you wish, you can manually specify a URL inside of your `config/database.yml`:
+
+```
+development:
+  url: postgresql://localhost/blog_development?pool=5
+```
+
+The `config/database.yml` file can contain ERB tags `<%= %>`. Anything in the tags will be evaluated as Ruby code. You can use this to pull out data from an environment variable or to perform calculations to generate the needed connection information.
+
+
 TIP: You don't have to update the database configurations manually. If you look at the options of the application generator, you will see that one of the options is named `--database`. This option allows you to choose an adapter from a list of the most used relational databases. You can even run the generator repeatedly: `cd .. && rails new blog --database=mysql`. When you confirm the overwriting of the `config/database.yml` file, your application will be configured for MySQL instead of SQLite. Detailed examples of the common database connections are below.
 
+
+### Connection Preference
+
+Since there are two ways to set your connection (via `config/database.yml` or via an environment variable), it is important to understand how the two can interact.
+
+If you have an empty `config/database.yml` file but your `ENV['DATABASE_URL']` is present, then Rails will connect to the database via your environment variable:
+
+```
+$ cat config/database.yml
+
+$ echo $DATABASE_URL
+postgresql://localhost/my_database
+```
+
+If you have a `config/database.yml` but no `ENV['DATABASE_URL']` then this file will be used to connect to your database:
+
+```
+$ cat config/database.yml
+development:
+  adapter: postgresql
+  database: my_database
+  host: localhost
+
+$ echo $DATABASE_URL
+```
+
+If you have both `config/database.yml` and `ENV['DATABASE_URL']` set then Rails will merge the configuration together. To better understand this we must see some examples.
+
+When duplicate connection information is provided the environment variable will take precedence:
+
+```
+$ cat config/database.yml
+development:
+  adapter: sqlite3
+  database: NOT_my_database
+  host: localhost
+
+$ echo $DATABASE_URL
+postgresql://localhost/my_database
+
+$ rails runner 'puts ActiveRecord::Base.configurations'
+{"development"=>{"adapter"=>"postgresql", "host"=>"localhost", "database"=>"my_database"}}
+```
+
+Here the adapter, host, and database match the information in `ENV['DATABASE_URL']`.
+
+If non-duplicate information is provided you will get all unique values; the environment variable still takes precedence in case of any conflicts.
+
+```
+$ cat config/database.yml
+development:
+  adapter: sqlite3
+  pool: 5
+
+$ echo $DATABASE_URL
+postgresql://localhost/my_database
+
+$ rails runner 'puts ActiveRecord::Base.configurations'
+{"development"=>{"adapter"=>"postgresql", "host"=>"localhost", "database"=>"my_database", "pool"=>5}}
+```
+
+Since `pool` is not in the connection information provided by `ENV['DATABASE_URL']`, it is merged in. Since `adapter` is a duplicate, the `ENV['DATABASE_URL']` connection information wins.
+
+The only way to explicitly not use the connection information in `ENV['DATABASE_URL']` is to specify an explicit URL connection using the `"url"` sub key:
+
+```
+$ cat config/database.yml
+development:
+  url: sqlite3://localhost/NOT_my_database
+
+$ echo $DATABASE_URL
+postgresql://localhost/my_database
+
+$ rails runner 'puts ActiveRecord::Base.configurations'
+{"development"=>{"adapter"=>"sqlite3", "host"=>"localhost", "database"=>"NOT_my_database"}}
+```
+
+Here the connection information in `ENV['DATABASE_URL']` is ignored; note the different adapter and database name.
+
+Since it is possible to embed ERB in your `config/database.yml` it is best practice to explicitly show you are using the `ENV['DATABASE_URL']` to connect to your database. This is especially useful in production since you should not commit secrets like your database password into your source control (such as Git).
+
+```
+$ cat config/database.yml
+production:
+  url: <%= ENV['DATABASE_URL'] %>
+```
+
+Now the behavior is clear: we are only using the connection information in `ENV['DATABASE_URL']`.
+
 #### Configuring an SQLite3 Database
 
 Rails comes with built-in support for [SQLite3](http://www.sqlite.org), which is a lightweight serverless database application. While a busy production environment may overload SQLite, it works well for development and testing. Rails defaults to using an SQLite database when creating a new project, but you can always change it later.
diff --git a/railties/lib/rails/application/configuration.rb b/railties/lib/rails/application/configuration.rb
index 9975bb8596..e902205a13 100644
--- a/railties/lib/rails/application/configuration.rb
+++ b/railties/lib/rails/application/configuration.rb
@@ -88,17 +88,23 @@ module Rails
         end
       end
 
-      # Loads and returns the configuration of the database.
+      # Loads and returns the entire raw configuration of database from
+      # values stored in `config/database.yml`.
       def database_configuration
-        yaml = paths["config/database"].first
-        if File.exist?(yaml)
+        yaml = Pathname.new(paths["config/database"].first || "")
+
+        config = if yaml.exist?
           require "erb"
-          YAML.load ERB.new(IO.read(yaml)).result
+          YAML.load(ERB.new(yaml.read).result) || {}
         elsif ENV['DATABASE_URL']
-          nil
+          # Value from ENV['DATABASE_URL'] is set to default database connection
+          # by Active Record.
+          {}
         else
           raise "Could not load database configuration. No such file - #{yaml}"
         end
+
+        config
       rescue Psych::SyntaxError => e
         raise "YAML syntax error occurred while parsing #{paths["config/database"].first}. " \
               "Please note that YAML must be consistently indented using spaces. Tabs are not allowed. " \
diff --git a/railties/lib/rails/commands/dbconsole.rb b/railties/lib/rails/commands/dbconsole.rb
index c265ed8f36..f6d8aec30d 100644
--- a/railties/lib/rails/commands/dbconsole.rb
+++ b/railties/lib/rails/commands/dbconsole.rb
@@ -81,10 +81,11 @@ module Rails
 
     def config
       @config ||= begin
-        require APP_PATH
-        ActiveRecord::ConnectionAdapters::ConnectionSpecification::Resolver.new(
-          Rails.application.config.database_configuration || {}
-        ).resolve(ENV["DATABASE_URL"])
+        if configurations[environment].blank?
+          raise ActiveRecord::AdapterNotSpecified, "'#{environment}' database is not configured. Available configuration: #{configurations.inspect}"
+        else
+          configurations[environment]
+        end
       end
     end
 
@@ -98,6 +99,12 @@ module Rails
 
     protected
 
+    def configurations
+      require APP_PATH
+      ActiveRecord::Base.configurations = Rails.application.config.database_configuration
+      ActiveRecord::Base.configurations
+    end
+
     def parse_arguments(arguments)
       options = {}
 
diff --git a/railties/test/commands/dbconsole_test.rb b/railties/test/commands/dbconsole_test.rb
index 7ad83a8b5d..24db395e6e 100644
--- a/railties/test/commands/dbconsole_test.rb
+++ b/railties/test/commands/dbconsole_test.rb
@@ -223,7 +223,7 @@ class Rails::DBConsoleTest < ActiveSupport::TestCase
   private
 
   def app_db_config(results)
-    Rails.application.config.stubs(:database_configuration).returns(results)
+    Rails.application.config.stubs(:database_configuration).returns(results || {})
   end
 
   def dbconsole
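
The precedence matrix in the commit message can be illustrated with a small standalone sketch. This is not the code from the patch: `url_to_hash` and `merge_with_database_url` are hypothetical helpers written for illustration, the URL parsing is a simplified stand-in for `ConnectionUrlResolver`, and the "empty `database.yml` plus `DATABASE_URL`" case (which the patch handles with a default-valued hash) is left out.

```ruby
require "uri"

# Hypothetical helper (not part of Rails): expands a connection URL into a
# flat hash, mapping the "postgres" scheme to the "postgresql" adapter name.
def url_to_hash(url)
  uri = URI.parse(url)
  adapter = uri.scheme == "postgres" ? "postgresql" : uri.scheme
  {
    "adapter"  => adapter,
    "host"     => uri.host,
    "database" => uri.path.sub(%r{\A/}, "")
  }
end

# Hypothetical sketch of the merge rules, assuming string keys throughout:
#   * a "url" sub key in the YAML entry beats DATABASE_URL;
#   * whichever URL is chosen overrides "adapter", "host" and "database";
#   * keys that the URL does not define (such as "pool") are kept.
def merge_with_database_url(raw_config, database_url)
  raw_config.each_with_object({}) do |(env, values), merged|
    values = { "url" => values } if values.is_a?(String) # bare URL entry
    values = (values || {}).dup                          # do not mutate the input
    url = values.delete("url") || database_url
    merged[env] = url ? values.merge(url_to_hash(url)) : values
  end
end

config = { "production" => { "adapter" => "sqlite3", "pool" => "5" } }
p merge_with_database_url(config, "postgres://localhost/foo")
```

With the inputs above, the `production` entry should end up with the adapter, host and database taken from the URL while `pool` survives from the YAML-style hash, which is the behavior `test_merge_conflicts_with_database_url` asserts for the real class.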