mirror of https://github.com/rails/rails.git synced 2022-11-09 12:12:34 -05:00

Add option to skip joins for associations.

In a multiple database application, associations can't join across
databases. When set, this option tells Rails to make 2 or more queries
rather than using joins for associations.

Set the option on a has many through association:

```ruby
class Dog
  has_many :treats, through: :humans, disable_joins: true
  has_many :humans
end
```

Then instead of generating join SQL, two queries are used for `@dog.treats`:

```
SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ?  [["dog_id", 1]]
SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (?, ?, ?)  [["human_id", 1], ["human_id", 2], ["human_id", 3]]
```

This code is extracted from a gem we use internally at GitHub which
means the implementation here is used in production daily and isn't
experimental.

I often get the question "why can't Rails do this automatically" so I
figured I'd include the answer in the commit. Rails can't do this
automatically because associations are lazily loaded. `dog.treats` needs
to load `Dog`, then `Human` and then `Treats`. When `dog.treats` is
called Rails pre-generates the SQL that will be run and puts that
information into a reflection object. Because the SQL parts are pre-generated,
as soon as `dog.treats` is loaded it's too late to skip a join. The join
is already available on the object and that join is what's run to load
`treats` from `dog` through `humans`. I think the only way to avoid setting
an option on the association is to rewrite how and when the SQL is
generated for associations, which is a large undertaking. Basically, given the
way that Active Record associations are designed, it is currently
impossible for Rails to figure out on its own that it should not join (loading the association
will cause the join to occur, and that join will raise an error if the
models don't live in the same db).

The original implementation was written by me and Aaron. Lee helped port
over tests, and I refactored the extraction to better match Rails style.

Co-authored-by: Lee Quarella <leequarella@gmail.com>
Co-authored-by: Aaron Patterson <aaron@rubyonrails.org>
eileencodes 2020-11-03 13:01:41 -05:00
parent 2e14c53fc6
commit de6b4efa3e
17 changed files with 425 additions and 11 deletions

@ -1,3 +1,27 @@
* Add option to disable joins for associations.
In a multiple database application, associations can't join across
databases. When set, this option instructs Rails to generate 2 or
more queries rather than generating joins for associations.
Set the option on a has many through association:
```ruby
class Dog
has_many :treats, through: :humans, disable_joins: true
has_many :humans
end
```
Then instead of generating join SQL, two queries are used for `@dog.treats`:
```
SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ? [["dog_id", 1]]
SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (?, ?, ?) [["human_id", 1], ["human_id", 2], ["human_id", 3]]
```
*Eileen M. Uchitelle*, *Aaron Patterson*, *Lee Quarella*
* Add setting for enumerating column names in SELECT statements.
Adding a column to a PostgreSQL database, for example, while the application is running can

@ -93,6 +93,7 @@ module ActiveRecord
autoload :Relation
autoload :AssociationRelation
autoload :DisableJoinsAssociationRelation
autoload :NullRelation
autoload_under "relation" do

@ -293,6 +293,7 @@ module ActiveRecord
autoload :Preloader
autoload :JoinDependency
autoload :AssociationScope
autoload :DisableJoinsAssociationScope
autoload :AliasTracker
end
@ -1396,6 +1397,11 @@ module ActiveRecord
# of association, including other <tt>:through</tt> associations. Options for <tt>:class_name</tt>,
# <tt>:primary_key</tt> and <tt>:foreign_key</tt> are ignored, as the association uses the
# source reflection.
# [:disable_joins]
# Specifies whether joins should be skipped for an association. If set to true, two or more queries
# will be generated. Note that in some cases, if order or limit is applied, it will be done in-memory
# due to database limitations. This option is only applicable on `has_many :through` associations as
# `has_many` alone does not perform a join.
#
# If the association on the join model is a #belongs_to, the collection can be modified
# and the records on the <tt>:through</tt> model will be automatically created and removed
@ -1451,6 +1457,7 @@ module ActiveRecord
# has_many :tags, as: :taggable
# has_many :reports, -> { readonly }
# has_many :subscribers, through: :subscriptions, source: :user
# has_many :subscribers, through: :subscriptions, disable_joins: true
# has_many :comments, strict_loading: true
def has_many(name, scope = nil, **options, &extension)
reflection = Builder::HasMany.build(self, name, scope, options, &extension)

@ -33,7 +33,7 @@ module ActiveRecord
# <tt>owner</tt>, the collection of its posts as <tt>target</tt>, and
# the <tt>reflection</tt> object represents a <tt>:has_many</tt> macro.
class Association #:nodoc:
attr_reader :owner, :target, :reflection
attr_reader :owner, :target, :reflection, :disable_joins
delegate :options, to: :reflection
@ -41,6 +41,7 @@ module ActiveRecord
reflection.check_validity!
@owner, @reflection = owner, reflection
@disable_joins = @reflection.options[:disable_joins] || false
reset
reset_scope
@ -97,7 +98,9 @@ module ActiveRecord
end
def scope
if (scope = klass.current_scope) && scope.try(:proxy_association) == self
if disable_joins
DisableJoinsAssociationScope.create.scope(self)
elsif (scope = klass.current_scope) && scope.try(:proxy_association) == self
scope.spawn
elsif scope = klass.global_current_scope
target_scope.merge!(association_scope).merge!(scope)
@ -250,7 +253,11 @@ module ActiveRecord
# actually gets built.
def association_scope
if klass
@association_scope ||= AssociationScope.scope(self)
@association_scope ||= if disable_joins
DisableJoinsAssociationScope.scope(self)
else
AssociationScope.scope(self)
end
end
end

@ -11,6 +11,7 @@ module ActiveRecord::Associations::Builder # :nodoc:
valid += [:as, :foreign_type] if options[:as]
valid += [:through, :source, :source_type] if options[:through]
valid += [:ensuring_owner_was] if options[:dependent] == :destroy_async
valid += [:disable_joins] if options[:disable_joins] && options[:through]
valid
end

@ -0,0 +1,52 @@
# frozen_string_literal: true
module ActiveRecord
module Associations
class DisableJoinsAssociationScope < AssociationScope # :nodoc:
def scope(association)
source_reflection = association.reflection
owner = association.owner
unscoped = association.klass.unscoped
reverse_chain = get_chain(source_reflection, association, unscoped.alias_tracker).reverse
last_reflection, last_ordered, last_join_ids = last_scope_chain(reverse_chain, owner)
add_constraints(last_reflection, last_reflection.join_primary_key, last_join_ids, owner, last_ordered)
end
private
def last_scope_chain(reverse_chain, owner)
first_scope = [reverse_chain.shift, false, [owner.id]]
reverse_chain.inject(first_scope) do |(reflection, ordered, join_ids), next_reflection|
key = reflection.join_primary_key
records = add_constraints(reflection, key, join_ids, owner, ordered)
foreign_key = next_reflection.join_foreign_key
record_ids = records.pluck(foreign_key)
records_ordered = records && records.order_values.any?
[next_reflection, records_ordered, record_ids]
end
end
def add_constraints(reflection, key, join_ids, owner, ordered)
scope = reflection.build_scope(reflection.aliased_table).where(key => join_ids)
scope = reflection.constraints.inject(scope) do |memo, scope_chain_item|
item = eval_scope(reflection, scope_chain_item, owner)
scope.unscope!(*item.unscope_values)
scope.where_clause += item.where_clause
scope.order_values = item.order_values | scope.order_values
scope
end
if scope.order_values.empty? && ordered
split_scope = DisableJoinsAssociationRelation.create(scope.klass, key, join_ids)
split_scope.where_clause += scope.where_clause
split_scope
else
scope
end
end
end
end
end
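
The chain walk in `last_scope_chain` can be pictured with plain Ruby data. The following is only a simplified sketch, not the Active Record implementation: the `TABLES` and `CHAIN` constants and the `walk_chain` helper are hypothetical names invented for illustration, and each `select` stands in for one SQL query against a table that might live in a different database.

```ruby
# Hypothetical in-memory tables: author -> posts -> comments -> ratings.
TABLES = {
  posts:    [{ id: 1, author_id: 7 }, { id: 2, author_id: 7 }, { id: 3, author_id: 8 }],
  comments: [{ id: 10, post_id: 1 }, { id: 11, post_id: 2 }, { id: 12, post_id: 3 }],
  ratings:  [{ id: 100, comment_id: 10 }, { id: 101, comment_id: 10 }, { id: 102, comment_id: 12 }]
}.freeze

# Each step names a table and the column that points at the previous step.
CHAIN = [[:posts, :author_id], [:comments, :post_id], [:ratings, :comment_id]].freeze

# Walks the chain with inject, feeding the ids found at one step into the
# lookup for the next step. Returns the final rows and the lookup count.
def walk_chain(owner_id)
  queries = 0
  rows = CHAIN.inject([{ id: owner_id }]) do |previous_rows, (table, foreign_key)|
    ids = previous_rows.map { |row| row[:id] }
    queries += 1 # one lookup per table, never a join
    TABLES[table].select { |row| ids.include?(row[foreign_key]) }
  end
  [rows, queries]
end

rows, queries = walk_chain(7)
rows.map { |r| r[:id] } # => [100, 101]
queries                 # => 3
```

A through-a-through association therefore costs one lookup per table in the chain, which is consistent with the tests in this commit that expect three queries for `no_joins_ratings`.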

@ -214,6 +214,7 @@ module ActiveRecord
def find_target
return [] unless target_reflection_has_associated_record?
return scope.to_a if disable_joins
super
end

@ -97,6 +97,8 @@ module ActiveRecord
scope = through_reflection.klass.unscoped
options = reflection.options
return scope if options[:disable_joins]
values = reflection_scope.values
if annotations = values[:annotate]
scope.annotate!(*annotations)

@ -0,0 +1,41 @@
# frozen_string_literal: true
module ActiveRecord
class DisableJoinsAssociationRelation < Relation # :nodoc:
TOO_MANY_RECORDS = 5000
attr_reader :ids, :key
def initialize(klass, key, ids)
@ids = ids.uniq
@key = key
super(klass)
end
def limit(value)
records.take(value)
end
def first(limit = nil)
if limit
records.limit(limit).first
else
records.first
end
end
def load
super
records = @records
records_by_id = records.group_by do |record|
record[key]
end
records = ids.flat_map { |id| records_by_id[id.to_i] }
records.compact!
@records = records
end
end
end
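
The in-memory reordering done by `load` can be tried in isolation. This is a minimal sketch assuming plain `Struct` objects in place of Active Record models; `Member` and `reorder_by_ids` are hypothetical names used only for this example.

```ruby
# Hypothetical record type; in Rails these would be Active Record objects.
Member = Struct.new(:id, :name)

# Returns `records` arranged to follow the order of `ids`, using the same
# group_by / flat_map / compact steps as `load` above.
def reorder_by_ids(records, ids, key)
  records_by_id = records.group_by { |record| record[key] }
  # Ids with no matching record yield nil, which compact drops.
  ids.uniq.flat_map { |id| records_by_id[id.to_i] }.compact
end

loaded = [Member.new(1, "a"), Member.new(2, "b"), Member.new(3, "c")]
ordered = reorder_by_ids(loaded, [3, 1, 99, 2], :id)
ordered.map(&:id) # => [3, 1, 2]
```

Because the order comes from the list of ids produced by the previous query, the final query does not need an `ORDER BY`, which matches what `test_first_and_scope_in_double_join_applies_order_in_memory` checks.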

@ -15,7 +15,8 @@ module ActiveRecord
[
ActiveRecord::Relation,
ActiveRecord::Associations::CollectionProxy,
ActiveRecord::AssociationRelation
ActiveRecord::AssociationRelation,
ActiveRecord::DisableJoinsAssociationRelation
].each do |klass|
delegate = Class.new(klass) {
include ClassSpecificRelation

@ -0,0 +1,196 @@
# frozen_string_literal: true
require "cases/helper"
require "models/post"
require "models/author"
require "models/comment"
require "models/rating"
require "models/member"
require "models/member_type"
require "models/pirate"
require "models/treasure"
require "models/hotel"
require "models/department"
class HasManyThroughDisableJoinsAssociationsTest < ActiveRecord::TestCase
fixtures :posts, :authors, :comments, :pirates
def setup
@author = authors(:mary)
@post = @author.posts.create(title: "title", body: "body")
@member_type = MemberType.create(name: "club")
@member = Member.create(member_type: @member_type)
@comment = @post.comments.create(body: "text", origin: @member)
@post2 = @author.posts.create(title: "title", body: "body")
@member2 = Member.create(member_type: @member_type)
@comment2 = @post2.comments.create(body: "text", origin: @member2)
@rating1 = @comment.ratings.create(value: 8)
@rating2 = @comment.ratings.create(value: 9)
end
def test_counting_on_disable_joins_through
assert_equal @author.comments.count, @author.no_joins_comments.count
assert_queries(2) { @author.no_joins_comments.count }
assert_queries(1) { @author.comments.count }
end
def test_counting_on_disable_joins_through_using_custom_foreign_key
assert_equal @author.comments_with_foreign_key.count, @author.no_joins_comments_with_foreign_key.count
assert_queries(2) { @author.no_joins_comments_with_foreign_key.count }
assert_queries(1) { @author.comments_with_foreign_key.count }
end
def test_pluck_on_disable_joins_through
assert_equal @author.comments.pluck(:id), @author.no_joins_comments.pluck(:id)
assert_queries(2) { @author.no_joins_comments.pluck(:id) }
assert_queries(1) { @author.comments.pluck(:id) }
end
def test_pluck_on_disable_joins_through_using_custom_foreign_key
assert_equal @author.comments_with_foreign_key.pluck(:id), @author.no_joins_comments_with_foreign_key.pluck(:id)
assert_queries(2) { @author.no_joins_comments_with_foreign_key.pluck(:id) }
assert_queries(1) { @author.comments_with_foreign_key.pluck(:id) }
end
def test_fetching_on_disable_joins_through
assert_equal @author.comments.first.id, @author.no_joins_comments.first.id
assert_queries(2) { @author.no_joins_comments.first.id }
assert_queries(1) { @author.comments.first.id }
end
def test_fetching_on_disable_joins_through_using_custom_foreign_key
assert_equal @author.comments_with_foreign_key.first.id, @author.no_joins_comments_with_foreign_key.first.id
assert_queries(2) { @author.no_joins_comments_with_foreign_key.first.id }
assert_queries(1) { @author.comments_with_foreign_key.first.id }
end
def test_to_a_on_disable_joins_through
assert_equal @author.comments.to_a, @author.no_joins_comments.to_a
@author.reload
assert_queries(2) { @author.no_joins_comments.to_a }
assert_queries(1) { @author.comments.to_a }
end
def test_appending_on_disable_joins_through
assert_difference(->() { @author.no_joins_comments.reload.size }) do
@post.comments.create(body: "text")
end
assert_queries(2) { @author.no_joins_comments.reload.size }
assert_queries(1) { @author.comments.reload.size }
end
def test_appending_on_disable_joins_through_using_custom_foreign_key
assert_difference(->() { @author.no_joins_comments_with_foreign_key.reload.size }) do
@post.comments.create(body: "text")
end
assert_queries(2) { @author.no_joins_comments_with_foreign_key.reload.size }
assert_queries(1) { @author.comments_with_foreign_key.reload.size }
end
def test_empty_on_disable_joins_through
empty_author = authors(:bob)
assert_equal [], assert_queries(0) { empty_author.comments.all }
assert_equal [], assert_queries(1) { empty_author.no_joins_comments.all }
end
def test_empty_on_disable_joins_through_using_custom_foreign_key
empty_author = authors(:bob)
assert_equal [], assert_queries(0) { empty_author.comments_with_foreign_key.all }
assert_equal [], assert_queries(1) { empty_author.no_joins_comments_with_foreign_key.all }
end
def test_pluck_on_disable_joins_through_a_through
rating_ids = Rating.where(comment: @comment).pluck(:id)
assert_equal rating_ids, assert_queries(1) { @author.ratings.pluck(:id) }
assert_equal rating_ids, assert_queries(3) { @author.no_joins_ratings.pluck(:id) }
end
def test_count_on_disable_joins_through_a_through
ratings_count = Rating.where(comment: @comment).count
assert_equal ratings_count, assert_queries(1) { @author.ratings.count }
assert_equal ratings_count, assert_queries(3) { @author.no_joins_ratings.count }
end
def test_count_on_disable_joins_using_relation_with_scope
assert_equal 2, assert_queries(1) { @author.good_ratings.count }
assert_equal 2, assert_queries(3) { @author.no_joins_good_ratings.count }
end
def test_to_a_on_disable_joins_with_multiple_scopes
assert_equal [@rating1, @rating2], assert_queries(1) { @author.good_ratings.to_a }
assert_equal [@rating1, @rating2], assert_queries(3) { @author.no_joins_good_ratings.to_a }
end
def test_preloading_has_many_through_disable_joins
assert_queries(3) { Author.all.preload(:good_ratings).map(&:good_ratings) }
assert_queries(4) { Author.all.preload(:no_joins_good_ratings).map(&:good_ratings) }
end
def test_polymorphic_disable_joins_through_counting
assert_equal 2, assert_queries(1) { @author.ordered_members.count }
assert_equal 2, assert_queries(3) { @author.no_joins_ordered_members.count }
end
def test_polymorphic_disable_joins_through_ordering
assert_equal [@member2, @member], assert_queries(1) { @author.ordered_members.to_a }
assert_equal [@member2, @member], assert_queries(3) { @author.no_joins_ordered_members.to_a }
end
def test_polymorphic_disable_joins_through_reordering
assert_equal [@member, @member2], assert_queries(1) { @author.ordered_members.reorder(id: :asc).to_a }
assert_equal [@member, @member2], assert_queries(3) { @author.no_joins_ordered_members.reorder(id: :asc).to_a }
end
def test_polymorphic_disable_joins_through_ordered_scopes
assert_equal [@member2, @member], assert_queries(1) { @author.ordered_members.unnamed.to_a }
assert_equal [@member2, @member], assert_queries(3) { @author.no_joins_ordered_members.unnamed.to_a }
end
def test_polymorphic_disable_joins_through_ordered_chained_scopes
member3 = Member.create(member_type: @member_type)
member4 = Member.create(member_type: @member_type, name: "named")
@post2.comments.create(body: "text", origin: member3)
@post2.comments.create(body: "text", origin: member4)
assert_equal [member3, @member2, @member], assert_queries(1) { @author.ordered_members.unnamed.with_member_type_id(@member_type.id).to_a }
assert_equal [member3, @member2, @member], assert_queries(3) { @author.no_joins_ordered_members.unnamed.with_member_type_id(@member_type.id).to_a }
end
def test_polymorphic_disable_joins_through_ordered_scope_limits
assert_equal [@member2], assert_queries(1) { @author.ordered_members.unnamed.limit(1).to_a }
assert_equal [@member2], assert_queries(3) { @author.no_joins_ordered_members.unnamed.limit(1).to_a }
end
def test_polymorphic_disable_joins_through_ordered_scope_first
assert_equal @member2, assert_queries(1) { @author.ordered_members.unnamed.first }
assert_equal @member2, assert_queries(3) { @author.no_joins_ordered_members.unnamed.first }
end
def test_order_applied_in_double_join
assert_equal [@member2, @member], assert_queries(1) { @author.members.to_a }
assert_equal [@member2, @member], assert_queries(3) { @author.no_joins_members.to_a }
end
def test_first_and_scope_applied_in_double_join
assert_equal @member2, assert_queries(1) { @author.members.unnamed.first }
assert_equal @member2, assert_queries(3) { @author.no_joins_members.unnamed.first }
end
def test_first_and_scope_in_double_join_applies_order_in_memory
disable_joins_sql = capture_sql { @author.no_joins_members.unnamed.first }
assert_no_match(/ORDER BY/, disable_joins_sql.last)
end
def test_limit_and_scope_applied_in_double_join
assert_equal [@member2], assert_queries(1) { @author.members.unnamed.limit(1).to_a }
assert_equal [@member2], assert_queries(3) { @author.no_joins_members.unnamed.limit(1) }
end
def test_limit_and_scope_in_double_join_applies_limit_in_memory
disable_joins_sql = capture_sql { @author.no_joins_members.unnamed.first }
assert_no_match(/LIMIT 1/, disable_joins_sql.last)
end
end

@ -20,6 +20,50 @@ class Author < ActiveRecord::Base
Rating.joins(:comment).merge(self)
end
end
has_many :comments_with_order, -> { ordered_by_post_id }, through: :posts, source: :comments
has_many :no_joins_comments, through: :posts, disable_joins: true, source: :comments
has_many :comments_with_foreign_key, through: :posts, source: :comments, foreign_key: :post_id
has_many :no_joins_comments_with_foreign_key, through: :posts, disable_joins: true, source: :comments, foreign_key: :post_id
has_many :members,
through: :comments_with_order,
source: :origin,
source_type: "Member"
has_many :no_joins_members,
through: :comments_with_order,
source: :origin,
source_type: "Member",
disable_joins: true
has_many :ordered_members,
-> { order(id: :desc) },
through: :comments_with_order,
source: :origin,
source_type: "Member"
has_many :no_joins_ordered_members,
-> { order(id: :desc) },
through: :comments_with_order,
source: :origin,
source_type: "Member",
disable_joins: true
has_many :ratings, through: :comments
has_many :good_ratings,
-> { where("ratings.value > 5") },
through: :comments,
source: :ratings
has_many :no_joins_ratings, through: :no_joins_comments, disable_joins: true, source: :ratings
has_many :no_joins_good_ratings,
-> { where("ratings.value > 5") },
through: :comments,
source: :ratings,
disable_joins: true
has_many :comments_containing_the_letter_e, through: :posts, source: :comments
has_many :comments_with_order_and_conditions, -> { order("comments.body").where("comments.body like 'Thank%'") }, through: :posts, source: :comments
has_many :comments_with_include, -> { includes(:post).where(posts: { type: "Post" }) }, through: :posts, source: :comments

@ -10,10 +10,12 @@ class Comment < ActiveRecord::Base
scope :for_first_post, -> { where(post_id: 1) }
scope :for_first_author, -> { joins(:post).where("posts.author_id" => 1) }
scope :created, -> { all }
scope :ordered_by_post_id, -> { order("comments.post_id DESC") }
belongs_to :post, counter_cache: true
belongs_to :author, polymorphic: true
belongs_to :resource, polymorphic: true
belongs_to :origin, polymorphic: true
belongs_to :company, foreign_key: "company"
has_many :ratings

@ -37,6 +37,9 @@ class Member < ActiveRecord::Base
belongs_to :admittable, polymorphic: true
has_one :premium_club, through: :admittable
scope :unnamed, -> { where(name: nil) }
scope :with_member_type_id, -> (id) { where(member_type_id: id) }
end
class SelfMember < ActiveRecord::Base

@ -30,6 +30,7 @@ class Post < ActiveRecord::Base
scope :containing_the_letter_a, -> { where("body LIKE '%a%'") }
scope :titled_with_an_apostrophe, -> { where("title LIKE '%''%'") }
scope :ranked_by_comments, -> { order(table[:comments_count].desc) }
scope :ordered_by_post_id, -> { order("posts.post_id ASC") }
scope :limit_by, lambda { |l| limit(l) }
scope :locked, -> { lock }

@ -235,6 +235,8 @@ ActiveRecord::Schema.define do
# See #14855.
t.string :resource_id
t.string :resource_type
t.integer :origin_id
t.string :origin_type
t.integer :developer_id
t.datetime :updated_at
t.datetime :deleted_at

@ -32,7 +32,6 @@ databases
The following features are not (yet) supported:
* Automatic swapping for horizontal sharding
* Joining across clusters
* Load balancing replicas
* Dumping schema caches for multiple databases
@ -460,6 +459,42 @@ end
`ActiveRecord::Base.connected_to` maintains the ability to switch
connections globally.
### Handling associations with joins across databases
As of Rails 7.0, Active Record has an option for handling associations that would perform
a join across multiple databases. If you have a has many through association for which you want to
disable joins and perform 2 or more queries instead, pass the `disable_joins: true` option.
For example:
```ruby
class Dog < AnimalsRecord
has_many :treats, through: :humans, disable_joins: true
has_many :humans
end
```
Previously, calling `@dog.treats` without `disable_joins` would raise an error because databases are unable
to handle joins across clusters. With the `disable_joins` option, Rails will generate multiple select queries
to avoid attempting to join across clusters. For the above association, `@dog.treats` would generate the
following SQL:
```sql
SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ? [["dog_id", 1]]
SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (?, ?, ?) [["human_id", 1], ["human_id", 2], ["human_id", 3]]
```
There are some important things to be aware of with this option:
1) There may be performance implications since two or more queries will now be performed (depending
on the association) rather than a join. If the select for `humans` returns a large number of IDs,
the select for `treats` may have to send too many IDs in its `IN` clause.
2) Since joins are no longer performed, a query with an order or limit is now sorted in memory, because
the order from one table cannot be applied to another table.
3) This setting must be added to every association for which you want joins disabled.
Rails can't guess this for you because association loading is lazy: to load `treats` in `@dog.treats`,
Rails already needs to know what SQL should be generated.
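
To make the two-query behavior concrete without a database, the lookup can be mimicked in plain Ruby. This is an illustrative sketch only: the `HUMANS` and `TREATS` arrays and the `treats_for_dog` helper are hypothetical stand-ins for two tables in separate databases, not part of Active Record.

```ruby
# Plain-Ruby stand-ins for two tables that live in different databases.
# The data and the helper name are hypothetical, for illustration only.
HUMANS = [
  { id: 1, dog_id: 1 },
  { id: 2, dog_id: 1 },
  { id: 3, dog_id: 2 }
].freeze

TREATS = [
  { id: 10, human_id: 1, name: "biscuit" },
  { id: 11, human_id: 2, name: "jerky" },
  { id: 12, human_id: 3, name: "bone" }
].freeze

# Mimics `@dog.treats` with `disable_joins: true`: one lookup per table.
def treats_for_dog(dog_id)
  # Query 1: SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ?
  human_ids = HUMANS.select { |h| h[:dog_id] == dog_id }.map { |h| h[:id] }

  # Query 2: SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (...)
  TREATS.select { |t| human_ids.include?(t[:human_id]) }
end

treats_for_dog(1).map { |t| t[:name] } # => ["biscuit", "jerky"]
```

The size of `human_ids` is what the first caveat is about: every ID found by the first lookup has to be passed to the second one.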
## Caveats
### Automatic swapping for horizontal sharding
@ -475,12 +510,6 @@ dependent on your infrastructure. We may implement basic, primitive load balanci
in the future, but for an application at scale this should be something your application
handles outside of Rails.
### Joining Across Databases
Applications cannot join across databases. At the moment applications will need to
manually write two selects and split the joins themselves. In a future version Rails
will split the joins for you.
### Schema Cache
If you use a schema cache and multiple databases, you'll need to write an initializer