From 41f80e494af3ce7c8f3d6aa34f8303524a48373b Mon Sep 17 00:00:00 2001 From: Nick Kallen Date: Tue, 20 May 2008 10:11:07 -0700 Subject: [PATCH] limits and offsets need to be externalized too. first draft --- README | 82 ++++++++++++++++--- lib/arel/predicates.rb | 7 +- lib/arel/relations/operations/group.rb | 2 +- lib/arel/relations/operations/join.rb | 4 +- lib/arel/relations/operations/project.rb | 4 +- lib/arel/relations/operations/skip.rb | 4 + lib/arel/relations/operations/take.rb | 4 + lib/arel/relations/utilities/aggregation.rb | 8 +- lib/arel/relations/utilities/compound.rb | 2 +- lib/arel/relations/writes/delete.rb | 2 +- lib/arel/relations/writes/update.rb | 2 +- .../joins/with_aggregations_spec.rb | 41 ++++++---- spec/arel/unit/relations/project_spec.rb | 6 +- 13 files changed, 121 insertions(+), 47 deletions(-) diff --git a/README b/README index d870f06653..e979dbc2a3 100644 --- a/README +++ b/README @@ -18,7 +18,7 @@ Generating a query with ARel is simple. For example, in order to produce you construct a table relation and convert it to sql: - users = Arel::Table.new(:users) + users = Arel(:users) users.to_sql In fact, you will probably never call `#to_sql`. Rather, you'll work with data from the table directly. You can iterate through all rows in the `users` table like this: @@ -79,16 +79,23 @@ The `OR` operator is not yet supported. It will work like this: The `AND` operator will behave similarly. +Finally, most operations take a block form. For example: + + Arel(:users) \ + .where { |u| u[:id].eq(1) } \ + .project { |u| u[:id] } + +This provides a (sometimes) convenient alternative syntax. + ### The Crazy Features ### The examples above are fairly simple and other libraries match or come close to matching the expressiveness of Arel (e.g., `Sequel` in Ruby). - #### Complex Joins #### Where Arel really shines in its ability to handle complex joins and aggregations. As a first example, let's consider an "adjacency list", a tree represented in a table. Suppose we have a table `comments`, representing a threaded discussion: - comments = Arel::Table.new(:comments) + comments = Arel(:comments) And this table has the following attributes: @@ -101,24 +108,77 @@ The `parent_id` column is a foreign key from the `comments` table to itself. Now comments.join(replies).on(replies[:parent_id].eq(comments[:id])) # => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments_2.parent_id = comments.id -Arel will always produce a unique name for every table joined in the relation, and it will always do so deterministically to exploit query caching. Typically, the problem with automated table aliasing is that extracting data out of the result set when everything has a random name is quite hard. Arel makes this simple: to get just certain columns from the result set, treat a row like a hash: +The call to `#alias` is actually optional: Arel will always produce a unique name for every table joined in the relation, and it will always do so deterministically to exploit query caching. Explicit aliasing is more common, however. When you want to extract specific slices of data, aliased tables are a necessity. For example to get just certain columns from the row, treat a row like a hash: comments_with_replies.first[replies[:body]] This will return the first comment's reply's body. -Arel can actually perform the aliasing automatically, without the need for the programmer to explicitly call `alias`. However, this makes it difficult to specify the join condition: +If you don't need to extract the data later (for example, you're simply doing a join to find comments that have replies, you don't care what the content of the replies are), the block form may be preferable: + + comments.join(comments) { |comments, replies| replies[:parent_id].eq(comments[:id]) } + # => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments_2.parent_id = comments.id + +Note that you do NOT want to do something like: comments.join(comments, comments[:parent_id].eq(comments[:id])) # => SELECT * FROM comments INNER JOIN comments AS comments_2 WHERE comments.parent_id = comments.id -This does NOT have the same meaning as the previous query. As an alternative to aliasing, there is a convenient block form: - - comments.join(comments) { |comments, replies| replies[:parent_id].eq(comments[:id]) } - -Of course, without the `alias`, you will have a harder time extracting `replies` data from a row. +This does NOT have the same meaning as the previous query, since the comments[:parent_id] reference is effectively ambiguous. #### Complex Aggregations #### -My personal favorite feature of Arel, and certainly the most difficult to implement, is closure under joining even in the presence of aggregations. This is a feature where the Relational Algebra is fundamentally easier to use than SQL. +My personal favorite feature of Arel, certainly the most difficult to implement, and possibly only of marginal value, is **closure under joining even in the presence of aggregations**. This is a feature where the Relational Algebra is fundamentally easier to use than SQL. Think of this as a preview of the kind of radical functionality that is to come, stuff no other "ORM" is doing. +The easiest way to introduce this is in SQL. Your task is to get all users and the **count** of their associated photos. Let's start from the inside out: + + SELECT count(*) + FROM photos + GROUP BY user_id + +Now, we'd like to join this with the user table. Naively, you might try to do this: + + SELECT users.*, count(photos.id) + FROM users + LEFT OUTER JOIN photos + ON users.id = photos.id + GROUP BY photos.user_id + +Of course, this has a slightly different meaning than our intended query. This is actually a fairly advanced topic in SQL so let's see why this doesn't work *step by step*. Suppose we have these records in our `users` table: + + mysql> select * from users; + +------+--------+ + | id | name | + +------+--------+ + | 1 | hai | + | 2 | bai | + | 3 | dumpty | + +------+--------+ + +And these in the photos table: + + mysql> select * from photos; + +------+---------+-----------+ + | id | user_id | camera_id | + +------+---------+-----------+ + | 1 | 1 | 1 | + | 2 | 1 | 1 | + | 3 | 1 | 1 | + +------+---------+-----------+ + +If we perform the above, incorrect query, we get the following: + + mysql> select users.*, count(photos.id) from users left outer join photos on users.id = photos.user_id limit 3 group by user_id; + +------+------+------------------+ + | id | name | count(photos.id) | + +------+------+------------------+ + | 2 | bai | 0 | + | 1 | hai | 3 | + +------+------+------------------+ + +As you can see, we're completely missing data for user with id 3. `dumpty` has no photos, neither does `bai`. But strangely `bai` appeared and `dumpty` didn't! The reason is that the `GROUP BY` clause is aggregating on both tables, not just the `photos` table. All users without photos have a `photos.id` of `null` (thanks to the left outer join). These are rolled up together and an arbitrary user wins. In this case, `bai` not `dumpty`. + + SELECT users.*, photos_aggregation.cnt + FROM users + LEFT OUTER JOIN (SELECT user_id, count(*) as cnt FROM photos GROUP BY user_id) AS photos_aggregation + ON photos_aggregation.user_id = users.id \ No newline at end of file diff --git a/lib/arel/predicates.rb b/lib/arel/predicates.rb index f21376d4c9..a83bad3c22 100644 --- a/lib/arel/predicates.rb +++ b/lib/arel/predicates.rb @@ -3,11 +3,8 @@ module Arel end class Binary < Predicate - attr_reader :operand1, :operand2 - - def initialize(operand1, operand2) - @operand1, @operand2 = operand1, operand2 - end + attributes :operand1, :operand2 + deriving :initialize def ==(other) self.class === other and diff --git a/lib/arel/relations/operations/group.rb b/lib/arel/relations/operations/group.rb index 22af2734a6..253c4215b6 100644 --- a/lib/arel/relations/operations/group.rb +++ b/lib/arel/relations/operations/group.rb @@ -8,7 +8,7 @@ module Arel @groupings = (groupings + (block_given?? [yield(self)] : [])).collect { |g| g.bind(relation) } end - def aggregation? + def externalizable? true end end diff --git a/lib/arel/relations/operations/join.rb b/lib/arel/relations/operations/join.rb index 243a0289c7..8025db095e 100644 --- a/lib/arel/relations/operations/join.rb +++ b/lib/arel/relations/operations/join.rb @@ -39,8 +39,8 @@ module Arel end # TESTME - def aggregation? - relation1.aggregation? or relation2.aggregation? + def externalizable? + relation1.externalizable? or relation2.externalizable? end def join? diff --git a/lib/arel/relations/operations/project.rb b/lib/arel/relations/operations/project.rb index 2be87fe694..c92a9df5a5 100644 --- a/lib/arel/relations/operations/project.rb +++ b/lib/arel/relations/operations/project.rb @@ -12,8 +12,8 @@ module Arel @attributes ||= projections.collect { |p| p.bind(self) } end - def aggregation? - attributes.any?(&:aggregation?) + def externalizable? + attributes.any?(&:aggregation?) or relation.externalizable? end end end \ No newline at end of file diff --git a/lib/arel/relations/operations/skip.rb b/lib/arel/relations/operations/skip.rb index ea5df21f53..930e4c94ea 100644 --- a/lib/arel/relations/operations/skip.rb +++ b/lib/arel/relations/operations/skip.rb @@ -2,5 +2,9 @@ module Arel class Skip < Compound attributes :relation, :skipped deriving :initialize, :== + + def externalizable? + true + end end end \ No newline at end of file diff --git a/lib/arel/relations/operations/take.rb b/lib/arel/relations/operations/take.rb index 095e430417..2fd3fdf635 100644 --- a/lib/arel/relations/operations/take.rb +++ b/lib/arel/relations/operations/take.rb @@ -2,5 +2,9 @@ module Arel class Take < Compound attributes :relation, :taken deriving :initialize, :== + + def externalizable? + true + end end end \ No newline at end of file diff --git a/lib/arel/relations/utilities/aggregation.rb b/lib/arel/relations/utilities/aggregation.rb index 9f5ead8f86..bdc7650a20 100644 --- a/lib/arel/relations/utilities/aggregation.rb +++ b/lib/arel/relations/utilities/aggregation.rb @@ -1,5 +1,5 @@ module Arel - class Aggregation < Compound + class Externalization < Compound attributes :relation deriving :initialize, :== include Recursion::BaseCase @@ -17,16 +17,16 @@ module Arel end def name - relation.name + '_aggregation' + relation.name + '_external' end end class Relation def externalize - @externalized ||= aggregation?? Aggregation.new(self) : self + @externalized ||= externalizable?? Externalization.new(self) : self end - def aggregation? + def externalizable? false end end diff --git a/lib/arel/relations/utilities/compound.rb b/lib/arel/relations/utilities/compound.rb index 23a55d4b5b..a91cec1127 100644 --- a/lib/arel/relations/utilities/compound.rb +++ b/lib/arel/relations/utilities/compound.rb @@ -2,7 +2,7 @@ module Arel class Compound < Relation attr_reader :relation hash_on :relation - delegate :joins, :join?, :inserts, :taken, :skipped, :name, :aggregation?, + delegate :joins, :join?, :inserts, :taken, :skipped, :name, :externalizable?, :column_for, :engine, :table, :table_sql, :to => :relation diff --git a/lib/arel/relations/writes/delete.rb b/lib/arel/relations/writes/delete.rb index e72679c4d6..318a299b8b 100644 --- a/lib/arel/relations/writes/delete.rb +++ b/lib/arel/relations/writes/delete.rb @@ -8,7 +8,7 @@ module Arel "DELETE", "FROM #{table_sql}", ("WHERE #{wheres.collect(&:to_sql).join('\n\tAND ')}" unless wheres.blank? ), - ("LIMIT #{taken}" unless taken.blank? ), + ("LIMIT #{taken}" unless taken.blank? ), ].compact.join("\n") end diff --git a/lib/arel/relations/writes/update.rb b/lib/arel/relations/writes/update.rb index 18b7ad9de1..720b9d697d 100644 --- a/lib/arel/relations/writes/update.rb +++ b/lib/arel/relations/writes/update.rb @@ -14,7 +14,7 @@ module Arel "#{value.format(attribute)} = #{attribute.format(value)}" end.join(",\n"), ("WHERE #{wheres.collect(&:to_sql).join('\n\tAND ')}" unless wheres.blank? ), - ("LIMIT #{taken}" unless taken.blank? ) + ("LIMIT #{taken}" unless taken.blank? ) ].join("\n") end diff --git a/spec/arel/integration/joins/with_aggregations_spec.rb b/spec/arel/integration/joins/with_aggregations_spec.rb index 9e28237c3c..d429806094 100644 --- a/spec/arel/integration/joins/with_aggregations_spec.rb +++ b/spec/arel/integration/joins/with_aggregations_spec.rb @@ -16,13 +16,22 @@ module Arel end describe '#to_sql' do + it '' do + @relation1.join(@relation2.take(3)).on(@predicate).to_sql.should be_like(" + SELECT `users`.`id`, `users`.`name`, `photos_external`.`id`, `photos_external`.`user_id`, `photos_external`.`camera_id` + FROM `users` + INNER JOIN (SELECT `photos`.`id`, `photos`.`user_id`, `photos`.`camera_id` FROM `photos` LIMIT 3) AS `photos_external` + ON `users`.`id` = `photos_external`.`user_id` + ") + end + describe 'with the aggregation on the right' do it 'manufactures sql joining the left table to a derived table' do @relation1.join(@aggregation).on(@predicate).to_sql.should be_like(" - SELECT `users`.`id`, `users`.`name`, `photos_aggregation`.`user_id`, `photos_aggregation`.`cnt` + SELECT `users`.`id`, `users`.`name`, `photos_external`.`user_id`, `photos_external`.`cnt` FROM `users` - INNER JOIN (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_aggregation` - ON `users`.`id` = `photos_aggregation`.`user_id` + INNER JOIN (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_external` + ON `users`.`id` = `photos_external`.`user_id` ") end end @@ -30,10 +39,10 @@ module Arel describe 'with the aggregation on the left' do it 'manufactures sql joining the right table to a derived table' do @aggregation.join(@relation1).on(@predicate).to_sql.should be_like(" - SELECT `photos_aggregation`.`user_id`, `photos_aggregation`.`cnt`, `users`.`id`, `users`.`name` - FROM (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_aggregation` + SELECT `photos_external`.`user_id`, `photos_external`.`cnt`, `users`.`id`, `users`.`name` + FROM (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_external` INNER JOIN `users` - ON `users`.`id` = `photos_aggregation`.`user_id` + ON `users`.`id` = `photos_external`.`user_id` ") end end @@ -42,10 +51,10 @@ module Arel it 'it properly aliases the aggregations' do aggregation2 = @aggregation.alias @aggregation.join(aggregation2).on(aggregation2[:user_id].eq(@aggregation[:user_id])).to_sql.should be_like(" - SELECT `photos_aggregation`.`user_id`, `photos_aggregation`.`cnt`, `photos_aggregation_2`.`user_id`, `photos_aggregation_2`.`cnt` - FROM (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_aggregation` - INNER JOIN (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_aggregation_2` - ON `photos_aggregation_2`.`user_id` = `photos_aggregation`.`user_id` + SELECT `photos_external`.`user_id`, `photos_external`.`cnt`, `photos_external_2`.`user_id`, `photos_external_2`.`cnt` + FROM (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_external` + INNER JOIN (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` GROUP BY `photos`.`user_id`) AS `photos_external_2` + ON `photos_external_2`.`user_id` = `photos_external`.`user_id` ") end end @@ -54,10 +63,10 @@ module Arel describe 'with the aggregation on the left' do it "manufactures sql keeping wheres on the aggregation within the derived table" do @relation1.join(@aggregation.where(@aggregation[:user_id].eq(1))).on(@predicate).to_sql.should be_like(" - SELECT `users`.`id`, `users`.`name`, `photos_aggregation`.`user_id`, `photos_aggregation`.`cnt` + SELECT `users`.`id`, `users`.`name`, `photos_external`.`user_id`, `photos_external`.`cnt` FROM `users` - INNER JOIN (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` WHERE `photos`.`user_id` = 1 GROUP BY `photos`.`user_id`) AS `photos_aggregation` - ON `users`.`id` = `photos_aggregation`.`user_id` + INNER JOIN (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` WHERE `photos`.`user_id` = 1 GROUP BY `photos`.`user_id`) AS `photos_external` + ON `users`.`id` = `photos_external`.`user_id` ") end end @@ -65,10 +74,10 @@ module Arel describe 'with the aggregation on the right' do it "manufactures sql keeping wheres on the aggregation within the derived table" do @aggregation.where(@aggregation[:user_id].eq(1)).join(@relation1).on(@predicate).to_sql.should be_like(" - SELECT `photos_aggregation`.`user_id`, `photos_aggregation`.`cnt`, `users`.`id`, `users`.`name` - FROM (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` WHERE `photos`.`user_id` = 1 GROUP BY `photos`.`user_id`) AS `photos_aggregation` + SELECT `photos_external`.`user_id`, `photos_external`.`cnt`, `users`.`id`, `users`.`name` + FROM (SELECT `photos`.`user_id`, COUNT(`photos`.`id`) AS `cnt` FROM `photos` WHERE `photos`.`user_id` = 1 GROUP BY `photos`.`user_id`) AS `photos_external` INNER JOIN `users` - ON `users`.`id` = `photos_aggregation`.`user_id` + ON `users`.`id` = `photos_external`.`user_id` ") end end diff --git a/spec/arel/unit/relations/project_spec.rb b/spec/arel/unit/relations/project_spec.rb index 2a4b690dd8..93cbe5668a 100644 --- a/spec/arel/unit/relations/project_spec.rb +++ b/spec/arel/unit/relations/project_spec.rb @@ -57,16 +57,16 @@ module Arel end end - describe '#aggregation?' do + describe '#externalizable?' do describe 'when the projections are attributes' do it 'returns false' do - Project.new(@relation, @attribute).should_not be_aggregation + Project.new(@relation, @attribute).should_not be_externalizable end end describe 'when the projections include an aggregation' do it "obtains" do - Project.new(@relation, @attribute.sum).should be_aggregation + Project.new(@relation, @attribute.sum).should be_externalizable end end end