Arel is a Relational Algebra for Ruby. It 1) simplifies the generation complex of SQL queries and it 2) adapts to various RDBMS systems. It is intended to be a framework framework; that is, you can build your own ORM with it, focusing on innovative object and collection modeling as opposed to database compatibility and query generation.
For the moment, Arel uses ActiveRecord's connection adapters to connect to the various engines, connection pooling, perform quoting, and do type conversion. On the horizon is the use of DataObjects instead.
The long term goal, following both LINQ and DataMapper, is to have Arel adapt to engines beyond RDBMS, including XML, IMAP, YAML, etc.
In fact, you will probably never call `#to_sql`. Rather, you'll work with data from the table directly. You can iterate through all rows in the `users` table like this:
As you can see, Arel converts the rows from the database into a hash, the values of which are sublimated to the appropriate Ruby primitive (integers, strings, and so forth).
users.group(users[:name]) # => SELECT * FROM users GROUP BY name
The best property of the Relational Algebra is its "composability", or closure under all operations. For example, to select AND project, just "chain" the method invocations:
users \
.where(users[:name].eq('amy')) \
.project(users[:id]) \
# => SELECT users.id FROM users WHERE users.name = 'amy'
All operators are chainable in this way, and they are chainable any number of times, in any order.
Where Arel really shines in its ability to handle complex joins and aggregations. As a first example, let's consider an "adjacency list", a tree represented in a table. Suppose we have a table `comments`, representing a threaded discussion:
The `parent_id` column is a foreign key from the `comments` table to itself. Now, joining a table to itself requires aliasing in SQL. In fact, you may alias in Arel as well:
The call to `#alias` is actually optional: Arel will always produce a unique name for every table joined in the relation, and it will always do so deterministically to exploit query caching. Explicit aliasing is more common, however. When you want to extract specific slices of data, aliased tables are a necessity. For example to get just certain columns from the row, treat a row like a hash:
If you don't need to extract the data later (for example, you're simply doing a join to find comments that have replies, you don't care what the content of the replies are), the block form may be preferable:
My personal favorite feature of Arel, certainly the most difficult to implement, and possibly only of marginal value, is **closure under joining even in the presence of aggregations**. This is a feature where the Relational Algebra is fundamentally easier to use than SQL. Think of this as a preview of the kind of radical functionality that is to come, stuff no other "ORM" is doing.
The easiest way to introduce this is in SQL. Your task is to get all users and the **count** of their associated photos. Let's start from the inside out:
Of course, this has a slightly different meaning than our intended query. This is actually a fairly advanced topic in SQL so let's see why this doesn't work *step by step*. Suppose we have these records in our `users` table:
As you can see, we're completely missing data for user with id 3. `dumpty` has no photos, neither does `bai`. But strangely `bai` appeared and `dumpty` didn't! The reason is that the `GROUP BY` clause is aggregating on both tables, not just the `photos` table. All users without photos have a `photos.id` of `null` (thanks to the left outer join). These are rolled up together and an arbitrary user wins. In this case, `bai` not `dumpty`.
SELECT users.*, photos_aggregation.cnt
FROM users
LEFT OUTER JOIN (SELECT user_id, count(*) as cnt FROM photos GROUP BY user_id) AS photos_aggregation