We can skip the allocation of a full `AttributeSet` by changing the
semantics of how we structure things. Instead of comparing two separate
`AttributeSet` objects, an `Attribute` is now a singly linked list of
every change that has happened to it. Since the attribute objects are
immutable, to apply the changes we simply need to copy the head of the
list.
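As a rough illustration of the linked-list idea, here is a minimal sketch in plain Ruby. `SketchAttribute`, `with_value`, `original_value` and `without_history` are hypothetical names chosen for this example. They are not the actual Active Record classes or methods.

```ruby
# Hypothetical, simplified stand-in for an attribute object.
# This is NOT the real Active Record implementation.
class SketchAttribute
  attr_reader :name, :value, :previous

  def initialize(name, value, previous = nil)
    @name = name
    @value = value
    @previous = previous # link to the attribute this one replaced
    freeze               # attribute objects are immutable
  end

  # Assignment never mutates. It returns a new head that points back at
  # the attribute it replaced, which forms a singly linked list.
  def with_value(new_value)
    self.class.new(name, new_value, self)
  end

  # Walk to the tail of the list to find the value we started with.
  def original_value
    previous ? previous.original_value : value
  end

  def changed?
    value != original_value
  end

  # "Applying" the changes copies only the head and drops its history.
  def without_history
    self.class.new(name, value)
  end
end

title  = SketchAttribute.new("title", "Draft")
edited = title.with_value("Final")
saved  = edited.without_history

puts edited.changed? # true
puts saved.changed?  # false
```

The point of the sketch is that no second set of attributes is needed to detect a change: the head of the list already knows where it came from.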
It's worth noting that this causes one subtle change in the behavior of
AR. When a record is saved successfully, the `before_type_cast` version
of everything will be what was sent to the database. I honestly think
these semantics make more sense, as we could have just as easily had the
DB do `RETURNING *` and updated the record with those if we had things
like timestamps implemented at the DB layer.
This brings our performance closer to 4.2, but we're still not quite
there.
This moves a bit more of the logic required for dirty checking into the
attribute objects. I had hoped to remove the `with_value_from_database`
stuff, but unfortunately just calling `dup` on the attribute objects
isn't enough, since the values might contain deeply nested data
structures. I think this can be cleaned up further.
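The problem with `dup` can be shown in plain Ruby, independent of Active Record. The hash below is made-up example data, and the `Marshal` round trip is only one generic way to get a deep copy. It is not what the commit does.

```ruby
original = { "tags" => ["ruby", "rails"] }

# `dup` copies only the outer hash. The nested array is still shared.
shallow = original.dup
shallow["tags"] << "orm"
puts original["tags"].inspect # ["ruby", "rails", "orm"]

# A deep copy (here via a Marshal round trip) does not share nested data.
deep = Marshal.load(Marshal.dump(original))
deep["tags"] << "sql"
puts original["tags"].inspect # ["ruby", "rails", "orm"]
```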
This makes most dirty checking become lazy, and reduces the number of
object allocations and amount of CPU time when assigning a value. This
opens the door to (but doesn't quite finish) improving the performance
of writes to a level comparable to 4.1.
This method can be used to see all of the fields on a model which have
been read. This can be useful during development mode to quickly find
out which fields need to be selected. For performance-critical pages, if
you are not using all of the fields on a record, an easy performance
win is selecting only the fields which you need. By calling this method
at the end of a controller action, it's easy to determine which fields
need to be selected.
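The read tracking described here can be sketched in plain Ruby as follows. `TrackedRecord` is a hypothetical class, and the method name `accessed_fields` is an assumption for this example, since the message does not spell the name out.

```ruby
require "set"

# Hypothetical record wrapper that remembers which fields were read.
class TrackedRecord
  def initialize(attributes)
    @attributes = attributes
    @accessed = Set.new
  end

  # Reading a field records its name before returning the value.
  def [](field)
    @accessed << field
    @attributes.fetch(field)
  end

  # Names of every field read so far, in order of first access.
  def accessed_fields
    @accessed.to_a
  end
end

post = TrackedRecord.new("id" => 1, "title" => "Hello", "body" => "Long text")
post["title"]
post["id"]
post["title"] # repeated reads are recorded once

puts post.accessed_fields.inspect # ["title", "id"]
```

In this example `body` was never read, so a query for this page could select only `id` and `title`.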
While writing this, I also noticed a place for an easy performance win
internally which I had been wanting to introduce. You cannot mutate a
field which you have not read. Therefore, we can skip the calculation of
in place changes if we have never read from the field. This can
significantly speed up methods like `#changed?` if any of the fields
have an expensive mutable type (like `serialize`).
```
Calculating -------------------------------------
#changed? with serialized column (before)
391.000 i/100ms
#changed? with serialized column (after)
1.514k i/100ms
-------------------------------------------------
#changed? with serialized column (before)
4.243k (± 3.7%) i/s - 21.505k
#changed? with serialized column (after)
16.789k (± 3.2%) i/s - 84.784k
```
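The shortcut behind these numbers can be sketched in plain Ruby. `LazyField` is a hypothetical class that uses JSON as a stand-in for an expensive serialized type. It illustrates the idea only and is not the code from the commit.

```ruby
require "json"

# Hypothetical field with an expensive, mutable, serialized value.
class LazyField
  attr_reader :deserialize_count

  def initialize(raw)
    @raw = raw
    @read = false
    @value = nil
    @deserialize_count = 0
  end

  # Deserialize on first read only, and remember that a read happened.
  def value
    unless @read
      @deserialize_count += 1
      @value = JSON.parse(@raw)
      @read = true
    end
    @value
  end

  # A field that was never read cannot have been mutated in place,
  # so the expensive comparison is skipped entirely.
  def changed_in_place?
    return false unless @read
    JSON.generate(@value) != @raw
  end
end

untouched = LazyField.new('{"a":1}')
puts untouched.changed_in_place? # false
puts untouched.deserialize_count # 0

touched = LazyField.new('{"a":1}')
touched.value["a"] = 2
puts touched.changed_in_place? # true
```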
Before this commit, returning `false` in an ActiveRecord `before_` callback
such as `before_create` would halt the callback chain.
After this commit, the behavior is deprecated: it will still work until
the next release of Rails, but will also display a deprecation warning.
The preferred way to halt a callback chain is to explicitly `throw(:abort)`.
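In a model this would look something like `before_create { throw(:abort) }`. How `throw(:abort)` halts a chain can be sketched in plain Ruby with `catch`. The `run_callbacks` helper and the callback names below are hypothetical and much simpler than the real callback machinery.

```ruby
# Hypothetical runner: executes callbacks in order until one throws :abort.
# Returns whether the whole chain completed, plus the callbacks that ran.
def run_callbacks(callbacks)
  ran = []
  completed = false
  catch(:abort) do
    callbacks.each do |name, callback|
      ran << name
      callback.call # the return value is ignored; only a throw halts
    end
    completed = true
  end
  [completed, ran]
end

callbacks = {
  validate_stock: -> { false },           # returning false does not halt
  check_credit:   -> { throw(:abort) },   # explicit halt
  send_email:     -> { true }             # never reached
}

completed, ran = run_callbacks(callbacks)
puts completed   # false
puts ran.inspect # [:validate_stock, :check_credit]
```

The explicit `throw` avoids the accidental halts that happen when a callback's last expression just happens to evaluate to `false`.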
This will make it less painful to add additional properties, which
should persist across writes, such as `name`.
Conflicts:
activerecord/lib/active_record/attribute_set.rb
There's a lot more that can be moved to these, but this felt like a good
place to introduce the object. Plans are:
- Remove all knowledge of type casting from the columns, beyond a
  reference to the cast_type
- Move type_cast_for_database to these objects
- Potentially make them mutable, introduce a state machine, and have
  dirty checking handled here as well
- Move `attribute`, `decorate_attribute`, and anything else that
  modifies types to mess with this object, not the columns hash
- Introduce a collection object to manage these, reduce allocations, and
  not require serializing the types