It's finally finished!!!!!!! The reason the Attributes API was kept
private in 4.2 was due to some publicly visible implementation details.
It was previously implemented by overloading `columns` and
`columns_hash`, to make them return column objects which were modified
with the attribute information.
This meant that those methods LIED! We didn't change the database
schema. We changed the attribute information on the class. That is
wrong! It should be the other way around, where schema loading just
calls the attributes API for you. And now it does!
Yes, this means that there is nothing that happens in automatic schema
loading that you couldn't manually do yourself. (There's still some
funky cases where we hit the connection adapter that I need to handle,
before we can turn off automatic schema detection entirely.)
There were a few weird test failures caused by this that had to be
fixed. The main source came from the fact that the attribute methods are
now defined in terms of `attribute_names`, which has a clause like
`return [] unless table_exists?`. I don't *think* this is an issue,
since the only place this caused failures were in a fake adapter which
didn't override `table_exists?`.
Additionally, there were a few cases where tests were failing because a
migration was run, but the model was not reloaded. I'm not sure why
these started failing from this change, I might need to clear an
additional cache in `reload_schema_from_cache`. Again, since this is not
normal usage, and it's expected that `reset_column_information` will be
called after the table is modified, I don't think it's a problem.
Still, test failures that were unrelated to the change are worrying, and
I need to dig into them further.
Finally, I spent a lot of time debugging issues with the mutex used in
`define_attribute_methods`. I think we can just remove that method
entirely, and define the attribute methods *manually* in the call to
`define_attribute`, which would simplify the code *tremendously*.
Ok. now to make this damn thing public, and work on moving it up to
Active Model.
`bound_attributes` is now used universally across the board, removing
the need for the conversion layer. These changes are mostly mechanical,
with the exception of the log subscriber. Additional, we had to
implement `hash` on the attribute objects, so they could be used as a
key for query caching.
This method can be used to see all of the fields on a model which have
been read. This can be useful during development mode to quickly find
out which fields need to be selected. For performance critical pages, if
you are not using all of the fields of a database, an easy performance
win is only selecting the fields which you need. By calling this method
at the end of a controller action, it's easy to determine which fields
need to be selected.
While writing this, I also noticed a place for an easy performance win
internally which I had been wanting to introduce. You cannot mutate a
field which you have not read. Therefore, we can skip the calculation of
in place changes if we have never read from the field. This can
significantly speed up methods like `#changed?` if any of the fields
have an expensive mutable type (like `serialize`)
```
Calculating -------------------------------------
#changed? with serialized column (before)
391.000 i/100ms
#changed? with serialized column (after)
1.514k i/100ms
-------------------------------------------------
#changed? with serialized column (before)
4.243k (± 3.7%) i/s - 21.505k
#changed? with serialized column (after)
16.789k (± 3.2%) i/s - 84.784k
```
While we don't want to change the form input when validations fail,
blindly using `_before_type_cast` will cause the input to display the
wrong data for any type which does additional work on database values.
In the case of serialized columns, we would expect the unserialized
value as input, not the serialized value. The original issue which made
this distinction, #14163, introduced a bug. If you passed serialized
input to the method, it would double serialize when it was sent to the
database. You would see the wrong input upon reloading, or get an error
if you had a specific type on the serialized column.
To put it another way, `update_column` is a special case of
`update_all`, which would take `['a']` and not `['a'].to_yaml`, but you
would not pass data from `params` to it.
Fixes#18037
Making this change revealed several subtle bugs related to models with
no primary key, and anonymous classes. These have been fixed as well,
with regression tests added.
This will make it less painful to add additional properties, which
should persist across writes, such as `name`.
Conflicts:
activerecord/lib/active_record/attribute_set.rb
Moved `Builder` to its own file, as it started looking very weird once I
added private methods to the `AttributeSet` class and the `Builder`
class started to grow.
Would like to refactor `fetch_value` to change to
```ruby
self[name].value(&block)
```
But that requires the attributes to know about their name, which they
currently do not.
Adding `# :nodoc:` to the parent `class` / `module` is not going
to ignore nested classes or modules.
There is a modifier `# :nodoc: all` but sadly the containing class
or module will continue to be in the docs.
/cc @sgrif
There's a lot more that can be moved to these, but this felt like a good
place to introduce the object. Plans are:
- Remove all knowledge of type casting from the columns, beyond a
reference to the cast_type
- Move type_cast_for_database to these objects
- Potentially make them mutable, introduce a state machine, and have
dirty checking handled here as well
- Move `attribute`, `decorate_attribute`, and anything else that
modifies types to mess with this object, not the columns hash
- Introduce a collection object to manage these, reduce allocations, and
not require serializing the types