Instantiating the attributes hash from raw database values is one of the
slower parts of attribute handling.
It is only necessary in order to detect mutations. In other words, it isn't
necessary until a mutation happens.
`LazyAttributeHash`, which was introduced in 0f29c21, instantiates each
attribute lazily, deferring the work until the attribute is first accessed
(i.e. `Model.find(1)` isn't slow yet, but `Model.find(1).attr_name` is still
slow).
This introduces `LazyAttributeSet` to instantiate attributes even more lazily:
it doesn't instantiate an attribute until it is first assigned or dirty
checked (i.e. `Model.find(1).attr_name` is no longer slow).
It makes attribute access about 35% faster for read-only (non-mutating)
usage.
https://gist.github.com/kamipo/4002c96a02859d8fe6503e26d7be4ad8
Before:
```
IPS
Warming up --------------------------------------
attribute access 1.000 i/100ms
Calculating -------------------------------------
attribute access 3.444 (± 0.0%) i/s - 18.000 in 5.259030s
MEMORY
Calculating -------------------------------------
attribute access 38.902M memsize ( 0.000 retained)
350.044k objects ( 0.000 retained)
15.000 strings ( 0.000 retained)
```
After (with `immutable_strings_by_default = true`):
```
IPS
Warming up --------------------------------------
attribute access 1.000 i/100ms
Calculating -------------------------------------
attribute access 4.652 (±21.5%) i/s - 23.000 in 5.034853s
MEMORY
Calculating -------------------------------------
attribute access 27.782M memsize ( 0.000 retained)
170.044k objects ( 0.000 retained)
15.000 strings ( 0.000 retained)
```
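The difference between the read path and the assign/dirty-check path described above can be illustrated with a minimal sketch. This is not the Active Model implementation: `SketchAttribute`, `SketchLazyAttributeSet`, and all of their methods are hypothetical names chosen for illustration, and type casting is reduced to plain lambdas.

```ruby
# Hypothetical stand-in for an attribute object that tracks its original value.
class SketchAttribute
  attr_accessor :value

  def initialize(original)
    @original = original
    @value = original
  end

  def changed?
    @value != @original
  end
end

# Hypothetical lazy set: reads only cast raw values; attribute objects are
# built when an attribute is first assigned or dirty checked.
class SketchLazyAttributeSet
  def initialize(raw_values, types)
    @raw_values = raw_values # e.g. { "id" => "1" }
    @types = types           # name => callable that casts a raw value
    @casted = {}             # cache of cast values for plain reads
    @attributes = {}         # attribute objects, built on demand only
  end

  # Read path: cast and cache the value without building an attribute object.
  def fetch_value(name)
    if @attributes.key?(name)
      @attributes[name].value
    else
      @casted.fetch(name) { @casted[name] = @types.fetch(name).call(@raw_values[name]) }
    end
  end

  # Write path: the attribute object is needed to remember the original value.
  def write(name, value)
    attribute(name).value = value
  end

  # Dirty checking also needs the attribute object.
  def changed?(name)
    attribute(name).changed?
  end

  # Names of the attributes that have been instantiated so far.
  def materialized
    @attributes.keys
  end

  private

  def attribute(name)
    @attributes[name] ||= SketchAttribute.new(fetch_value(name))
  end
end
```

In this sketch, calling `fetch_value("id")` leaves `materialized` empty, while `write("name", "bar")` or `changed?("name")` instantiates only the `"name"` attribute.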
Since #31827, the marshaling format of the attributes hash has been changed
to improve performance, because materializing the lazy attribute hash is too
expensive.
At that time, we kept the ability to load the legacy attributes format,
since that performance improvement was backported to 5-1-stable and
5-0-stable.
Now that all supported versions dump attributes in the new format, the
backward compatibility should no longer be needed.
* PERF: Recover marshaling dump/load performance
This performance regression, which is described in #30680, was caused by
f0ddf87, which force-materialized `LazyAttributeHash`.
Since 95b86e5, the default proc has been removed from the class, so
forcing materialization is no longer needed.
Avoiding the forced materialization recovers marshaling dump/load
performance.
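As a rough illustration of dumping without forced materialization, the sketch below marshals only the raw values plus whatever has already been materialized or assigned. It is an assumption-laden stand-in, not the Rails code: `SketchLazyHash`, `CASTERS`, and `materialized_keys` are hypothetical names, and types are reduced to symbols so that they can be marshaled.

```ruby
# Hypothetical cast table; symbols are stored in the hash so it stays dumpable.
CASTERS = {
  integer: ->(raw) { Integer(raw) },
  string:  ->(raw) { raw.to_s }
}.freeze

class SketchLazyHash
  def initialize(types, values)
    @types = types       # name => :integer / :string
    @values = values     # raw database values
    @delegate_hash = {}  # materialized or assigned values only
  end

  # Materialize a single key on first access.
  def [](name)
    @delegate_hash.fetch(name) do
      @delegate_hash[name] = CASTERS.fetch(@types.fetch(name)).call(@values[name])
    end
  end

  # Assignments live in the delegate hash, so they must survive a dump.
  def []=(name, value)
    @delegate_hash[name] = value
  end

  def materialized_keys
    @delegate_hash.keys
  end

  # Dump the internal state as-is instead of materializing every key first.
  def marshal_dump
    [@types, @values, @delegate_hash]
  end

  def marshal_load(state)
    @types, @values, @delegate_hash = state
  end
end

hash = SketchLazyHash.new(
  { "id" => :integer, "name" => :string },
  { "id" => "1", "name" => "foo" }
)
hash["name"] = "bar"
copy = Marshal.load(Marshal.dump(hash))
```

Here `copy` keeps the assigned `"name"` value, while `"id"` stays unmaterialized until it is first read.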
Benchmark:
https://gist.github.com/blimmer/1360ea51cd3147bae8aeb7c6d09bff17
Before:
```
it took 0.6248569069430232 seconds to unmarshal the objects
Total allocated: 38681544 bytes (530060 objects)
allocated memory by class
-----------------------------------
12138848 Hash
10542384 String
7920000 ActiveModel::Attribute::Uninitialized
5600000 ActiveModel::Attribute::FromDatabase
1200000 Foo
880000 ActiveModel::LazyAttributeHash
400000 ActiveModel::AttributeSet
80 Integer
72 ActiveRecord::ConnectionAdapters::SQLite3Adapter::SQLite3Integer
40 ActiveModel::Type::String
40 ActiveRecord::Type::DateTime
40 Object
40 Range
allocated objects by class
-----------------------------------
250052 String
110000 ActiveModel::Attribute::Uninitialized
70001 Hash
70000 ActiveModel::Attribute::FromDatabase
10000 ActiveModel::AttributeSet
10000 ActiveModel::LazyAttributeHash
10000 Foo
2 Integer
1 ActiveModel::Type::String
1 ActiveRecord::ConnectionAdapters::SQLite3Adapter::SQLite3Integer
1 ActiveRecord::Type::DateTime
1 Object
1 Range
```
After:
```
it took 0.1660824950085953 seconds to unmarshal the objects
Total allocated: 13883811 bytes (220090 objects)
allocated memory by class
-----------------------------------
5743371 String
4940008 Hash
1200000 Foo
880000 ActiveModel::LazyAttributeHash
720000 Array
400000 ActiveModel::AttributeSet
80 ActiveModel::Attribute::FromDatabase
80 Integer
72 ActiveRecord::ConnectionAdapters::SQLite3Adapter::SQLite3Integer
40 ActiveModel::Type::String
40 ActiveModel::Type::Value
40 ActiveRecord::Type::DateTime
40 Object
40 Range
allocated objects by class
-----------------------------------
130077 String
50004 Hash
10000 ActiveModel::AttributeSet
10000 ActiveModel::LazyAttributeHash
10000 Array
10000 Foo
2 Integer
1 ActiveModel::Attribute::FromDatabase
1 ActiveModel::Type::String
1 ActiveModel::Type::Value
1 ActiveRecord::ConnectionAdapters::SQLite3Adapter::SQLite3Integer
1 ActiveRecord::Type::DateTime
1 Object
1 Range
```
Fixes #30680.
* Keep the `@delegate_hash` to avoid losing any mutations that have been made to the record
There are two concerns which are both being combined into one here, but
both have the same goal. There are certain attributes which we want to
always consider initialized. Previously, they were handled separately.
The primary key (which is assumed to be backed by a database column)
needs to be initialized, because there is a ton of code in Active Record
that assumes `foo.id` will never raise. Additionally, we want attributes
which aren't backed by a database column to always be initialized, since
we would never receive a database value for them.
Ultimately these two concerns can be combined into one. The old
implementation hid a lot of inherent complexity, and was hard to optimize
from the outside. We can simplify things significantly by just passing
in a hash.
This has slightly different semantics from the old behavior, in that
`Foo.select(:bar).first.id` will return the default value for the
primary key, rather than `nil` unconditionally -- however, the default
value is always `nil` in practice.
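The combined behavior can be sketched roughly as follows, assuming a plain hash of defaults is passed in for the attributes that must always be initialized. `SketchAttributeLookup`, `UNINITIALIZED`, and the sample attribute names are hypothetical and only illustrate the lookup order, not the actual Active Model code.

```ruby
class SketchAttributeLookup
  # Marker for attributes that were neither selected nor given a default.
  UNINITIALIZED = Object.new.freeze

  def initialize(values, default_attributes)
    @values = values                         # columns returned by the query
    @default_attributes = default_attributes # primary key and non-column attributes
  end

  # Selected values win; otherwise fall back to the passed-in defaults.
  def [](name)
    if @values.key?(name)
      @values[name]
    elsif @default_attributes.key?(name)
      @default_attributes[name]
    else
      UNINITIALIZED
    end
  end

  def initialized?(name)
    !self[name].equal?(UNINITIALIZED)
  end
end

# Roughly what `Foo.select(:bar).first` would see in this sketch:
lookup = SketchAttributeLookup.new(
  { "bar" => "x" },
  { "id" => nil, "virtual_count" => 0 }
)
```

With these inputs, `lookup["id"]` returns the default (`nil` here) rather than an uninitialized marker, and `lookup["virtual_count"]` is initialized even though no database value exists for it.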