When an `updated_at` column exists on the model, but is not available on the instance (likely due to a select), we should raise an error rather than silently not generating a cache_version. Without this behavior it's likely that cache entries will not be able to be invalidated and this will happen without notice.
This behavior was reported and described by @lsylvester in https://github.com/rails/rails/pull/34197#issuecomment-429668759.
Currently, the `updated_at` field is used to generate a `cache_version`. Some database adapters return this timestamp value as a string that must then be converted to a Time value. This process requires a lot of memory and even more CPU time. In the case where this value is only being used for a cache version, we can skip the Time conversion by using the string value directly.
- This PR preserves existing cache format by converting a UTC string from the database to `:usec` format.
- Some databases return an already converted Time object, in those instances, we can directly use `created_at`.
- The `updated_at_before_type_cast` can be a value that comes from either the database or the user. We only want to optimize the case where it is from the database.
- If the format of the cache version has been changed, we cannot apply this optimization, and it is skipped.
- If the format of the time in the database is not UTC, then we cannot use this optimization, and it is skipped.
Some databases (notably PostgreSQL) returns a variable length nanosecond value in the time string. If the value ends in a zero, then it is truncated For instance instead of `2018-10-12 05:00:00.000000` the value `2018-10-12 05:00:00` is returned. We detect this case and pad the remaining zeros to ensure consistent cache version generation.
Before: Total allocated: 743842 bytes (6626 objects)
After: Total allocated: 702955 bytes (6063 objects)
(743842 - 702955) / 743842.0 # => 5.4% ⚡️⚡️⚡️⚡️⚡️
Using the CodeTriage application and derailed benchmarks this PR shows between 9-11% (statistically significant) performance improvement versus the commit before it.
Special thanks to @lsylvester for helping to figure out a way to preserve the usec format and for helping with many implementation details.
The default timestamp used for AR is `updated_at` in nanoseconds! (:nsec) This causes issues on any machine that runs an OS that supports nanoseconds timestamps, i.e. not-OS X, where the cache_key of the record persisted in the database (milliseconds precision) is out-of-sync with the cache_key in the ruby VM.
This commit adds:
A test that shows the issue, it can be found in the separate file `cache_key_test.rb`, because
- model couldn't be defined inline
- transactional testing needed to be turned off to get it to pass the MySQL tests
This seemed cleaner than putting it in an existing testcase file.
It adds :usec as a dateformat that calculates datetime in microseconds
It sets precision of cache_key to :usec instead of :nsec, as no db supports nsec precision on timestamps