Document not storing serialized data
This commit is contained in:
parent
228926daee
commit
3d160480c0
|
@ -49,6 +49,7 @@
|
|||
- [Adding database indexes](adding_database_indexes.md)
|
||||
- [Post Deployment Migrations](post_deployment_migrations.md)
|
||||
- [Foreign Keys & Associations](foreign_keys.md)
|
||||
- [Serializing Data](serializing_data.md)
|
||||
|
||||
## i18n
|
||||
|
||||
|
|
|
@ -0,0 +1,84 @@
|
|||
# Serializing Data
|
||||
|
||||
**Summary:** don't store serialized data in the database, use separate columns
|
||||
and/or tables instead.
|
||||
|
||||
Rails makes it possible to store serialized data in JSON, YAML or other formats.
|
||||
Such a field can be defined as follows:
|
||||
|
||||
```ruby
|
||||
class Issue < ActiveRecord::Model
|
||||
serialize :custom_fields
|
||||
end
|
||||
```
|
||||
|
||||
While it may be tempting to store serialized data in the database there are many
|
||||
problems with this. This document will outline these problems and provide an
|
||||
alternative.
|
||||
|
||||
## Serialized Data Is Less Powerful
|
||||
|
||||
When using a relational database you have the ability to query individual
|
||||
fields, change the schema, index data and so forth. When you use serialized data
|
||||
all of that becomes either very difficult or downright impossible. While
|
||||
PostgreSQL does offer the ability to query JSON fields it is mostly meant for
|
||||
very specialized use cases, and not for more general use. If you use YAML in
|
||||
turn there's no way to query the data at all.
|
||||
|
||||
## Waste Of Space
|
||||
|
||||
Storing serialized data such as JSON or YAML will end up wasting a lot of space.
|
||||
This is because these formats often include additional characters (e.g. double
|
||||
quotes or newlines) besides the data that you are storing.
|
||||
|
||||
## Difficult To Manage
|
||||
|
||||
There comes a time where you will need to add a new field to the serialized
|
||||
data, or change an existing one. Using serialized data this becomes difficult
|
||||
and very time consuming as the only way of doing so is to re-write all the
|
||||
stored values. To do so you would have to:
|
||||
|
||||
1. Retrieve the data
|
||||
1. Parse it into a Ruby structure
|
||||
1. Mutate it
|
||||
1. Serialize it back to a String
|
||||
1. Store it in the database
|
||||
|
||||
On the other hand, if one were to use regular columns adding a column would be
|
||||
as easy as this:
|
||||
|
||||
```sql
|
||||
ALTER TABLE table_name ADD COLUMN column_name type;
|
||||
```
|
||||
|
||||
Such a query would take very little to no time and would immediately apply to
|
||||
all rows, without having to re-write large JSON or YAML structures.
|
||||
|
||||
Finally, there comes a time when the JSON or YAML structure is no longer
|
||||
sufficient and you need to migrate away from it. When storing only a few rows
|
||||
this may not be a problem, but when storing millions of rows such a migration
|
||||
can easily take hours or even days to complete.
|
||||
|
||||
## Relational Databases Are Not Document Stores
|
||||
|
||||
When storing data as JSON or YAML you're essentially using your database as if
|
||||
it were a document store (e.g. MongoDB), except you're not using any of the
|
||||
powerful features provided by a typical RDBMS _nor_ are you using any of the
|
||||
features provided by a typical document store (e.g. the ability to index fields
|
||||
of documents with variable fields). In other words, it's a waste.
|
||||
|
||||
## Consistent Fields
|
||||
|
||||
One argument sometimes made in favour of serialized data is having to store
|
||||
widely varying fields and values. Sometimes this is truly the case, and then
|
||||
perhaps it might make sense to use serialized data. However, in 99% of the cases
|
||||
the fields and types stored tend to be the same for every row. Even if there is
|
||||
a slight difference you can still use separate columns and just not set the ones
|
||||
you don't need.
|
||||
|
||||
## The Solution
|
||||
|
||||
The solution is very simple: just use separate columns and/or separate tables.
|
||||
This will allow you to use all the features provided by your database, it will
|
||||
make it easier to manage and migrate the data, you'll conserve space, you can
|
||||
index the data efficiently and so forth.
|
Loading…
Reference in New Issue