# Migration Style Guide

When writing migrations for GitLab, you have to take into account that
these will be run by hundreds of thousands of organizations of all sizes, some with
many years of data in their database.

In addition, having to take a server offline for an upgrade, small or big, is
a big burden for most organizations. For this reason it is important that your
migrations are written carefully, can be applied online, and adhere to the style guide below.

Migrations should not require GitLab installations to be taken offline unless
_absolutely_ necessary - see the ["What Requires Downtime?"](what_requires_downtime.md)
page. If a migration requires downtime, this should be clearly mentioned during
the review process, as well as being documented in the monthly release post. For
more information, see the "Downtime Tagging" section below.

When writing your migrations, also consider that databases might have stale data
or inconsistencies, and guard against that. Try to make as few assumptions as possible
about the state of the database.

Please don't depend on GitLab-specific code since it can change in future versions.
If needed, copy-paste GitLab code into the migration to make it forward compatible.

## Downtime Tagging

Every migration must specify whether it requires downtime, and if it does, it
must also specify a reason. To do so, add the following two constants to the
migration class's body:

* `DOWNTIME`: a boolean that when set to `true` indicates the migration requires
  downtime.
* `DOWNTIME_REASON`: a String containing the reason for the migration requiring
  downtime. This constant **must** be set when `DOWNTIME` is set to `true`.

For example:

```ruby
class MyMigration < ActiveRecord::Migration
  DOWNTIME = true
  DOWNTIME_REASON = 'This migration requires downtime because ...'

  def change
    ...
  end
end
```

It is an error (that is, CI will fail) if the `DOWNTIME` constant is missing
from a migration class.
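
The tagging rule can be pictured with a small plain-Ruby sketch. This is not GitLab's actual CI check: the `downtime_tagging_errors` helper and the three example classes are hypothetical names introduced here only to illustrate the rule, and they are plain classes so the sketch runs without Rails.

```ruby
# Hypothetical helper: lists problems with a class's downtime tagging.
# It only illustrates the rule described above; it is not GitLab's check.
def downtime_tagging_errors(klass)
  unless klass.const_defined?(:DOWNTIME, false)
    return ['DOWNTIME constant is missing']
  end

  errors = []

  if klass.const_get(:DOWNTIME, false)
    reason = if klass.const_defined?(:DOWNTIME_REASON, false)
               klass.const_get(:DOWNTIME_REASON, false)
             end

    if reason.to_s.strip.empty?
      errors << 'DOWNTIME_REASON must be set when DOWNTIME is true'
    end
  end

  errors
end

# Stand-ins for migration classes (hypothetical).
class OnlineMigration
  DOWNTIME = false
end

class OfflineMigrationWithoutReason
  DOWNTIME = true
end

class UntaggedMigration
end

[OnlineMigration, OfflineMigrationWithoutReason, UntaggedMigration].each do |klass|
  errors = downtime_tagging_errors(klass)
  puts "#{klass.name}: #{errors.empty? ? 'ok' : errors.join(', ')}"
end
```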

## Reversibility

Your migration should be reversible. This is very important, as it should
be possible to downgrade in case of a vulnerability or bugs.

In your migration, add a comment describing how the reversibility of the
migration was tested.

## Removing indices

If you need to remove an index, please add a condition as in the following example:

```ruby
remove_index :namespaces, column: :name if index_exists?(:namespaces, :name)
```

## Adding indices

If you need to add a unique index, keep in mind that duplicates may already exist. If possible, write a separate migration to handle them. Depending on the situation, this can mean simply removing the duplicates, or removing them and rewriting all references so that they point at the remaining row.
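
One way to reason about the clean-up step is to decide, for each duplicated value, which row survives and which rows are removed or re-pointed. The sketch below does this on in-memory data in plain Ruby. The `plan_deduplication` helper and the sample `rows` are hypothetical and stand in for whatever query your migration runs; keeping the lowest id is an assumption made for the example, not a rule.

```ruby
# Hypothetical helper: for each duplicated value of `key`, keep the row with
# the lowest id and list the remaining ids as duplicates to remove or re-point.
def plan_deduplication(rows, key)
  duplicated = rows.group_by { |row| row[key] }
                   .select { |_value, group| group.size > 1 }

  duplicated.map do |value, group|
    ids = group.map { |row| row[:id] }.sort
    { value: value, keep: ids.first, remove: ids.drop(1) }
  end
end

# Sample data standing in for the result of a query (hypothetical).
rows = [
  { id: 1, name: 'ruby' },
  { id: 2, name: 'rails' },
  { id: 3, name: 'ruby' },
  { id: 4, name: 'ruby' }
]

plan_deduplication(rows, :name).each do |entry|
  puts "#{entry[:value]}: keep #{entry[:keep]}, remove #{entry[:remove].join(', ')}"
end
```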

When adding an index make sure to use the method `add_concurrent_index` instead
of the regular `add_index` method. The `add_concurrent_index` method
automatically creates concurrent indexes when using PostgreSQL, removing the
need for downtime. To use this method you must disable transactions by calling
the method `disable_ddl_transaction!` in the body of your migration class like
so:

```ruby
class MyMigration < ActiveRecord::Migration
  include Gitlab::Database::MigrationHelpers
  disable_ddl_transaction!

  def change
    # Call add_concurrent_index here
  end
end
```

## Adding Columns With Default Values

When adding columns with default values you should use the method
`add_column_with_default`. This method ensures the table is updated without
requiring downtime. This method is not reversible so you must manually define
the `up` and `down` methods in your migration class.

For example, to add the column `foo` to the `projects` table with a default
value of `10` you'd write the following:

```ruby
class MyMigration < ActiveRecord::Migration
  include Gitlab::Database::MigrationHelpers
  disable_ddl_transaction!

  def up
    add_column_with_default(:projects, :foo, :integer, default: 10)
  end

  def down
    remove_column(:projects, :foo)
  end
end
```


## Integer column type

By default, an integer column can hold up to a 4-byte (32-bit) number. That is
a max value of 2,147,483,647. Be aware of this when creating a column that will
hold file sizes in byte units. If you are tracking file size in bytes this
restricts the maximum file size to just over 2GB.

To allow an integer column to hold up to an 8-byte (64-bit) number, explicitly
set the limit to 8 bytes. This will allow the column to hold a value up to
9,223,372,036,854,775,807.

Rails migration example:

```ruby
add_column_with_default(:projects, :foo, :integer, default: 10, limit: 8)

# or

add_column(:projects, :foo, :integer, default: 10, limit: 8)
```
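
The two maximum values quoted above follow from the column width. As a quick check in plain Ruby, assuming signed integer columns (which matches the figures in this section), and using a hypothetical `max_signed_integer` helper:

```ruby
# Largest value a signed integer of the given byte width can hold.
# One bit is reserved for the sign, hence the "- 1" in the exponent.
def max_signed_integer(bytes)
  2**(bytes * 8 - 1) - 1
end

puts max_signed_integer(4)
puts max_signed_integer(8)

# A 3 GiB file size in bytes does not fit in a 4-byte column,
# but fits comfortably in an 8-byte one.
file_size = 3 * 1024**3
puts file_size > max_signed_integer(4)
puts file_size <= max_signed_integer(8)
```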

## Testing

Make sure that your migration works with both MySQL and PostgreSQL on a database that contains data. An empty database does not guarantee that your migration is correct.

Make sure your migration can be reversed.

## Data migration

Please prefer Arel and plain SQL over the usual ActiveRecord syntax. When using plain SQL, you need to quote all input manually with the `quote_string` helper.

Example with Arel:

```ruby
users = Arel::Table.new(:users)
users.group(users[:user_id]).having(users[:id].count.gt(5))

# Update other tables with these results
```

Example with plain SQL and the `quote_string` helper:

```ruby
select_all("SELECT name, COUNT(id) as cnt FROM tags GROUP BY name HAVING COUNT(id) > 1").each do |tag|
  tag_name = quote_string(tag["name"])
  duplicate_ids = select_all("SELECT id FROM tags WHERE name = '#{tag_name}'").map { |row| row["id"] }
  origin_tag_id = duplicate_ids.first
  duplicate_ids.delete origin_tag_id

  execute("UPDATE taggings SET tag_id = #{origin_tag_id} WHERE tag_id IN(#{duplicate_ids.join(",")})")
  execute("DELETE FROM tags WHERE id IN(#{duplicate_ids.join(",")})")
end
```
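
To see why manual quoting matters, here is a plain-Ruby sketch that uses a simplified stand-in for `quote_string`. The `naive_quote_string` helper is hypothetical: it assumes escaping works by doubling backslashes and single quotes, and real database adapters may escape differently, so migrations should always call the adapter's own `quote_string`.

```ruby
# Hypothetical, simplified stand-in for the adapter's quote_string helper.
# Assumption: escaping doubles backslashes and single quotes.
def naive_quote_string(value)
  value.gsub('\\') { '\\\\' }.gsub("'") { "''" }
end

tag_name = "rock'n'roll"

unquoted = "SELECT id FROM tags WHERE name = '#{tag_name}'"
quoted   = "SELECT id FROM tags WHERE name = '#{naive_quote_string(tag_name)}'"

puts unquoted # the embedded quotes end the SQL string literal early
puts quoted   # the embedded quotes are escaped by doubling them
```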