4 commits

Author SHA1 Message Date
Stan Hu
d092ed178f Fix background migrations failing with unused replication slot
When there is an unused replication slot, the replication lag function
will return a nil value, resulting in "NoMethodError: undefined method
`>=' for nil:NilClass" error. We now just ignore these nil values.

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/63666
2019-06-25 08:35:29 -07:00
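
The nil handling described in d092ed178f can be sketched as follows. This is a minimal illustration, not the code from the commit: `lagging_slot_count` is a hypothetical helper, the input array stands in for the per-slot lag values returned by the replication lag query, and the 100 MB limit is taken from the default mentioned in 91b752dce6 (treated here as 100 * 1024 * 1024 bytes, which is an assumption).

```ruby
# Hypothetical helper: counts the slots whose lag (in bytes) is at or above
# the limit. An unused replication slot is assumed to report nil.
def lagging_slot_count(lags, max_lag = 100 * 1024 * 1024)
  # compact drops nil entries, so `>=` is never called on nil.
  lags.compact.count { |lag| lag >= max_lag }
end

# One unused slot (nil), one healthy slot, one slot over the limit.
lags = [nil, 5 * 1024 * 1024, 200 * 1024 * 1024]
puts lagging_slot_count(lags)
```

Without the `compact` call, the first element would raise the `NoMethodError` quoted in the commit message.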
Nick Thomas
013f7cd24c Inherit from ApplicationRecord instead of ActiveRecord::Base
2019-03-28 16:18:23 +00:00
Stan Hu
fd7f95ee74 Disable replication lag check for Aurora PostgreSQL databases
Replication slots are not supported in Aurora. Attempting to check
the lag results in the message:

```
ActiveRecord::StatementInvalid: PG::FeatureNotSupported: ERROR:
Replication slots are currently not supported in Aurora : SELECT
pg_xlog_location_diff(pg_current_xlog_insert_location(),
restart_lsn)::...
```

To avoid breaking support for background migrations in Aurora, we just
disable the check if we encounter this error.

This change also now checks whether there are any replication slots
present in the primary before checking the replication lag.

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/52176
2018-11-03 07:00:31 -07:00
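
The two behaviours described in fd7f95ee74 (skip the check when no replication slots exist, and disable it when the database reports that slots are unsupported) might look roughly like this. Everything named here is hypothetical: `FeatureNotSupportedError` stands in for the driver's exception, and `slots_present` and `fetch_lags` are callables standing in for queries against the primary. Treating a single slot over the limit as unhealthy is a simplification for the sketch.

```ruby
# Hypothetical stand-in for the driver's "feature not supported" exception.
class FeatureNotSupportedError < StandardError; end

# Returns true when the replication lag is considered too great.
# slots_present and fetch_lags are hypothetical callables (lambdas).
def replication_lag_too_great?(slots_present:, fetch_lags:, max_lag: 100 * 1024 * 1024)
  # Nothing to measure if the primary has no replication slots.
  return false unless slots_present.call

  fetch_lags.call.any? { |lag| lag >= max_lag }
rescue FeatureNotSupportedError
  # Databases without replication slot support (Aurora, per the commit
  # message) raise here; the check is treated as disabled.
  false
end

# Simulated Aurora-like database: the lag query raises.
aurora_lags = -> { raise FeatureNotSupportedError, "Replication slots are currently not supported" }
puts replication_lag_too_great?(slots_present: -> { true }, fetch_lags: aurora_lags)
```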
Yorick Peterse
91b752dce6 Respond to DB health in background migrations
This changes the BackgroundMigration worker so it checks for the health
of the DB before performing a background migration. This in turn allows
us to reduce the minimum interval, without having to worry about blowing
things up if we schedule too many migrations.

In this setup, the BackgroundMigration worker will reschedule jobs as
long as the database is considered to be in an unhealthy state. Once the
database has recovered, the migration can be performed.

To determine if the database is in a healthy state, we look at the
replication lag of any replication slots defined on the primary. If the
lag is deemed too great (100 MB by default) for too many slots, the
migration is rescheduled for a later point in time.

The health checking code is hidden behind a feature flag, allowing us to
disable it if necessary.
2018-08-06 15:20:36 +02:00
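
The decision flow described in 91b752dce6 can be sketched as below. This is an illustration under stated assumptions rather than the worker's actual code: `healthy_database?` and `next_action` are hypothetical names, the feature flag is modelled as a plain boolean argument, and the commit message does not say how many lagging slots count as "too many", so `max_lagging_slots` defaults to an assumed value of 0.

```ruby
# Hypothetical health check: the database is healthy when no more than
# max_lagging_slots slots have a lag (in bytes) at or above max_lag.
def healthy_database?(lags, max_lag: 100 * 1024 * 1024, max_lagging_slots: 0)
  lags.count { |lag| lag >= max_lag } <= max_lagging_slots
end

# Decides what the worker should do with a job: run it now or put it back
# on the queue for later. health_check_enabled models the feature flag.
def next_action(lags, health_check_enabled: true)
  return :perform unless health_check_enabled

  healthy_database?(lags) ? :perform : :reschedule
end

lagging = [150 * 1024 * 1024]
puts next_action(lagging)                              # job is put back for later
puts next_action(lagging, health_check_enabled: false) # flag off: job runs anyway
```

Because an unhealthy result only reschedules the job, the migration still runs once the lag drops back under the limit.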