Merge branch '58738-hashed-storage-document-rollback-mechanism' into 'master'

Hashed Storage: Document Rollback mechanism

Closes #58738

See merge request gitlab-org/gitlab-ce!25960

commit 09669e2c3c

@@ -34,17 +34,59 @@ export ID_FROM=20
export ID_TO=50
```

You can monitor the progress in the _Admin > Monitoring > Background jobs_ screen.
There is a specific Queue you can watch to see how long it will take to finish: **project_migrate_hashed_storage**
You can monitor the progress in the **Admin Area > Monitoring > Background Jobs** page.
There is a specific Queue you can watch to see how long it will take to finish:
`hashed_storage:hashed_storage_project_migrate`

After it reaches zero, you can confirm every project has been migrated by running the commands below.
If you find it necessary, you can run this migration script again to schedule missing projects.

Any error or warning will be logged in the sidekiq's log file.
Any error or warning will be logged in Sidekiq's log file.

You only need the `gitlab:storage:migrate_to_hashed` rake task to migrate your repositories, but we have additional
commands below that help you inspect projects and attachments in both legacy and hashed storage.

## Rollback from Hashed storage to Legacy storage

If you need to roll back the storage migration for any reason, you can follow the steps described here.

NOTE: **Note:** Hashed Storage will be required in a future version of GitLab.

To prevent new projects from being created in the Hashed storage,
you need to undo the [enable hashed storage][storage-migration] changes.

The following task schedules all your existing projects and associated attachments to be rolled back to the
Legacy storage type.

For Omnibus installations, run the following:

```bash
sudo gitlab-rake gitlab:storage:rollback_to_legacy
```

For source installations, run the following:

```bash
sudo -u git -H bundle exec rake gitlab:storage:rollback_to_legacy RAILS_ENV=production
```

Both commands accept a range as environment variables:

```bash
# to roll back any migrated project from ID 20 to 50.
export ID_FROM=20
export ID_TO=50
```
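
As a rough sketch of how the range could be checked before scheduling anything, the hypothetical `rollback_command` helper below only prints the Omnibus command instead of running it. The helper name and its validation rules are assumptions made for illustration; the task name and the `ID_FROM`/`ID_TO` variables are the ones described above.

```bash
# Sketch only: rollback_command is a hypothetical helper, not part of GitLab.
# It prints the Omnibus rollback command for an optional project ID range.
rollback_command() {
  local from="${1:-}" to="${2:-}"
  local cmd="sudo gitlab-rake gitlab:storage:rollback_to_legacy"

  # No range given: the task applies to every project.
  if [ -z "$from" ] && [ -z "$to" ]; then
    echo "$cmd"
    return 0
  fi

  # Reject empty values, non-digits, and leading zeros.
  case "$from" in
    ''|0*|*[!0-9]*) echo "ID_FROM must be a positive integer" >&2; return 1 ;;
  esac
  case "$to" in
    ''|0*|*[!0-9]*) echo "ID_TO must be a positive integer" >&2; return 1 ;;
  esac

  if [ "$from" -gt "$to" ]; then
    echo "ID_FROM must not be greater than ID_TO" >&2
    return 1
  fi

  echo "$cmd ID_FROM=$from ID_TO=$to"
}

# Print the command for project IDs 20 to 50.
rollback_command 20 50
```

Printing the command rather than executing it keeps the sketch safe to try on any machine; running the result is left to the administrator.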

You can monitor the progress in the **Admin Area > Monitoring > Background Jobs** page.
On the **Queues** tab, you can watch the `hashed_storage:hashed_storage_project_rollback` queue to see how long the process will take to finish.

After it reaches zero, you can confirm every project has been rolled back by running the commands below.
If some projects weren't rolled back, you can run this rollback script again to schedule further rollbacks.

Any error or warning will be logged in Sidekiq's log file.

## List projects on Legacy storage

To have a simple summary of projects using **Legacy** storage:

@@ -2,6 +2,24 @@

> [Introduced][ce-28283] in GitLab 10.0.

Repositories can be stored on disk using two different storage layouts,
each with its own characteristics.

GitLab can be configured to use one or multiple repository shard locations
that can be:

- Mounted to the local disk
- Exposed as an NFS shared volume
- Accessed via [Gitaly][gitaly] on its own machine

In GitLab, this is configured in `/etc/gitlab/gitlab.rb` by the `git_data_dirs({})`
configuration hash. The storage layouts discussed here will apply to any shard
defined in it.

The `default` repository shard, available in any installation
that hasn't customized it, points to the local folder: `/var/opt/gitlab/git-data`.
Anything discussed below is expected to be part of that folder.

## Legacy Storage

Legacy Storage is the storage behavior prior to version 10.0. For historical

@@ -66,34 +84,7 @@ by another folder with the next 2 characters. They are both stored in a special
"@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"
```
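
To locate a repository on disk, the pattern above can be reproduced from the shell. This is a minimal sketch that assumes the hash is the SHA-256 hex digest of the project ID written as a decimal string, and that `sha256sum` is available; `hashed_disk_path` is a hypothetical helper, not a GitLab command.

```bash
# Sketch only: hashed_disk_path is a hypothetical helper.
# Assumption: the hash is the SHA-256 hex digest of the project ID.
hashed_disk_path() {
  local project_id="$1" hash prefix1 prefix2
  hash="$(printf '%s' "$project_id" | sha256sum | cut -d' ' -f1)"
  # The first two folder levels are characters 1-2 and 3-4 of the hash.
  prefix1="$(printf '%s' "$hash" | cut -c1-2)"
  prefix2="$(printf '%s' "$hash" | cut -c3-4)"
  echo "@hashed/${prefix1}/${prefix2}/${hash}"
}

# Repository and wiki paths for the project with ID 2.
base="$(hashed_disk_path 2)"
echo "${base}.git"
echo "${base}.wiki.git"
```

The result is relative to the repository shard root, so it can be appended to whichever shard path the installation uses.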

### How to migrate to Hashed Storage

In GitLab, go to **Admin > Settings**, find the **Repository Storage** section
and select "_Use hashed storage paths for newly created and renamed projects_".

To migrate your existing projects to the new storage type, check the specific
[rake tasks].

[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
[rake tasks]: raketasks/storage.md#migrate-existing-projects-to-hashed-storage
[storage-paths]: repository_storage_types.md

#### Rollback

There is no automated rollback implemented. Below are the steps required to rollback
from each storage migration.

The rollback has to be performed in the reverse order. To get into "Legacy" state,
you need to rollback Attachments first, then Project.

Also note that if Geo is enabled, after the migration was triggered, an event is generated
to replicate the operation on any Secondary node. That means the on disk changes will also
need to be performed on these nodes as well. Database changes will propagate without issues.

You must make sure the migration event was already processed or otherwise it may migrate
the files back to Hashed state again.

#### Hashed object pools
### Hashed object pools

For deduplication of public forks and their parent repository, objects are pooled
in an object pool. These object pools are a third repository where shared objects

@@ -110,36 +101,60 @@ enabled for individual projects by executing
be on hashed storage, should not be a fork itself, and hashed storage should be
enabled for all new projects.

##### Attachments
### How to migrate to Hashed Storage

To rollback single Attachment migration, rename `aa/bb/abcdef1234567890...` folder back to `namespace/project`.
To start a migration, enable Hashed Storage for new projects:

Both folder names can be generated by the `FileUploader.absolute_base_dir(project)`, you
just need to switch the version from the `project` back to the previous one.
1. Go to **Admin > Settings** and expand the **Repository Storage** section.
2. Select the **Use hashed storage paths for newly created and renamed projects** checkbox.

```ruby
project.storage_version
# => 2
Check whether the change breaks any existing integration you may have that
either runs on the same machine where your repositories are located, or may log in to that machine
to access data (for example, a remote backup solution).

FileUploader.absolute_base_dir(project)
# => "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35"
To schedule a complete rollout, see the
[rake task documentation for storage migration][rake/migrate-to-hashed] for instructions.

project.storage_version = 1
If you do have any existing integration, you may want to do a small rollout first,
to validate. You can do so by specifying a range with the operation.

FileUploader.absolute_base_dir(project)
# => "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab/gitlab-shell-renamed"
This is an example of how to limit the rollout to Project IDs 50 to 100, running in
an Omnibus GitLab installation:

```bash
sudo gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=50 ID_TO=100
```

##### Project
Check the [documentation][rake/migrate-to-hashed] for additional information and instructions for
source-based installations.

To rollback single Project migration, move `@hashed/aa/bb/aabbcdef1234567890abcdef.git` and `@hashed/aa/bb/aabbcdef1234567890abcdef.wiki.git`
back to `namespace/project.git` and `namespace/project.wiki.git` respectively and switch the version from the `project` back to `null`.
#### Rollback

Similar to the migration, to disable Hashed Storage for new
projects:

1. Go to **Admin > Settings** and expand the **Repository Storage** section.
2. Uncheck the **Use hashed storage paths for newly created and renamed projects** checkbox.

To schedule a complete rollback, see the
[rake task documentation for storage rollback][rake/rollback-to-legacy] for instructions.

The rollback task also supports specifying a range of Project IDs. Here is an example
of limiting the rollback to Project IDs 50 to 100, in an Omnibus GitLab installation:

```bash
sudo gitlab-rake gitlab:storage:rollback_to_legacy ID_FROM=50 ID_TO=100
```
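
When validating against existing integrations, it can help to split a larger ID range into several small batches and run them one at a time. The sketch below only prints one command per batch; `batch_commands` and its arguments are hypothetical names chosen for illustration, and the batch size of 25 is arbitrary.

```bash
# Sketch only: batch_commands is a hypothetical helper.
# Arguments: rake task name, first project ID, last project ID, batch size.
batch_commands() {
  local task="$1" from="$2" end="$3" size="$4" to

  # A batch size below 1 would never advance the loop.
  if [ "$size" -lt 1 ]; then
    echo "batch size must be at least 1" >&2
    return 1
  fi

  while [ "$from" -le "$end" ]; do
    to=$(( from + size - 1 ))
    # Clamp the last batch to the end of the requested range.
    if [ "$to" -gt "$end" ]; then
      to="$end"
    fi
    echo "sudo gitlab-rake ${task} ID_FROM=${from} ID_TO=${to}"
    from=$(( to + 1 ))
  done
}

# Print rollback commands for project IDs 50 to 100 in batches of 25.
batch_commands gitlab:storage:rollback_to_legacy 50 100 25
```

The same helper works for the migration by passing `gitlab:storage:migrate_to_hashed` as the task name.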

If you have a Geo setup, please note that the rollback will not be reflected automatically
on the **secondary** node. You may need to wait for a backfill operation to kick in and remove
the remaining repositories from the special `@hashed/` folder manually.
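
Before removing anything by hand, it may be useful to list what is still present under `@hashed/`. The sketch below is read-only and assumes the two-level layout shown earlier (`@hashed/<aa>/<bb>/<hash>.git`); `list_hashed_repos` is a hypothetical helper, and the repository root passed to it depends on how the shard is configured.

```bash
# Sketch only: list_hashed_repos is a hypothetical, read-only helper.
# Assumption: repositories live in "<root>/@hashed/<aa>/<bb>/<hash>(.wiki).git".
list_hashed_repos() {
  local repo_root="$1"

  # Nothing to report when the shard has no @hashed folder.
  [ -d "$repo_root/@hashed" ] || return 0

  find "$repo_root/@hashed" -mindepth 3 -maxdepth 3 -type d -name '*.git' | sort
}

# Example call; the path is an assumption based on the default shard location:
# list_hashed_repos /var/opt/gitlab/git-data/repositories
```

Reviewing the list against the projects that were rolled back is safer than deleting the whole folder, since projects that are still on Hashed Storage live there too.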

### Hashed Storage coverage

We are incrementally moving every storable object in GitLab to the Hashed
Storage pattern. You can check the current coverage status below (and also see
the [issue](https://gitlab.com/gitlab-com/infrastructure/issues/2821)).
the [issue][ce-2821]).

Note that things stored in an S3 compatible endpoint will not have the downsides
mentioned earlier, if they are not prefixed with `#{namespace}/#{project_name}`,

@@ -156,6 +171,7 @@ which is true for CI Cache and LFS Objects.
| CI Artifacts | No | No | Yes | 9.4 / 10.6 |
| CI Cache | No | No | Yes | - |
| LFS Objects | Yes | Similar | Yes | 10.0 / 10.7 |
| Repository pools | No | Yes | - | 11.6 |

#### Implementation Details

@@ -180,3 +196,10 @@ LFS Objects implements a similar storage pattern using 2 chars, 2 level folders,
```

They are also S3 compatible since **10.0** (GitLab Premium), and available in GitLab Core since **10.7**.

[ce-2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821
[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
[rake/migrate-to-hashed]: raketasks/storage.md#migrate-existing-projects-to-hashed-storage
[rake/rollback-to-legacy]: raketasks/storage.md#rollback
[storage-paths]: repository_storage_types.md
[gitaly]: gitaly/index.md