Object Storage
GitLab supports using an object storage service to hold numerous types of data. It is recommended over NFS, and is generally the better choice in larger setups because object storage is typically much more performant, reliable, and scalable.
Options
Object storage options that GitLab has tested, or is aware of customers using, include:
- SaaS/Cloud solutions such as Amazon S3 and Google Cloud Storage.
- On-premises hardware and appliances from various storage vendors.
- MinIO. We have a guide to deploying this within our Helm Chart documentation.
Configuration guides
To configure GitLab to use object storage, refer to the following guides:
- Configure object storage for backups.
- Configure object storage for job artifacts including incremental logging.
- Configure object storage for LFS objects.
- Configure object storage for uploads.
- Configure object storage for merge request diffs.
- Configure object storage for Container Registry (optional feature).
- Configure object storage for Mattermost (optional feature).
- Configure object storage for packages (optional feature). (PREMIUM ONLY)
- Configure object storage for Dependency Proxy (optional feature). (PREMIUM ONLY)
- Configure object storage for Pseudonymizer (optional feature). (ULTIMATE ONLY)
- Configure object storage for autoscale Runner caching (optional - for improved performance).
- Configure object storage for Terraform state files.
Other alternatives to filesystem storage
If you're working to scale out your GitLab implementation, or add fault tolerance and redundancy, you may be looking at removing dependencies on block or network filesystems. See the following guides and note that Pages requires disk storage:
- Make sure the `git` user home directory is on local disk.
- Configure database lookup of SSH keys to eliminate the need for a shared `authorized_keys` file.
Warnings, limitations, and known issues
Use separate buckets
Using separate buckets for each data type is the recommended approach for GitLab.
A limitation of our configuration is that each use of object storage is configured separately. We have an issue for improving this, and easily using one bucket with separate folders is one improvement that this might bring.
There is at least one specific issue with using the same bucket: when GitLab is deployed with the Helm chart, restore from backup does not function properly unless separate buckets are used.
One risk of using a single bucket is that your organisation might decide to migrate GitLab to the Helm deployment in the future. GitLab would run, but the problem with backups might not be realised until the organisation had a critical requirement for the backups to work.
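The separate-bucket recommendation can be checked mechanically before a deployment. The following is a minimal sketch, not part of GitLab: the data type keys, the bucket names, and the `find_shared_buckets` helper are all hypothetical, and the mapping would need to be filled in from your own configuration.

```python
from collections import defaultdict

# Hypothetical mapping of data types to bucket names.
# These names are illustrative only and are not GitLab defaults.
BUCKETS = {
    "artifacts": "gitlab-artifacts",
    "lfs": "gitlab-lfs",
    "uploads": "gitlab-uploads",
    "backups": "gitlab-backups",
}


def find_shared_buckets(buckets):
    """Return {bucket_name: [data types]} for buckets used by more than one data type."""
    users = defaultdict(list)
    for data_type, bucket in buckets.items():
        users[bucket].append(data_type)
    return {bucket: sorted(types) for bucket, types in users.items() if len(types) > 1}


if __name__ == "__main__":
    shared = find_shared_buckets(BUCKETS)
    if shared:
        print("Buckets shared between data types:", shared)
    else:
        print("Every data type has its own bucket.")
```

An empty result means no bucket is reused, which matches the recommended layout.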
S3 API compatibility issues
Not all S3 providers are fully compatible with the Fog library that GitLab uses. Symptoms include:
411 Length Required
GitLab Pages requires NFS
If you're working to add more GitLab servers for scaling or fault tolerance, and one of your requirements is GitLab Pages, this currently requires NFS. There is work in progress to remove this dependency. In the future, GitLab Pages may use object storage.
The dependency on disk storage also prevents Pages from being deployed using the GitLab Helm chart.
Incremental logging is required for CI to use object storage
If you configure GitLab to use object storage for CI logs and artifacts, you must also enable incremental logging.
Proxy Download
A number of the use cases for object storage allow client traffic to be redirected to the object storage back end, like when Git clients request large files via LFS or when downloading CI artifacts and logs.
When the files are stored on local block storage or NFS, GitLab has to act as a proxy. This is not the default behavior with object storage.
The `proxy_download` setting controls this behavior: the default is generally `false`. Verify this in the documentation for each use case. Set it to `true` so that GitLab proxies the files.
When not proxying files, GitLab returns an HTTP 302 redirect with a pre-signed, time-limited object storage URL. This can result in some of the following problems:
- If GitLab is using non-secure HTTP to access the object storage, clients may generate `https->http` downgrade errors and refuse to process the redirect. The solution to this is for GitLab to use HTTPS. LFS, for example, will generate this error: `LFS: lfsapi/client: refusing insecure redirect, https->http`
- Clients will need to trust the certificate authority that issued the object storage certificate, or may return common TLS errors such as: `x509: certificate signed by unknown authority`
- Clients will need network access to the object storage. Errors that might result if this access is not in place include: `Received status code 403 from server: Forbidden`

Getting a `403 Forbidden` response is specifically called out on the package repository documentation as a side effect of how some build tools work.
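The first problem in the list above, the `https->http` downgrade, comes down to comparing the scheme of the original request with the scheme of the redirect target. The sketch below illustrates that comparison only; it is not how any particular client implements it, and the `is_insecure_redirect` helper and the example host names are hypothetical.

```python
from urllib.parse import urlsplit


def is_insecure_redirect(request_url, location):
    """Return True when a redirect would move the client from HTTPS to plain HTTP."""
    return (
        urlsplit(request_url).scheme.lower() == "https"
        and urlsplit(location).scheme.lower() == "http"
    )


if __name__ == "__main__":
    # Hypothetical URLs: GitLab served over HTTPS, object storage reached over HTTP.
    origin = "https://gitlab.example.com/group/project.git/info/lfs/objects/batch"
    target = "http://storage.example.com/lfs-objects/object?signature=placeholder"
    print(is_insecure_redirect(origin, target))
```

A client that applies a check like this refuses the redirect, which is why serving the object storage endpoint over HTTPS resolves the error.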
ETag mismatch
Using the default GitLab settings, some object storage back-ends such as MinIO and Alibaba might generate `ETag mismatch` errors.
When using GitLab direct upload, the workaround for MinIO is to use the `--compat` parameter on the server.
We are working on a fix to the GitLab component Workhorse and, in the meantime, on a workaround that allows ETag verification to be disabled.
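This page does not describe how the ETag check works. As a rough illustration only, the sketch below assumes the common S3 convention that the ETag of a single-part upload is the hex MD5 digest of the object body, optionally wrapped in double quotes. The `etag_matches` helper is hypothetical and is not Workhorse code.

```python
import hashlib


def etag_matches(body: bytes, etag_header: str) -> bool:
    """Compare the MD5 hex digest of body with an ETag header value.

    Assumption: the storage back end returns the MD5 hex digest as the ETag,
    optionally wrapped in double quotes.
    """
    expected = hashlib.md5(body).hexdigest()
    return etag_header.strip().strip('"').lower() == expected


if __name__ == "__main__":
    payload = b"hello"
    good_etag = '"' + hashlib.md5(payload).hexdigest() + '"'
    print(etag_matches(payload, good_etag))
    # A back end that derives its ETag some other way fails the comparison.
    print(etag_matches(payload, '"not-an-md5-digest"'))
```

Under that assumption, a back end that computes its ETag differently would fail the comparison even though the upload itself succeeded.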