gitlab-org--gitlab-foss/doc/administration/high_availability/nfs.md

231 lines
10 KiB
Markdown
Raw Normal View History

2016-04-22 14:18:06 -04:00
# NFS
2017-06-18 23:58:23 -04:00
You can view information and options set for each of the mounted NFS file
systems by running `nfsstat -m` and `cat /etc/fstab`.
2017-06-18 23:58:23 -04:00
NOTE: **Note:** Filesystem performance has a big impact on overall GitLab
performance, especially for actions that read or write to Git repositories. See
[Filesystem Performance Benchmarking](../operations/filesystem_benchmarking.md)
for steps to test filesystem performance.
2017-06-18 23:58:23 -04:00
## NFS Server features
### Required features
2016-04-22 14:18:06 -04:00
**File locking**: GitLab **requires** advisory file locking, which is only
supported natively in NFS version 4. NFSv3 also supports locking as long as
2016-04-22 14:18:06 -04:00
Linux Kernel 2.6.5+ is used. We recommend using version 4 and do not
specifically test NFSv3.
2017-06-18 23:58:23 -04:00
### Recommended options
When you define your NFS exports, we recommend you also add the following
options:
- `no_root_squash` - NFS normally changes the `root` user to `nobody`. This is
a good security measure when NFS shares will be accessed by many different
users. However, in this case only GitLab will use the NFS share so it
is safe. GitLab recommends the `no_root_squash` setting because we need to
manage file permissions automatically. Without the setting you may receive
errors when the Omnibus package tries to alter permissions. Note that GitLab
and other bundled components do **not** run as `root` but as non-privileged
users. The recommendation for `no_root_squash` is to allow the Omnibus package
to set ownership and permissions on files, as needed. In some cases where the
`no_root_squash` option is not available, the `root` flag can achieve the same
result.
2017-06-18 23:58:23 -04:00
- `sync` - Force synchronous behavior. Default is asynchronous and under certain
circumstances it could lead to data loss if a failure occurs before data has
synced.
### Improving NFS performance with GitLab
2019-05-29 10:41:31 -04:00
NOTE: **Note:** This is only available starting in certain versions of GitLab: 11.5.11,
11.6.11, 11.7.12, 11.8.8, 11.9.0 and up (e.g. 11.10, 11.11, etc.)
If you are using NFS to share Git data, we recommend that you enable a
number of feature flags that will allow GitLab application processes to
access Git data directly instead of going through the [Gitaly
service](../gitaly/index.md). Depending on your workload and disk
performance, these flags may help improve performance. See [the
issue](https://gitlab.com/gitlab-org/gitlab-ce/issues/57317) for more
details.
To do this, run the Rake task:
```sh
sudo gitlab-rake gitlab:features:enable_rugged
```
If you need to undo this setting for some reason such as switching to [Gitaly without NFS](gitaly.md)
(recommended), run:
```sh
sudo gitlab-rake gitlab:features:disable_rugged
```
### Known issues
On some customer systems, we have seen NFS clients slow precipitously due to
[excessive network traffic from numerous `TEST_STATEID` NFS
messages](https://gitlab.com/gitlab-org/gitlab-ce/issues/52017). This is
likely due to a [Linux kernel
bug](https://bugzilla.redhat.com/show_bug.cgi?id=1552203) that may be fixed in
[more recent kernels with this
commit](https://github.com/torvalds/linux/commit/95da1b3a5aded124dd1bda1e3cdb876184813140).
GitLab recommends all NFS users disable the NFS server
delegation feature. To disable NFS server delegations
on an Linux NFS server, do the following:
1. On the NFS server, run:
```sh
echo 0 > /proc/sys/fs/leases-enable
sysctl -w fs.leases-enable=0
```
1. Restart the NFS server process. For example, on CentOS run `service nfs restart`.
## Avoid using AWS's Elastic File System (EFS)
2017-05-17 15:43:48 -04:00
2017-12-29 12:39:27 -05:00
GitLab strongly recommends against using AWS Elastic File System (EFS).
Our support team will not be able to assist on performance issues related to
file system access.
Customers and users have reported that AWS EFS does not perform well for GitLab's
2018-07-30 15:24:34 -04:00
use-case. Workloads where many small files are written in a serialized manner, like `git`,
are not well-suited for EFS. EBS with an NFS server on top will perform much better.
2018-07-30 15:24:34 -04:00
If you do choose to use EFS, avoid storing GitLab log files (e.g. those in `/var/log/gitlab`)
there because this will also affect performance. We recommend that the log files be
stored on a local volume.
For more details on another person's experience with EFS, see
[Amazon's Elastic File System: Burst Credits](https://rawkode.com/2017/04/16/amazons-elastic-file-system-burst-credits/)
## Avoid using CephFS and GlusterFS
GitLab strongly recommends against using CephFS and GlusterFS.
These distributed file systems are not well-suited for GitLab's input/output access patterns because git uses many small files and access times and file locking times to propagate will make git activity very slow.
## Avoid using PostgreSQL with NFS
GitLab strongly recommends against running your PostgreSQL database
across NFS. The GitLab support team will not be able to assist on performance issues related to
this configuration.
Additionally, this configuration is specifically warned against in the
[Postgres Documentation](https://www.postgresql.org/docs/current/static/creating-cluster.html#CREATING-CLUSTER-NFS):
>PostgreSQL does nothing special for NFS file systems, meaning it assumes NFS behaves exactly like
>locally-connected drives. If the client or server NFS implementation does not provide standard file
>system semantics, this can cause reliability problems. Specifically, delayed (asynchronous) writes
>to the NFS server can cause data corruption problems.
For supported database architecture, please see our documentation on
[Configuring a Database for GitLab HA](database.md).
2017-03-22 19:13:01 -04:00
## NFS Client mount options
2016-04-22 14:18:06 -04:00
2017-06-18 23:58:23 -04:00
Below is an example of an NFS mount point defined in `/etc/fstab` we use on
GitLab.com:
2016-04-22 14:18:06 -04:00
```
10.1.1.1:/var/opt/gitlab/git-data /var/opt/gitlab/git-data nfs4 defaults,soft,rsize=1048576,wsize=1048576,noatime,nofail,lookupcache=positive 0 2
2016-04-22 14:18:06 -04:00
```
Note there are several options that you should consider using:
2016-04-22 14:18:06 -04:00
| Setting | Description |
| ------- | ----------- |
| `vers=4.1` |NFS v4.1 should be used instead of v4.0 because there is a Linux [NFS client bug in v4.0](https://gitlab.com/gitlab-org/gitaly/issues/1339) that can cause significant problems due to stale data.
| `nofail` | Don't halt boot process waiting for this mount to become available
2016-04-22 14:18:06 -04:00
| `lookupcache=positive` | Tells the NFS client to honor `positive` cache results but invalidates any `negative` cache results. Negative cache results cause problems with Git. Specifically, a `git push` can fail to register uniformly across all NFS clients. The negative cache causes the clients to 'remember' that the files did not exist previously.
## A single NFS mount
2016-04-22 14:18:06 -04:00
It's recommended to nest all gitlab data dirs within a mount, that allows automatic
restore of backups without manually moving existing data.
2016-04-22 14:18:06 -04:00
```
mountpoint
└── gitlab-data
├── builds
├── git-data
├── shared
└── uploads
```
2016-04-22 14:18:06 -04:00
To do so, we'll need to configure Omnibus with the paths to each directory nested
in the mount point as follows:
2016-04-22 14:18:06 -04:00
Mount `/gitlab-nfs` then use the following Omnibus
2016-04-22 14:18:06 -04:00
configuration to move each data location to a subdirectory:
```ruby
2018-09-17 16:25:40 -04:00
git_data_dirs({"default" => { "path" => "/gitlab-nfs/gitlab-data/git-data"} })
gitlab_rails['uploads_directory'] = '/gitlab-nfs/gitlab-data/uploads'
gitlab_rails['shared_path'] = '/gitlab-nfs/gitlab-data/shared'
gitlab_ci['builds_directory'] = '/gitlab-nfs/gitlab-data/builds'
2016-04-22 14:18:06 -04:00
```
Run `sudo gitlab-ctl reconfigure` to start using the central location. Please
be aware that if you had existing data you will need to manually copy/rsync it
to these new locations and then restart GitLab.
## Bind mounts
Alternatively to changing the configuration in Omnibus, bind mounts can be used
to store the data on an NFS mount.
2016-04-22 14:18:06 -04:00
Bind mounts provide a way to specify just one NFS mount and then
bind the default GitLab data locations to the NFS mount. Start by defining your
single NFS mount point as you normally would in `/etc/fstab`. Let's assume your
NFS mount point is `/gitlab-nfs`. Then, add the following bind mounts in
2016-04-22 14:18:06 -04:00
`/etc/fstab`:
```bash
/gitlab-nfs/gitlab-data/git-data /var/opt/gitlab/git-data none bind 0 0
/gitlab-nfs/gitlab-data/.ssh /var/opt/gitlab/.ssh none bind 0 0
/gitlab-nfs/gitlab-data/uploads /var/opt/gitlab/gitlab-rails/uploads none bind 0 0
/gitlab-nfs/gitlab-data/shared /var/opt/gitlab/gitlab-rails/shared none bind 0 0
/gitlab-nfs/gitlab-data/builds /var/opt/gitlab/gitlab-ci/builds none bind 0 0
2016-04-22 14:18:06 -04:00
```
Using bind mounts will require manually making sure the data directories
are empty before attempting a restore. Read more about the
[restore prerequisites](../../raketasks/backup_restore.md).
## Multiple NFS mounts
When using default Omnibus configuration you will need to share 4 data locations
between all GitLab cluster nodes. No other locations should be shared. The
following are the 4 locations need to be shared:
| Location | Description | Default configuration |
| -------- | ----------- | --------------------- |
2018-09-17 16:25:40 -04:00
| `/var/opt/gitlab/git-data` | Git repository data. This will account for a large portion of your data | `git_data_dirs({"default" => { "path" => "/var/opt/gitlab/git-data"} })`
| `/var/opt/gitlab/gitlab-rails/uploads` | User uploaded attachments | `gitlab_rails['uploads_directory'] = '/var/opt/gitlab/gitlab-rails/uploads'`
| `/var/opt/gitlab/gitlab-rails/shared` | Build artifacts, GitLab Pages, LFS objects, temp files, etc. If you're using LFS this may also account for a large portion of your data | `gitlab_rails['shared_path'] = '/var/opt/gitlab/gitlab-rails/shared'`
| `/var/opt/gitlab/gitlab-ci/builds` | GitLab CI build traces | `gitlab_ci['builds_directory'] = '/var/opt/gitlab/gitlab-ci/builds'`
Other GitLab directories should not be shared between nodes. They contain
node-specific files and GitLab code that does not need to be shared. To ship
logs to a central location consider using remote syslog. GitLab Omnibus packages
provide configuration for [UDP log shipping][udp-log-shipping].
Having multiple NFS mounts will require manually making sure the data directories
are empty before attempting a restore. Read more about the
[restore prerequisites](../../raketasks/backup_restore.md).
2016-04-22 14:18:06 -04:00
---
Read more on high-availability configuration:
1. [Configure the database](database.md)
1. [Configure Redis](redis.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
[udp-log-shipping]: http://docs.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only "UDP log shipping"