Prior to this change, health checks checked for writeability of the NFS
shards. Given we're moving away from that, this patch extends the checks
for Gitaly to check for read and writeability.
Potentially some dashboards will break, as over time these metrics will
no longer appear as Prometheus doesn't get the data anymore.
Observability in the circuit breaker will be reduced, but its not
expected to be turned on and the circuit breaker is being removed soon
too.
Closes https://gitlab.com/gitlab-org/gitaly/issues/1218
Moving the check out of the general requests, makes sure we don't have
any slowdown in the regular requests.
To keep the process performing this checks small, the check is still
performed inside a unicorn. But that is called from a process running
on the same server.
Because the checks are now done outside normal request, we can have a
simpler failure strategy:
The check is now performed in the background every
`circuitbreaker_check_interval`. Failures are logged in redis. The
failures are reset when the check succeeds. Per check we will try
`circuitbreaker_access_retries` times within
`circuitbreaker_storage_timeout` seconds.
When the number of failures exceeds
`circuitbreaker_failure_count_threshold`, we will block access to the
storage.
After `failure_reset_time` of no checks, we will clear the stored
failures. This could happen when the process that performs the checks
is not running.