gitlab-org--gitlab-foss/doc/development/database/client_side_connection_pool.md

---
stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Client-side connection-pool

Ruby processes accessing the database through
ActiveRecord, automatically calculate the connection-pool size for the
process based on the concurrency.

Because of the way [Ruby on Rails manages database connections](#connection-lifecycle),
it is important that we have at
least as many connections as we have threads. While there is a 'pool'
setting in [`database.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/database.yml.postgresql), it is not very practical because you need to
maintain it in tandem with the number of application threads. For this
reason, we override the number of allowed connections in the database
connection-pool based on the configured number of application threads.

`Gitlab::Runtime.max_threads` is the number of user-facing
application threads the process has been configured with. We also have
auxiliary threads that use database connections. As it isn't
straightforward to keep an accurate count of the number of auxiliary threads as
the application evolves over time, we just add a fixed headroom to the
number of user-facing threads. It is OK if this number is too large
because connections are instantiated lazily.

## Troubleshooting connection-pool issues

The connection-pool usage can be seen per environment in the
[connection-pool saturation dashboard](https://dashboards.gitlab.net/d/alerts-sat_rails_db_connection_pool/alerts-rails_db_connection_pool-saturation-detail?orgId=1).

If the connection-pool is too small, this would manifest in
`ActiveRecord::ConnectionTimeoutError`s from the application. Because we alert
when almost all connections are used, we should know this before
timeouts occur. If this happens we can remediate by setting the
`DB_POOL_HEADROOM` environment variable to something bigger than the
hardcoded value (10).

At this point, we need to investigate what is using more connections
than we anticipated. To do that, we can use the
`gitlab_ruby_threads_running_threads` metric. For example,
[this graph](https://thanos.gitlab.net/graph?g0.range_input=1h&g0.max_source_resolution=0s&g0.expr=sum%20by%20(thread_name)%20(%20gitlab_ruby_threads_running_threads%7Buses_db_connection%3D%22yes%22%7D%20)&g0.tab=0)
shows all running threads that connect to the database by their
name. Threads labeled `puma worker` or `sidekiq_worker_thread` are
the threads that define `Gitlab::Runtime.max_threads` so those are
accounted for. If there's more than 10 other threads running, we could
consider raising the default headroom.

## Connection lifecycle

For web requests, a connection is obtained from the pool at the first
time a database query is made. The connection is returned to the pool
after the request completes.

For background jobs, the behavior is very similar. The thread obtains
a connection for the first query, and returns it after the job is
finished.

This is managed by Rails internally.
Add latest changes from gitlab-org/gitlab@master 2020-10-07 11:08:40 -04:00			`---`
Add latest changes from gitlab-org/gitlab@master 2022-05-31 11:09:02 -04:00			`stage: Data Stores`
Add latest changes from gitlab-org/gitlab@master 2022-03-30 14:08:19 -04:00			`group: Database`
			`info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments`
Add latest changes from gitlab-org/gitlab@master 2020-10-07 11:08:40 -04:00			`---`

			`# Client-side connection-pool`

			`Ruby processes accessing the database through`
			`ActiveRecord, automatically calculate the connection-pool size for the`
			`process based on the concurrency.`

Add latest changes from gitlab-org/gitlab@master 2022-08-23 20:12:25 -04:00			`Because of the way [Ruby on Rails manages database connections](#connection-lifecycle),`
Add latest changes from gitlab-org/gitlab@master 2022-08-08 11:10:32 -04:00			`it is important that we have at`
Add latest changes from gitlab-org/gitlab@master 2020-10-07 11:08:40 -04:00			`least as many connections as we have threads. While there is a 'pool'`
			setting in [`database.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/database.yml.postgresql), it is not very practical because you need to
			`maintain it in tandem with the number of application threads. For this`
			`reason, we override the number of allowed connections in the database`
			`connection-pool based on the configured number of application threads.`

			`Gitlab::Runtime.max_threads` is the number of user-facing
			`application threads the process has been configured with. We also have`
			`auxiliary threads that use database connections. As it isn't`
			`straightforward to keep an accurate count of the number of auxiliary threads as`
			`the application evolves over time, we just add a fixed headroom to the`
			`number of user-facing threads. It is OK if this number is too large`
			`because connections are instantiated lazily.`

			`## Troubleshooting connection-pool issues`

Add latest changes from gitlab-org/gitlab@master 2022-08-23 20:12:25 -04:00			`The connection-pool usage can be seen per environment in the`
Add latest changes from gitlab-org/gitlab@master 2022-08-08 11:10:32 -04:00			`[connection-pool saturation dashboard](https://dashboards.gitlab.net/d/alerts-sat_rails_db_connection_pool/alerts-rails_db_connection_pool-saturation-detail?orgId=1).`
Add latest changes from gitlab-org/gitlab@master 2020-10-07 11:08:40 -04:00
			`If the connection-pool is too small, this would manifest in`
			`ActiveRecord::ConnectionTimeoutError`s from the application. Because we alert
			`when almost all connections are used, we should know this before`
			`timeouts occur. If this happens we can remediate by setting the`
			`DB_POOL_HEADROOM` environment variable to something bigger than the
			`hardcoded value (10).`

			`At this point, we need to investigate what is using more connections`
			`than we anticipated. To do that, we can use the`
Add latest changes from gitlab-org/gitlab@master 2022-08-08 11:10:32 -04:00			`gitlab_ruby_threads_running_threads` metric. For example,
			`[this graph](https://thanos.gitlab.net/graph?g0.range_input=1h&g0.max_source_resolution=0s&g0.expr=sum%20by%20(thread_name)%20(%20gitlab_ruby_threads_running_threads%7Buses_db_connection%3D%22yes%22%7D%20)&g0.tab=0)`
Add latest changes from gitlab-org/gitlab@master 2020-10-07 11:08:40 -04:00			`shows all running threads that connect to the database by their`
			name. Threads labeled `puma worker` or `sidekiq_worker_thread` are
			the threads that define `Gitlab::Runtime.max_threads` so those are
			`accounted for. If there's more than 10 other threads running, we could`
			`consider raising the default headroom.`

			`## Connection lifecycle`

			`For web requests, a connection is obtained from the pool at the first`
			`time a database query is made. The connection is returned to the pool`
			`after the request completes.`

			`For background jobs, the behavior is very similar. The thread obtains`
			`a connection for the first query, and returns it after the job is`
			`finished.`

			`This is managed by Rails internally.`