This updates the composite index on ci_pipelines (project_id, ref,
status) to also include the "id" column at the end. Adding this column
to the index drastically improves the performance of queries used for
getting the latest pipeline for a particular branch. For example, on
project dashboards we'll run a query like the following:
SELECT ci_pipelines.*
FROM ci_pipelines
WHERE ci_pipelines.project_id = 13083
AND ci_pipelines.ref = 'master'
AND ci_pipelines.status = 'success'
ORDER BY ci_pipelines.id DESC
LIMIT 1;
Limit (cost=0.43..58.88 rows=1 width=224) (actual time=26.956..26.956 rows=1 loops=1)
Buffers: shared hit=6544 dirtied=16
-> Index Scan Backward using ci_pipelines_pkey on ci_pipelines (cost=0.43..830922.89 rows=14216 width=224) (actual time=26.954..26.954 rows=1 loops=1)
Filter: ((project_id = 13083) AND ((ref)::text = 'master'::text) AND ((status)::text = 'success'::text))
Rows Removed by Filter: 6476
Buffers: shared hit=6544 dirtied=16
Planning time: 1.484 ms
Execution time: 27.000 ms
Because of the lack of "id" in the index we end up scanning over the
primary key index, then applying a filter to filter out any remaining
rows. The more pipelines a GitLab instance has the slower this will get.
By adding "id" to the mentioned composite index we can change the above
plan into the following:
Limit (cost=0.56..2.01 rows=1 width=224) (actual time=0.034..0.034 rows=1 loops=1)
Buffers: shared hit=5
-> Index Scan Backward using yorick_test on ci_pipelines (cost=0.56..16326.37 rows=11243 width=224) (actual time=0.033..0.033 rows=1 loops=1)
Index Cond: ((project_id = 13083) AND ((ref)::text = 'master'::text) AND ((status)::text = 'success'::text))
Buffers: shared hit=5
Planning time: 0.695 ms
Execution time: 0.061 ms
This in turn leads to a best-case improvement of roughly 25
milliseconds, give or take a millisecond or two.