41bfe82b7a
This optimises searching for users when using queries consisting out of one or two characters such as "ab". We optimise such cases by searching for `LOWER(name)` and `LOWER(username)` instead of using `ILIKE`. Using `LOWER` produces a _much_ better performing query. For example, when searching for all users matching the term "a" we'd produce the following plan: Limit (cost=637.69..637.74 rows=20 width=805) (actual time=41.983..41.995 rows=20 loops=1) Buffers: shared hit=8330 -> Sort (cost=637.69..638.61 rows=368 width=805) (actual time=41.982..41.990 rows=20 loops=1) Sort Key: (CASE WHEN ((name)::text = 'a'::text) THEN 0 WHEN ((username)::text = 'a'::text) THEN 1 WHEN ((email)::text = 'a'::text) THEN 2 ELSE 3 END), name Sort Method: top-N heapsort Memory: 35kB Buffers: shared hit=8330 -> Bitmap Heap Scan on users (cost=75.47..627.89 rows=368 width=805) (actual time=9.452..41.305 rows=277 loops=1) Recheck Cond: (((name)::text ~~* 'a'::text) OR ((username)::text ~~* 'a'::text) OR ((email)::text = 'a'::text)) Rows Removed by Index Recheck: 7601 Heap Blocks: exact=7636 Buffers: shared hit=8327 -> BitmapOr (cost=75.47..75.47 rows=368 width=0) (actual time=8.290..8.290 rows=0 loops=1) Buffers: shared hit=691 -> Bitmap Index Scan on index_users_on_name_trigram (cost=0.00..38.85 rows=180 width=0) (actual time=4.369..4.369 rows=4071 loops=1) Index Cond: ((name)::text ~~* 'a'::text) Buffers: shared hit=360 -> Bitmap Index Scan on index_users_on_username_trigram (cost=0.00..34.41 rows=188 width=0) (actual time=3.896..3.896 rows=4140 loops=1) Index Cond: ((username)::text ~~* 'a'::text) Buffers: shared hit=328 -> Bitmap Index Scan on users_email_key (cost=0.00..1.94 rows=1 width=0) (actual time=0.022..0.022 rows=0 loops=1) Index Cond: ((email)::text = 'a'::text) Buffers: shared hit=3 Planning time: 3.912 ms Execution time: 42.171 ms With the changes in this commit we now produce the following plan instead: Limit (cost=13257.48..13257.53 rows=20 width=805) (actual time=1.567..1.579 rows=20 loops=1) Buffers: shared hit=287 -> Sort (cost=13257.48..13280.93 rows=9379 width=805) (actual time=1.567..1.572 rows=20 loops=1) Sort Key: (CASE WHEN ((name)::text = 'a'::text) THEN 0 WHEN ((username)::text = 'a'::text) THEN 1 WHEN ((email)::text = 'a'::text) THEN 2 ELSE 3 END), name Sort Method: top-N heapsort Memory: 35kB Buffers: shared hit=287 -> Bitmap Heap Scan on users (cost=135.66..13007.91 rows=9379 width=805) (actual time=0.194..1.107 rows=277 loops=1) Recheck Cond: ((lower((name)::text) = 'a'::text) OR (lower((username)::text) = 'a'::text) OR ((email)::text = 'a'::text)) Heap Blocks: exact=277 Buffers: shared hit=287 -> BitmapOr (cost=135.66..135.66 rows=9379 width=0) (actual time=0.152..0.152 rows=0 loops=1) Buffers: shared hit=10 -> Bitmap Index Scan on yorick_test_users (cost=0.00..124.75 rows=9377 width=0) (actual time=0.101..0.101 rows=277 loops=1) Index Cond: (lower((name)::text) = 'a'::text) Buffers: shared hit=4 -> Bitmap Index Scan on index_on_users_lower_username (cost=0.00..1.94 rows=1 width=0) (actual time=0.035..0.035 rows=1 loops=1) Index Cond: (lower((username)::text) = 'a'::text) Buffers: shared hit=3 -> Bitmap Index Scan on users_email_key (cost=0.00..1.94 rows=1 width=0) (actual time=0.014..0.014 rows=0 loops=1) Index Cond: ((email)::text = 'a'::text) Buffers: shared hit=3 Planning time: 0.303 ms Execution time: 1.687 ms Here we can see the new query is 25 times faster compared to the old query.
29 lines
845 B
Ruby
29 lines
845 B
Ruby
# See http://doc.gitlab.com/ce/development/migration_style_guide.html
|
|
# for more information on how to write migrations for GitLab.
|
|
|
|
class UsersNameLowerIndex < ActiveRecord::Migration
|
|
include Gitlab::Database::MigrationHelpers
|
|
|
|
# Set this constant to true if this migration requires downtime.
|
|
DOWNTIME = false
|
|
INDEX_NAME = 'index_on_users_name_lower'
|
|
|
|
disable_ddl_transaction!
|
|
|
|
def up
|
|
return unless Gitlab::Database.postgresql?
|
|
|
|
# On GitLab.com this produces an index with a size of roughly 60 MB.
|
|
execute "CREATE INDEX CONCURRENTLY #{INDEX_NAME} ON users (LOWER(name))"
|
|
end
|
|
|
|
def down
|
|
return unless Gitlab::Database.postgresql?
|
|
|
|
if supports_drop_index_concurrently?
|
|
execute "DROP INDEX CONCURRENTLY IF EXISTS #{INDEX_NAME}"
|
|
else
|
|
execute "DROP INDEX IF EXISTS #{INDEX_NAME}"
|
|
end
|
|
end
|
|
end
|