gitlab-org--gitlab-foss/spec
Yorick Peterse 49c081b9f3 Improve performance of User.find_by_any_email
This query used to rely on a JOIN, effectively producing the following
SQL:

    SELECT users.*
    FROM users
    LEFT OUTER JOIN emails ON emails.user_id = users.id
    WHERE (users.email = X OR emails.email = X)
    LIMIT 1;

The use of a JOIN means the database has to scan all emails and users,
join them together, and then filter out the rows that don't match the
criteria (though the filtering may already happen while joining).

In the new setup this query instead uses a sub-query, producing the
following SQL:

    SELECT *
    FROM users
    WHERE id IN (SELECT user_id FROM emails WHERE email = X)
    OR email = X
    LIMIT 1;

This query has the benefit that it:

1. Doesn't have to JOIN any rows
2. Only has to operate on a relatively small set of rows from the
   "emails" table.

Since most users will only have a handful of emails associated with
their account (certainly not hundreds or even thousands), the set
returned by the sub-query is small enough that it should not become
problematic.
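
The difference between the two strategies can be illustrated without a
database. The following is only a minimal in-memory sketch: UserRow,
EmailRow, USERS, EMAILS and both method names are hypothetical stand-ins
and not part of GitLab, and plain Ruby arrays say nothing about how a
real query planner behaves. It only shows that the join-then-filter and
the sub-query approach return the same user.

```ruby
# Hypothetical stand-ins for rows of the "users" and "emails" tables.
UserRow  = Struct.new(:id, :email)
EmailRow = Struct.new(:user_id, :email)

USERS = [
  UserRow.new(1, 'alice@example.com'),
  UserRow.new(2, 'bob@example.com')
].freeze

EMAILS = [
  EmailRow.new(1, 'alice@work.example.com'),
  EmailRow.new(2, 'bob@work.example.com')
].freeze

# Old approach: LEFT OUTER JOIN every user with their emails, then filter
# the joined rows.
def find_by_any_email_join(address)
  joined = USERS.flat_map do |user|
    rows = EMAILS.select { |e| e.user_id == user.id }
    rows = [nil] if rows.empty? # an outer join keeps users without emails
    rows.map { |e| [user, e] }
  end

  match = joined.find do |user, e|
    user.email == address || (e && e.email == address)
  end

  match&.first
end

# New approach: first collect the matching user IDs from "emails" (the
# sub-query), then return the first user whose ID is in that set or whose
# primary email matches.
def find_by_any_email_subquery(address)
  ids = EMAILS.select { |e| e.email == address }.map(&:user_id)

  USERS.find { |user| ids.include?(user.id) || user.email == address }
end
```

Both methods return the same row for a primary address, for a secondary
address, and nil for an unknown address; only the amount of intermediate
data they build differs.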

Performance of the old versus new version can be measured using the
following benchmark:

    # Save this in ./bench.rb
    require 'benchmark/ips'

    email = 'yorick@gitlab.com'

    def User.find_by_any_email_old(email)
      user_table = arel_table
      email_table = Email.arel_table

      query = user_table.
        project(user_table[Arel.star]).
        join(email_table, Arel::Nodes::OuterJoin).
        on(user_table[:id].eq(email_table[:user_id])).
        where(user_table[:email].eq(email).or(email_table[:email].eq(email)))

      find_by_sql(query.to_sql).first
    end

    Benchmark.ips do |bench|
      bench.report 'original' do
        User.find_by_any_email_old(email)
      end

      bench.report 'optimized' do
        User.find_by_any_email(email)
      end

      bench.compare!
    end

Running this locally using "bundle exec rails r bench.rb" produces the
following output:

    Calculating -------------------------------------
                original     1.000  i/100ms
               optimized    93.000  i/100ms
    -------------------------------------------------
                original     11.103  (± 0.0%) i/s -     56.000
               optimized    948.713  (± 5.3%) i/s -      4.743k

    Comparison:
               optimized:      948.7 i/s
                original:       11.1 i/s - 85.45x slower

In other words, the new setup is roughly 85x faster than the old setup,
at least when running this benchmark locally.

For GitLab.com these improvements result in User.find_by_any_email
taking only ~170 ms to run, instead of around 800 ms. While this is
"only" an improvement of about 4.5 times (instead of 85x) it's still
significantly better than before.
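
The ratios quoted above can be recomputed from the reported figures. This
is a small sketch: the constant names and the speedup helper are
hypothetical, and the GitLab.com timings are the approximate values
given in the text ("~170 ms" and "around 800 ms"), so the second ratio
is only as precise as those inputs. With these rounded inputs it comes
out closer to 4.7 than 4.5.

```ruby
# Iterations per second, copied from the benchmark output.
ORIGINAL_IPS  = 11.103
OPTIMIZED_IPS = 948.713

# Approximate GitLab.com timings in milliseconds, as quoted in the text.
OLD_MS = 800.0
NEW_MS = 170.0

# Returns how many times larger `larger` is than `smaller`, rounded to
# two decimals.
def speedup(larger, smaller)
  (larger / smaller).round(2)
end

puts "local benchmark: #{speedup(OPTIMIZED_IPS, ORIGINAL_IPS)}x"
puts "GitLab.com:      #{speedup(OLD_MS, NEW_MS)}x"
```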

Fixes #3242
2015-10-30 12:00:58 +01:00
benchmarks Improve performance of User.find_by_any_email 2015-10-30 12:00:58 +01:00
controllers Merge branch 'project-path-case-sensitivity' into 'master' 2015-10-22 13:03:04 +00:00
factories Implement Commit Status API 2015-10-12 11:53:49 +02:00
features Remove deprecated CI events from project settings page 2015-10-28 12:33:54 +01:00
finders Merge remote-tracking branch 'public/trending-projects-performance' 2015-10-08 16:22:43 +02:00
fixtures No HTML-only email please 2015-08-21 16:09:55 -07:00
helpers Merge branch 'cross-reference-mr-on-issues' into 'master' 2015-10-18 12:07:28 +00:00
javascripts Apply new design to files page 2015-10-13 16:41:48 +02:00
lib Merge branch 'dirceu/gitlab-ce-fix-project-search-with-unmatched-parentheses' 2015-10-25 11:55:14 +01:00
mailers Merge branch 'stanhu/gitlab-ce-fix-message-id-notify' 2015-10-01 16:28:10 +02:00
models Fix: Inability to reply to code comments in the MR view, if the MR comes from a fork 2015-10-22 18:38:00 +02:00
requests Merge pull request #9762 from huacnlee/fix/api-helpers-bad-autoload-name-for-master 2015-10-22 21:12:35 -07:00
routing Move partial to right place and fix tests. 2015-09-08 15:14:14 +01:00
services Fix CI badge 2015-10-26 12:23:40 +01:00
support Merge branch 'master' into rs-redactor-filter 2015-10-16 11:26:48 +02:00
tasks/gitlab Fix rubocop warnings in spec/lib and spec/tasks 2015-10-03 16:02:21 -05:00
views/help Allow non-admin users to see version information 2015-09-23 17:18:15 -04:00
workers Remove RepositoryArchiveWorker specs 2015-10-14 12:19:02 +02:00
factories.rb Only publish ssh key-type and key 2015-08-04 14:33:18 +02:00
factories_spec.rb Remove the invalid key factories 2015-04-11 17:12:10 -04:00
rails_helper.rb Started on the actual rspec 3 upgrade 2015-06-22 12:12:49 +02:00
spec_helper.rb Merge branch 'refactor-build-service' into 'master' 2015-10-05 17:42:50 +00:00
teaspoon_env.rb teaspoon install 2015-05-28 18:22:32 -04:00