This query used to rely on a JOIN, effectively producing the following
SQL:
    SELECT users.*
    FROM users
    LEFT OUTER JOIN emails ON emails.user_id = users.id
    WHERE (users.email = X OR emails.email = X)
    LIMIT 1;
The use of a JOIN means having to scan over all emails and users, join
them together, and then filter out the rows that don't match the
criteria (though the database may already apply this filter as part of
the join).
In the new setup this query instead uses a sub-query, producing the
following SQL:
    SELECT *
    FROM users
    WHERE id IN (SELECT user_id FROM emails WHERE email = X)
    OR email = X
    LIMIT 1;
This query has the benefit that it:

1. Doesn't have to JOIN any rows
2. Only has to operate on a relatively small set of rows from the
   "emails" table.
Since most users will only have a handful of emails associated with
their account (certainly not hundreds or even thousands), the set
returned by the sub-query is small enough that it should not become
problematic.
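For reference, the new method can be implemented along the following
lines (a sketch; the actual code may differ in its details):

    def User.find_by_any_email(email)
      sql = 'SELECT *
             FROM users
             WHERE id IN (SELECT user_id FROM emails WHERE email = :email)
             OR email = :email
             LIMIT 1'

      User.find_by_sql([sql, { email: email }]).first
    end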
Performance of the old versus new version can be measured using the
following benchmark:
    # Save this in ./bench.rb
    require 'benchmark/ips'

    email = 'yorick@gitlab.com'

    def User.find_by_any_email_old(email)
      user_table  = arel_table
      email_table = Email.arel_table

      query = user_table.
        project(user_table[Arel.star]).
        join(email_table, Arel::Nodes::OuterJoin).
        on(user_table[:id].eq(email_table[:user_id])).
        where(user_table[:email].eq(email).or(email_table[:email].eq(email)))

      find_by_sql(query.to_sql).first
    end

    Benchmark.ips do |bench|
      bench.report 'original' do
        User.find_by_any_email_old(email)
      end

      bench.report 'optimized' do
        User.find_by_any_email(email)
      end

      bench.compare!
    end
Running this locally using "bundle exec rails r bench.rb" produces the
following output:
    Calculating -------------------------------------
                original     1.000  i/100ms
               optimized    93.000  i/100ms
    -------------------------------------------------
                original     11.103  (± 0.0%) i/s -     56.000
               optimized    948.713  (± 5.3%) i/s -      4.743k

    Comparison:
               optimized:      948.7 i/s
                original:       11.1 i/s - 85.45x slower
In other words, the new setup is 85x faster compared to the old setup,
at least when running this benchmark locally.
For GitLab.com these improvements result in User.find_by_any_email
taking only ~170 ms to run, instead of around 800 ms. While this is
"only" an improvement of about 4.5 times (instead of 85x) it's still
significantly better than before.
Fixes #3242
This cuts down the time it takes to sort issues of a milestone by about
10x. In the previous setup the code would run a SQL query for every
issue that had to be sorted. The new setup instead runs a single SQL
query to update all the given issues at once.
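The idea can be sketched as follows (a hypothetical illustration; the
"position" column and the method name are assumptions, not the actual
code):

    # Build a single UPDATE with a CASE expression instead of issuing one
    # UPDATE per issue.
    def Issue.update_positions(issue_ids)
      return if issue_ids.empty?

      cases = issue_ids.each_with_index.map do |id, index|
        "WHEN #{Integer(id)} THEN #{index + 1}"
      end.join(' ')

      where(id: issue_ids).update_all("position = CASE id #{cases} END")
    end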
The attached benchmark used to run at around 60 iterations per second;
using the new setup this hovers around 600 iterations per second.
Timing-wise, a request to update a milestone with 40-something issues
used to take about 760 ms; in the new setup this only takes about
130 ms.
Fixes #3066
Improve User.by_login performance
This greatly speeds up the performance of `User.by_login`. I adopted some changes from @haynes in this patch; the credit for coming up with those originally goes to him.
Fixes #2341
See merge request !1545
By comparing objects in Ruby we can greatly improve the performance of
this method. In the worst case (should no data be eager loaded) this
will run the same number of queries as before; in the best case (when
data _is_ eager loaded) it requires no queries at all.
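The general pattern looks something like the following (a sketch using
assumed method and association names, not the actual code):

    # Compare in Ruby when the association is already loaded, fall back to
    # a query otherwise.
    def subscribed?(user)
      if subscriptions.loaded?
        subscriptions.any? { |subscription| subscription.user_id == user.id }
      else
        subscriptions.where(user_id: user.id).exists?
      end
    end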
The added benchmark used to produce around 273 iterations per second.
With this commit this has been increased to almost 40,000 iterations
per second: a speedup of roughly 145 times.
Combined with eager loading of Note associations this results in about
30 fewer queries when viewing a single issue, which in turn cuts down
the loading time by 30-40%.
Performance is improved in two steps:

1. On PostgreSQL an expression index is used for checking lower(email)
   and lower(username).
2. The check to determine if we're searching for a username or an email
   is moved to Ruby. Thanks to @haynes for suggesting and writing the
   initial implementation of this.
Moving the check to Ruby makes this method an additional 1.5 times
faster compared to doing the check in the SQL query.
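Both steps could look roughly like this (a hedged sketch; the index
names and exact predicates are assumptions, not the actual migration or
method):

    # Step 1: expression indexes, only needed on PostgreSQL (the index
    # names here are made up for illustration).
    class AddLowerIndexesToUsers < ActiveRecord::Migration
      def up
        execute 'CREATE INDEX index_users_on_lower_email ON users (LOWER(email));'
        execute 'CREATE INDEX index_users_on_lower_username ON users (LOWER(username));'
      end

      def down
        execute 'DROP INDEX index_users_on_lower_email;'
        execute 'DROP INDEX index_users_on_lower_username;'
      end
    end

    # Step 2: determine in Ruby whether the login is an email, so the
    # query only has to match a single column.
    def User.by_login(login)
      if login.include?('@')
        where('LOWER(email) = :value', value: login.downcase).first
      else
        where('LOWER(username) = :value', value: login.downcase).first
      end
    end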
With performance being improved I've now also tweaked the number of
iterations required by the User.by_login benchmark. This method now
runs between 900 and 1000 iterations per second.
By using a JOIN we can remove the need for running two separate queries
to find a project by its namespace. Combined with an index (only needed
for PostgreSQL) this reduces the query time from ~245 ms (~520 ms for
the first call) down to roughly 10 ms (~15 ms for the first call).
This class method can be used in "describe" blocks to specify the
subject of a benchmark. This lets you write:

    benchmark_subject { Foo }

instead of:

    subject { -> { Foo } }
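One way such a helper could be implemented (a sketch, not necessarily
the actual definition):

    module BenchmarkSubject
      # Wrap the block in a Proc so matchers that operate on blocks (such
      # as iterate_per_second) receive a callable subject.
      def benchmark_subject(&block)
        subject { -> { instance_exec(&block) } }
      end
    end

    RSpec.configure do |config|
      config.extend BenchmarkSubject, benchmark: true
    end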
This benchmark suite uses benchmark-ips
(https://github.com/evanphx/benchmark-ips) behind the scenes. Specs can
be turned into benchmark specs by setting "benchmark" to "true" in the
top-level describe block like so:
    describe SomeClass, benchmark: true do
    end
Writing benchmarks can be done using custom RSpec matchers, for example:
    describe MaruTheCat, benchmark: true do
      describe '#jump_in_box' do
        it 'should run 1000 iterations per second' do
          maru = described_class.new

          expect { maru.jump_in_box }.to iterate_per_second(1000)
        end
      end
    end
By default the "iterate_per_second" expectation requires a standard
deviation under 30% (this is just an arbitrary default for now). You can
change this by chaining "with_maximum_stddev" on the expectation:
    expect { maru.jump_in_box }.to iterate_per_second(1000)
      .with_maximum_stddev(10)
This will change the expectation to require a maximum deviation of 10%.
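Behind the scenes a matcher along these lines could back these
expectations (a sketch built on benchmark-ips, not the actual matcher
from this suite):

    require 'benchmark/ips'

    RSpec::Matchers.define :iterate_per_second do |min_ips|
      supports_block_expectations

      # Allow chaining .with_maximum_stddev(percentage) as shown above.
      chain :with_maximum_stddev do |percentage|
        @max_stddev = percentage
      end

      match do |block|
        @max_stddev ||= 30

        report = Benchmark.ips do |bench|
          bench.report('subject', &block)
        end

        entry = report.entries.first

        entry.ips >= min_ips && entry.error_percentage <= @max_stddev
      end
    end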
Alternatively you can use the it block style to write specs:
    describe MaruTheCat, benchmark: true do
      describe '#jump_in_box' do
        subject { -> { described_class.new } }

        it { is_expected.to iterate_per_second(1000) }
      end
    end
Because "iterate_per_second" operates on a block, opposed to a static
value, the "subject" method must return a Proc. This looks a bit goofy
but I have been unable to find a nice way around this.