172 lines
6.6 KiB
Markdown
172 lines
6.6 KiB
Markdown
|
# Merge Request Performance Guidelines
|
||
|
|
||
|
To ensure a merge request does not negatively impact performance of GitLab
|
||
|
_every_ merge request **must** adhere to the guidelines outlined in this
|
||
|
document. There are no exceptions to this rule unless specifically discussed
|
||
|
with and agreed upon by merge request endbosses and performance specialists.
|
||
|
|
||
|
To measure the impact of a merge request you can use
|
||
|
[Sherlock](profiling.md#sherlock). It's also highly recommended that you read
|
||
|
the following guides:
|
||
|
|
||
|
* [Performance Guidelines](performance.md)
|
||
|
* [What requires downtime?](what_requires_downtime.md)
|
||
|
|
||
|
## Impact Analysis
|
||
|
|
||
|
**Summary:** think about the impact your merge request may have on performance
|
||
|
and those maintaining a GitLab setup.
|
||
|
|
||
|
Any change submitted can have an impact not only on the application itself but
|
||
|
also those maintaining it and those keeping it up and running (e.g. production
|
||
|
engineers). As a result you should think carefully about the impact of your
|
||
|
merge request on not only the application but also on the people keeping it up
|
||
|
and running.
|
||
|
|
||
|
Can the queries used potentially take down any critical services and result in
|
||
|
engineers being woken up in the night? Can a malicious user abuse the code to
|
||
|
take down a GitLab instance? Will my changes simply make loading a certain page
|
||
|
slower? Will execution time grow exponentially given enough load or data in the
|
||
|
database?
|
||
|
|
||
|
These are all questions one should ask themselves before submitting a merge
|
||
|
request. It may sometimes be difficult to assess the impact, in which case you
|
||
|
should ask a performance specialist to review your code. See the "Reviewing"
|
||
|
section below for more information.
|
||
|
|
||
|
## Performance Review
|
||
|
|
||
|
**Summary:** ask performance specialists to review your code if you're not sure
|
||
|
about the impact.
|
||
|
|
||
|
Sometimes it's hard to assess the impact of a merge request. In this case you
|
||
|
should ask one of the merge request (mini) endbosses to review your changes. You
|
||
|
can find a list of these endbosses at <https://about.gitlab.com/team/>. An
|
||
|
endboss in turn can request a performance specialist to review the changes.
|
||
|
|
||
|
## Query Counts
|
||
|
|
||
|
**Summary:** a merge request **should not** increase the number of executed SQL
|
||
|
queries unless absolutely necessary.
|
||
|
|
||
|
The number of queries executed by the code modified or added by a merge request
|
||
|
must not increase unless absolutely necessary. When building features it's
|
||
|
entirely possible you will need some extra queries, but you should try to keep
|
||
|
this at a minimum.
|
||
|
|
||
|
As an example, say you introduce a feature that updates a number of database
|
||
|
rows with the same value. It may be very tempting (and easy) to write this using
|
||
|
the following pseudo code:
|
||
|
|
||
|
```ruby
|
||
|
objects_to_update.each do |object|
|
||
|
object.some_field = some_value
|
||
|
object.save
|
||
|
end
|
||
|
```
|
||
|
|
||
|
This will end up running one query for every object to update. This code can
|
||
|
easily overload a database given enough rows to update or many instances of this
|
||
|
code running in parallel. This particular problem is known as the
|
||
|
["N+1 query problem"](http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations).
|
||
|
|
||
|
In this particular case the workaround is fairly easy:
|
||
|
|
||
|
```ruby
|
||
|
objects_to_update.update_all(some_field: some_value)
|
||
|
```
|
||
|
|
||
|
This uses ActiveRecord's `update_all` method to update all rows in a single
|
||
|
query. This in turn makes it much harder for this code to overload a database.
|
||
|
|
||
|
## Executing Queries in Loops
|
||
|
|
||
|
**Summary:** SQL queries **must not** be executed in a loop unless absolutely
|
||
|
necessary.
|
||
|
|
||
|
Executing SQL queries in a loop can result in many queries being executed
|
||
|
depending on the number of iterations in a loop. This may work fine for a
|
||
|
development environment with little data, but in a production environment this
|
||
|
can quickly spiral out of control.
|
||
|
|
||
|
There are some cases where this may be needed. If this is the case this should
|
||
|
be clearly mentioned in the merge request description.
|
||
|
|
||
|
## Eager Loading
|
||
|
|
||
|
**Summary:** always eager load associations when retrieving more than one row.
|
||
|
|
||
|
When retrieving multiple database records for which you need to use any
|
||
|
associations you **must** eager load these associations. For example, if you're
|
||
|
retrieving a list of blog posts and you want to display their authors you
|
||
|
**must** eager load the author associations.
|
||
|
|
||
|
In other words, instead of this:
|
||
|
|
||
|
```ruby
|
||
|
Post.all.each do |post|
|
||
|
puts post.author.name
|
||
|
end
|
||
|
```
|
||
|
|
||
|
You should use this:
|
||
|
|
||
|
```ruby
|
||
|
Post.all.includes(:author).each do |post|
|
||
|
puts post.author.name
|
||
|
end
|
||
|
```
|
||
|
|
||
|
## Memory Usage
|
||
|
|
||
|
**Summary:** merge requests **must not** increase memory usage unless absolutely
|
||
|
necessary.
|
||
|
|
||
|
A merge request must not increase the memory usage of GitLab by more than the
|
||
|
absolute bare minimum required by the code. This means that if you have to parse
|
||
|
some large document (e.g. an HTML document) it's best to parse it as a stream
|
||
|
whenever possible, instead of loading the entire input into memory. Sometimes
|
||
|
this isn't possible, in that case this should be stated explicitly in the merge
|
||
|
request.
|
||
|
|
||
|
## Lazy Rendering of UI Elements
|
||
|
|
||
|
**Summary:** only render UI elements when they're actually needed.
|
||
|
|
||
|
Certain UI elements may not always be needed. For example, when hovering over a
|
||
|
diff line there's a small icon displayed that can be used to create a new
|
||
|
comment. Instead of always rendering these kind of elements they should only be
|
||
|
rendered when actually needed. This ensures we don't spend time generating
|
||
|
Haml/HTML when it's not going to be used.
|
||
|
|
||
|
## Instrumenting New Code
|
||
|
|
||
|
**Summary:** always add instrumentation for new classes, modules, and methods.
|
||
|
|
||
|
Newly added classes, modules, and methods must be instrumented. This ensures
|
||
|
we can track the performance of this code over time.
|
||
|
|
||
|
For more information see [Instrumentation](instrumentation.md). This guide
|
||
|
describes how to add instrumentation and where to add it.
|
||
|
|
||
|
## Use of Caching
|
||
|
|
||
|
**Summary:** cache data in memory or in Redis when it's needed multiple times in
|
||
|
a transaction or has to be kept around for a certain time period.
|
||
|
|
||
|
Sometimes certain bits of data have to be re-used in different places during a
|
||
|
transaction. In these cases this data should be cached in memory to remove the
|
||
|
need for running complex operations to fetch the data. You should use Redis if
|
||
|
data should be cached for a certain time period instead of the duration of the
|
||
|
transaction.
|
||
|
|
||
|
For example, say you process multiple snippets of text containiner username
|
||
|
mentions (e.g. `Hello @alice` and `How are you doing @alice?`). By caching the
|
||
|
user objects for every username we can remove the need for running the same
|
||
|
query for every mention of `@alice`.
|
||
|
|
||
|
Caching data per transaction can be done using
|
||
|
[RequestStore](https://github.com/steveklabnik/request_store). Caching data in
|
||
|
Redis can be done using [Rails' caching
|
||
|
system](http://guides.rubyonrails.org/caching_with_rails.html).
|