Detect and keep track of flaky specs See merge request !13021
22 KiB
Testing Standards and Style Guidelines
This guide outlines standards and best practices for automated testing of GitLab CE and EE.
It is meant to be an extension of the thoughtbot testing styleguide. If this guide defines a rule that contradicts the thoughtbot guide, this guide takes precedence. Some guidelines may be repeated verbatim to stress their importance.
Definitions
Unit tests
Formal definition: https://en.wikipedia.org/wiki/Unit_testing
These kind of tests ensure that a single unit of code (a method) works as expected (given an input, it has a predictable output). These tests should be isolated as much as possible. For example, model methods that don't do anything with the database shouldn't need a DB record. Classes that don't need database records should use stubs/doubles as much as possible.
Code path | Tests path | Testing engine | Notes |
---|---|---|---|
app/finders/ |
spec/finders/ |
RSpec | |
app/helpers/ |
spec/helpers/ |
RSpec | |
app/db/{post_,}migrate/ |
spec/migrations/ |
RSpec | More details at spec/migrations/README.md . |
app/policies/ |
spec/policies/ |
RSpec | |
app/presenters/ |
spec/presenters/ |
RSpec | |
app/routing/ |
spec/routing/ |
RSpec | |
app/serializers/ |
spec/serializers/ |
RSpec | |
app/services/ |
spec/services/ |
RSpec | |
app/tasks/ |
spec/tasks/ |
RSpec | |
app/uploaders/ |
spec/uploaders/ |
RSpec | |
app/views/ |
spec/views/ |
RSpec | |
app/workers/ |
spec/workers/ |
RSpec | |
app/assets/javascripts/ |
spec/javascripts/ |
Karma | More details in the JavaScript section. |
Integration tests
Formal definition: https://en.wikipedia.org/wiki/Integration_testing
These kind of tests ensure that individual parts of the application work well together, without the overhead of the actual app environment (i.e. the browser). These tests should assert at the request/response level: status code, headers, body. They're useful to test permissions, redirections, what view is rendered etc.
Code path | Tests path | Testing engine | Notes |
---|---|---|---|
app/controllers/ |
spec/controllers/ |
RSpec | |
app/mailers/ |
spec/mailers/ |
RSpec | |
lib/api/ |
spec/requests/api/ |
RSpec | |
lib/ci/api/ |
spec/requests/ci/api/ |
RSpec | |
app/assets/javascripts/ |
spec/javascripts/ |
Karma | More details in the JavaScript section. |
About controller tests
In an ideal world, controllers should be thin. However, when this is not the case, it's acceptable to write a system/feature test without JavaScript instead of a controller test. The reason is that testing a fat controller usually involves a lot of stubbing, things like:
controller.instance_variable_set(:@user, user)
and use methods which are deprecated in Rails 5 (#23768).
About Karma
As you may have noticed, Karma is both in the Unit tests and the Integration tests category. That's because Karma is a tool that provides an environment to run JavaScript tests, so you can either run unit tests (e.g. test a single JavaScript method), or integration tests (e.g. test a component that is composed of multiple components).
System tests or Feature tests
Formal definition: https://en.wikipedia.org/wiki/System_testing.
These kind of tests ensure the application works as expected from a user point of view (aka black-box testing). These tests should test a happy path for a given page or set of pages, and a test case should be added for any regression that couldn't have been caught at lower levels with better tests (i.e. if a regression is found, regression tests should be added at the lowest-level possible).
Tests path | Testing engine | Notes |
---|---|---|
spec/features/ |
Capybara + RSpec | If your spec has the :js metadata, the browser driver will be Poltergeist, otherwise it's using RackTest. |
features/ |
Spinach | Spinach tests are deprecated, you shouldn't add new Spinach tests. |
Best practices
- Create only the necessary records in the database
- Test a happy path and a less happy path but that's it
- Every other possible path should be tested with Unit or Integration tests
- Test what's displayed on the page, not the internals of ActiveRecord models.
For instance, if you want to verify that a record was created, add
expectations that its attributes are displayed on the page, not that
Model.count
increased by one. - It's ok to look for DOM elements but don't abuse it since it makes the tests more brittle
If we're confident that the low-level components work well (and we should be if we have enough Unit & Integration tests), we shouldn't need to duplicate their thorough testing at the System test level.
It's very easy to add tests, but a lot harder to remove or improve tests, so one should take care of not introducing too many (slow and duplicated) specs.
The reasons why we should follow these best practices are as follows:
- System tests are slow to run since they spin up the entire application stack in a headless browser, and even slower when they integrate a JS driver
- When system tests run with a JavaScript driver, the tests are run in a different thread than the application. This means it does not share a database connection and your test will have to commit the transactions in order for the running application to see the data (and vice-versa). In that case we need to truncate the database after each spec instead of simply rolling back a transaction (the faster strategy that's in use for other kind of tests). This is slower than transactions, however, so we want to use truncation only when necessary.
Black-box tests or End-to-end tests
GitLab consists of multiple pieces such as GitLab Shell, GitLab Workhorse, Gitaly, GitLab Pages, GitLab Runner, and GitLab Rails. All theses pieces are configured and packaged by GitLab Omnibus.
GitLab QA is a tool that allows to test that all these pieces integrate well together by building a Docker image for a given version of GitLab Rails and running feature tests (i.e. using Capybara) against it.
The actual test scenarios and steps are part of GitLab Rails so that they're always in-sync with the codebase.
How to test at the correct level?
As many things in life, deciding what to test at each level of testing is a trade-off:
- Unit tests are usually cheap, and you should consider them like the basement of your house: you need them to be confident that your code is behaving correctly. However if you run only unit tests without integration / system tests, you might miss the big picture!
- Integration tests are a bit more expensive, but don't abuse them. A system test is often better than an integration test that is stubbing a lot of internals.
- System tests are expensive (compared to unit tests), even more if they require a JavaScript driver. Make sure to follow the guidelines in the Speed section.
Another way to see it is to think about the "cost of tests", this is well explained in this article and the basic idea is that the cost of a test includes:
- The time it takes to write the test
- The time it takes to run the test every time the suite runs
- The time it takes to understand the test
- The time it takes to fix the test if it breaks and the underlying code is OK
- Maybe, the time it takes to change the code to make the code testable.
Frontend testing
Please consult the dedicated "Frontend testing" guide.
RSpec
General Guidelines
- Use a single, top-level
describe ClassName
block. - Use
.method
to describe class methods and#method
to describe instance methods. - Use
context
to test branching logic. - Don't assert against the absolute value of a sequence-generated attribute (see Gotchas).
- Try to match the ordering of tests to the ordering within the class.
- Try to follow the Four-Phase Test pattern, using newlines to separate phases.
- Use
Gitlab.config.gitlab.host
rather than hard coding'localhost'
- Don't assert against the absolute value of a sequence-generated attribute (see Gotchas).
- Don't supply the
:each
argument to hooks since it's the default. - On
before
andafter
hooks, prefer it scoped to:context
over:all
Automatic retries and flaky tests detection
On our CI, we use rspec-retry to automatically retry a failing example a few
times (see spec/spec_helper.rb
for the precise retries count).
We also use a home-made RspecFlaky::Listener
listener which records flaky
examples in a JSON report file on master
(retrieve-tests-metadata
and update-tests-metadata
jobs), and warns when a new flaky example
is detected in any other branch (flaky-examples-check
job). In the future, the
flaky-examples-check
job will not be allowed to fail.
let
variables
GitLab's RSpec suite has made extensive use of let
variables to reduce
duplication. However, this sometimes comes at the cost of clarity,
so we need to set some guidelines for their use going forward:
let
variables are preferable to instance variables. Local variables are preferable tolet
variables.- Use
let
to reduce duplication throughout an entire spec file. - Don't use
let
to define variables used by a single test; define them as local variables inside the test'sit
block. - Don't define a
let
variable inside the top-leveldescribe
block that's only used in a more deeply-nestedcontext
ordescribe
block. Keep the definition as close as possible to where it's used. - Try to avoid overriding the definition of one
let
variable with another. - Don't define a
let
variable that's only used by the definition of another. Use a helper method instead.
set
variables
In some cases there is no need to recreate the same object for tests again for
each example. For example, a project is needed to test issues on the same
project, one project will do for the entire file. This can be achieved by using
set
in the same way you would use let
.
rspec-set
only works on ActiveRecord objects, and before new examples it
reloads or recreates the model, only if needed. That is, when you changed
properties or destroyed the object.
There is one gotcha; you can't reference a model defined in a let
block in a
set
block.
Time-sensitive tests
Timecop is available in our Ruby-based tests for verifying things that are time-sensitive. Any test that exercises or verifies something time-sensitive should make use of Timecop to prevent transient test failures.
Example:
it 'is overdue' do
issue = build(:issue, due_date: Date.tomorrow)
Timecop.freeze(3.days.from_now) do
expect(issue).to be_overdue
end
end
System / Feature tests
- Feature specs should be named
ROLE_ACTION_spec.rb
, such asuser_changes_password_spec.rb
. - Use only one
feature
block per feature spec file. - Use scenario titles that describe the success and failure paths.
- Avoid scenario titles that add no information, such as "successfully".
- Avoid scenario titles that repeat the feature title.
Matchers
Custom matchers should be created to clarify the intent and/or hide the
complexity of RSpec expectations.They should be placed under
spec/support/matchers/
. Matchers can be placed in subfolder if they apply to
a certain type of specs only (e.g. features, requests etc.) but shouldn't be if
they apply to multiple type of specs.
have_gitlab_http_status
Prefer have_gitlab_http_status
over have_http_status
because the former
could also show the response body whenever the status mismatched. This would
be very useful whenever some tests start breaking and we would love to know
why without editing the source and rerun the tests.
This is especially useful whenever it's showing 500 internal server error.
Shared contexts
All shared contexts should be be placed under spec/support/shared_contexts/
.
Shared contexts can be placed in subfolder if they apply to a certain type of
specs only (e.g. features, requests etc.) but shouldn't be if they apply to
multiple type of specs.
Each file should include only one context and have a descriptive name, e.g.
spec/support/shared_contexts/controllers/githubish_import_controller_shared_context.rb
.
Shared examples
All shared examples should be be placed under spec/support/shared_examples/
.
Shared examples can be placed in subfolder if they apply to a certain type of
specs only (e.g. features, requests etc.) but shouldn't be if they apply to
multiple type of specs.
Each file should include only one context and have a descriptive name, e.g.
spec/support/shared_examples/controllers/githubish_import_controller_shared_example.rb
.
Helpers
Helpers are usually modules that provide some methods to hide the complexity of
specific RSpec examples. You can define helpers in RSpec files if they're not
intended to be shared with other specs. Otherwise, they should be be placed
under spec/support/helpers/
. Helpers can be placed in subfolder if they apply
to a certain type of specs only (e.g. features, requests etc.) but shouldn't be
if they apply to multiple type of specs.
Helpers should follow the Rails naming / namespacing convention. For instance
spec/support/helpers/cycle_analytics_helpers.rb
should define:
module Spec
module Support
module Helpers
module CycleAnalyticsHelpers
def create_commit_referencing_issue(issue, branch_name: random_git_name)
project.repository.add_branch(user, branch_name, 'master')
create_commit("Commit for ##{issue.iid}", issue.project, user, branch_name)
end
end
end
end
end
Helpers should not change the RSpec config. For instance, the helpers module described above should not include:
RSpec.configure do |config|
config.include Spec::Support::Helpers::CycleAnalyticsHelpers
end
Factories
GitLab uses factory_girl as a test fixture replacement.
- Factory definitions live in
spec/factories/
, named using the pluralization of their corresponding model (User
factories are defined inusers.rb
). - There should be only one top-level factory definition per file.
- FactoryGirl methods are mixed in to all RSpec groups. This means you can (and
should) call
create(...)
instead ofFactoryGirl.create(...)
. - Make use of traits to clean up definitions and usages.
- When defining a factory, don't define attributes that are not required for the resulting record to pass validation.
- When instantiating from a factory, don't supply attributes that aren't required by the test.
- Factories don't have to be limited to
ActiveRecord
objects. See example.
Fixtures
All fixtures should be be placed under spec/fixtures/
.
Config
RSpec config files are files that change the RSpec config (i.e.
RSpec.configure do |config|
blocks). They should be placed under
spec/support/config/
.
Each file should be related to a specific domain, e.g.
spec/support/config/capybara.rb
, spec/support/config/carrierwave.rb
, etc.
Helpers can be included in the spec/support/config/rspec.rb
file. If a
helpers module applies only to a certain kind of specs, it should add modifiers
to the config.include
call. For instance if
spec/support/helpers/cycle_analytics_helpers.rb
applies to :lib
and
type: :model
specs only, you would write the following:
RSpec.configure do |config|
config.include Spec::Support::Helpers::CycleAnalyticsHelpers, :lib
config.include Spec::Support::Helpers::CycleAnalyticsHelpers, type: :model
end
Testing Rake Tasks
To make testing Rake tasks a little easier, there is a helper that can be included
in lieu of the standard Spec helper. Instead of require 'spec_helper'
, use
require 'rake_helper'
. The helper includes spec_helper
for you, and configures
a few other things to make testing Rake tasks easier.
At a minimum, requiring the Rake helper will redirect stdout
, include the
runtime task helpers, and include the RakeHelpers
Spec support module.
The RakeHelpers
module exposes a run_rake_task(<task>)
method to make
executing tasks simple. See spec/support/rake_helpers.rb
for all available
methods.
Example:
require 'rake_helper'
describe 'gitlab:shell rake tasks' do
before do
Rake.application.rake_require 'tasks/gitlab/shell'
stub_warn_user_is_not_gitlab
end
describe 'install task' do
it 'invokes create_hooks task' do
expect(Rake::Task['gitlab:shell:create_hooks']).to receive(:invoke)
run_rake_task('gitlab:shell:install')
end
end
end
Test speed
GitLab has a massive test suite that, without parallelization, can take hours to run. It's important that we make an effort to write tests that are accurate and effective as well as fast.
Here are some things to keep in mind regarding test performance:
double
andspy
are faster thanFactoryGirl.build(...)
FactoryGirl.build(...)
and.build_stubbed
are faster than.create
.- Don't
create
an object whenbuild
,build_stubbed
,attributes_for
,spy
, ordouble
will do. Database persistence is slow! - Don't mark a feature as requiring JavaScript (through
@javascript
in Spinach or:js
in RSpec) unless it's actually required for the test to be valid. Headless browser testing is slow!
Test suite parallelization on the CI
Our current CI parallelization setup is as follows:
- The
knapsack
job in the prepare stage that is supposed to ensure we have aknapsack/${CI_PROJECT_NAME}/rspec_report-master.json
file:
- The
knapsack/${CI_PROJECT_NAME}/rspec_report-master.json
file is fetched from S3, if it's not here we initialize the file with{}
.
- Each
rspec x y
job are run withknapsack rspec
and should have an evenly distributed share of tests:
- It works because the jobs have access to the
knapsack/${CI_PROJECT_NAME}/rspec_report-master.json
since the "artifacts from all previous stages are passed by default". 1 - the jobs set their own report path to
KNAPSACK_REPORT_PATH=knapsack/${CI_PROJECT_NAME}/${JOB_NAME[0]}_node_${CI_NODE_INDEX}_${CI_NODE_TOTAL}_report.json
. - if knapsack is doing its job, test files that are run should be listed under
Report specs
, not underLeftover specs
.
- The
update-knapsack
job takes all theknapsack/${CI_PROJECT_NAME}/${JOB_NAME[0]}_node_${CI_NODE_INDEX}_${CI_NODE_TOTAL}_report.json
files from therspec x y
jobs and merge them all together into a singleknapsack/${CI_PROJECT_NAME}/rspec_report-master.json
file that is then uploaded to S3.
After that, the next pipeline will use the up-to-date
knapsack/${CI_PROJECT_NAME}/rspec_report-master.json
file. The same strategy
is used for Spinach tests as well.
Monitoring
The GitLab test suite is monitored for the master
branch, and any branch
that includes rspec-profile
in their name.
A public dashboard is available for everyone to see. Feel free to look at the slowest test files and try to improve them.
CI setup
- On CE, the test suite only runs against PostgreSQL by default. We additionally
run the suite against MySQL for tags,
master
, and any branch that includesmysql
in the name. - On EE, the test suite always runs both PostgreSQL and MySQL.
- Rails logging to
log/test.log
is disabled by default in CI for performance reasons. To override this setting, provide theRAILS_ENABLE_TEST_LOG
environment variable.
Spinach (feature) tests
GitLab moved from Cucumber to Spinach for its feature/integration tests in September 2012.
As of March 2016, we are trying to avoid adding new Spinach tests going forward, opting for RSpec feature specs.
Adding new Spinach scenarios is acceptable only if the new scenario requires
no more than one new step
definition. If more than that is required, the
test should be re-implemented using RSpec instead.
Return to Development documentation
-
/ci/yaml/README.html#dependencies ↩︎