kotovalexarian-likes-gitlab/gitlab-org--gitlab-foss

GitLab Bot 7510df057e Add latest changes from gitlab-org/gitlab@master

2022-09-30 18:08:31 +00:00

stage	group	info
none	unassigned	To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments

Flaky tests

What's a flaky test?

It's a test that sometimes fails but eventually passes if you retry it enough times.

Quarantined tests

When a test frequently fails in main, create a ~"failure::flaky-test" issue.

If the test cannot be fixed in a timely fashion, it hurts the productivity of all developers, so it should be quarantined: assign the :quarantine metadata with the issue URL to the test, and add the ~"quarantined test" label to the issue.

it 'succeeds', quarantine: 'https://gitlab.com/gitlab-org/gitlab/-/issues/12345' do
  expect(response).to have_gitlab_http_status(:ok)
end

This means it is skipped unless run with --tag quarantine:

bin/rspec --tag quarantine
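
As a rough illustration of that skip rule, the decision can be modelled in plain Ruby. This is a simplified sketch, not GitLab's actual RSpec configuration; the `run_example?` method and its `requested_tags` argument are hypothetical names:

```ruby
# Hypothetical model of the quarantine rule. With no tags requested, every
# example runs except quarantined ones. With tags requested (as with
# `--tag quarantine`), only examples carrying a requested tag run.
def run_example?(metadata, requested_tags)
  if requested_tags.empty?
    !metadata.key?(:quarantine)
  else
    requested_tags.any? { |tag| metadata[tag] }
  end
end

quarantined = { quarantine: 'https://gitlab.com/gitlab-org/gitlab/-/issues/12345' }

puts run_example?(quarantined, [])            # quarantined example, normal run
puts run_example?(quarantined, [:quarantine]) # quarantined example, tag requested
puts run_example?({}, [])                     # ordinary example, normal run
```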

Once a test is in quarantine, there are 3 choices:

Fix the test (that is, get rid of its flakiness).
Move the test to a lower level of testing.
Remove the test entirely (for example, because there's already a lower-level test, it duplicates another same-level test, or it tests too much).

Automatic retries and flaky tests detection

On our CI, we use RSpec::Retry to automatically retry a failing example a few times (see spec/spec_helper.rb for the precise retries count).

We also use a custom RspecFlaky::Listener. This listener runs in the update-tests-metadata job in maintenance scheduled pipelines on the master branch, and saves flaky examples to rspec/flaky/report-suite.json. The report file is then retrieved by the retrieve-tests-metadata job in all pipelines.

This was originally implemented in: https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/13021.

If you want to enable retries locally, you can use the RETRIES environment variable. For instance, RETRIES=1 bin/rspec ... retries the failing examples once.
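
To make "retry once" concrete, here is a minimal sketch of the retry idea in plain Ruby. The `with_retries` helper is hypothetical and is not the RSpec::Retry API; it only assumes that RETRIES holds the number of extra attempts after the first run:

```ruby
# Hypothetical helper, not the RSpec::Retry API: runs the block once and
# then up to `retries` more times if it raises.
def with_retries(retries)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts <= retries
    raise
  end
end

# Assumption for this sketch: RETRIES is the number of extra attempts.
retries = ENV.fetch('RETRIES', '0').to_i

# A simulated flaky example that fails on its first attempt only.
begin
  with_retries(retries) do |attempt|
    raise 'flaky failure' if attempt == 1

    puts "passed on attempt #{attempt}"
  end
rescue StandardError => e
  puts "failed after #{retries + 1} attempt(s): #{e.message}"
end
```

Run without RETRIES, the simulated example fails; run with RETRIES=1, the second attempt passes, which is exactly the behavior that can hide a flaky test.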

To generate the reports locally, use the FLAKY_RSPEC_GENERATE_REPORT environment variable. For example, FLAKY_RSPEC_GENERATE_REPORT=1 bin/rspec ....
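
The detection idea can be sketched as follows: an example is treated as flaky when it failed at least once and then passed during the same run. The `FlakyRecorder` class and the JSON layout below are invented for illustration and are not the real RspecFlaky::Listener or its report format:

```ruby
require 'json'

# Hypothetical recorder, not RspecFlaky::Listener itself: an example counts
# as flaky when it failed at least once and its last attempt passed.
class FlakyRecorder
  def initialize
    @attempts = Hash.new { |hash, id| hash[id] = [] }
  end

  # status is :passed or :failed for one attempt of the example.
  def record(example_id, status)
    @attempts[example_id] << status
  end

  def flaky_examples
    flaky = @attempts.select do |_id, statuses|
      statuses.include?(:failed) && statuses.last == :passed
    end
    flaky.keys
  end

  # The report layout here is invented for illustration.
  def to_json_report
    entries = flaky_examples.map { |id| [id, { 'example_id' => id }] }.to_h
    JSON.pretty_generate(entries)
  end
end

recorder = FlakyRecorder.new
recorder.record('spec/a_spec.rb[1:1]', :failed)
recorder.record('spec/a_spec.rb[1:1]', :passed)
recorder.record('spec/b_spec.rb[1:1]', :passed)
puts recorder.to_json_report
```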

Usage of the rspec/flaky/report-suite.json report

The rspec/flaky/report-suite.json report is:

Used for automatically skipping known flaky tests.
Imported into Snowflake once per day, for monitoring with the internal dashboard.
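
A minimal sketch of the first use, assuming the report is a JSON object keyed by example ID. That layout, the "note" field, and the `skip_known_flaky` lambda are assumptions made for this example; the real report and skipping mechanism may differ:

```ruby
require 'json'
require 'set'

# Assumed layout for this sketch: a JSON object keyed by example ID.
# The "note" field is invented; the real report may differ.
report = <<~REPORT
  {
    "spec/models/user_spec.rb[1:2]": { "note": "hypothetical entry" }
  }
REPORT

known_flaky_ids = JSON.parse(report).keys.to_set

# Returns true when the example ID appears in the loaded report.
skip_known_flaky = ->(example_id) { known_flaky_ids.include?(example_id) }

puts skip_known_flaky.call('spec/models/user_spec.rb[1:2]')
puts skip_known_flaky.call('spec/models/project_spec.rb[1:1]')
```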

Problems we had in the past at GitLab

Order-dependent flaky tests

These flaky tests can fail depending on the order they run with other tests. For example:

https://gitlab.com/gitlab-org/gitlab/-/issues/327668

To identify the tests that lead to such a failure, we can use scripts/rspec_bisect_flaky, which gives us the minimal test combination that reproduces the failure:

First obtain the list of specs that ran before the flaky test. You can search for the list under Knapsack node specs: in the CI job output log.

Save the list of specs as a file, and run:

cat knapsack_specs.txt | xargs scripts/rspec_bisect_flaky

If there is an order-dependency issue, the script above will print the minimal reproduction.
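
The narrowing idea behind bisection can be sketched in a few lines. This is a simplified illustration, not the script's implementation: `minimal_culprits` is a hypothetical name, and the block stands in for "run these specs before the flaky test and report whether it fails". It assumes the full list is already known to reproduce the failure:

```ruby
# Hypothetical sketch of bisection. Repeatedly halves the candidate list
# and keeps the half that still reproduces the failure.
def minimal_culprits(candidates, &fails_with)
  return candidates if candidates.size <= 1

  left, right = candidates.each_slice((candidates.size / 2.0).ceil).to_a
  return minimal_culprits(left, &fails_with) if fails_with.call(left)
  return minimal_culprits(right, &fails_with) if fails_with.call(right)

  candidates # the failure needs specs from both halves; stop narrowing
end

specs = %w[a_spec.rb b_spec.rb c_spec.rb d_spec.rb]

# Simulated check: the flaky test only fails when c_spec.rb ran before it.
culprits = minimal_culprits(specs) { |subset| subset.include?('c_spec.rb') }
p culprits # => ["c_spec.rb"]
```

When the failure depends on specs in both halves, this sketch stops early and returns a set that may not be minimal; a real bisection tool handles that case more thoroughly.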

Time-sensitive flaky tests

Hanging specs

If a spec hangs, it might be caused by a bug in Rails:

Resources

Return to Testing documentation
