20 commits

Author SHA1 Message Date
Stan Hu
f14647fdae Expire project caches once per push instead of once per ref
Previously `ProjectCacheWorker` would be scheduled once per ref, which
would generate unnecessary I/O and load on Sidekiq, especially if many
tags or branches were pushed at once. `ProjectCacheWorker` would expire
three items:

1. Repository size: This only needs to be updated once per push.
2. Commit count: This only needs to be updated if the default branch
   is updated.
3. Project method caches: This only needs to be updated if the default
   branch changes, and then only if certain files change (e.g. README,
   CHANGELOG, etc.).

Because the third item requires looking at the actual changes in the
commit deltas, we schedule one `ProjectCacheWorker` to handle the first
two cases, and schedule a separate `ProjectCacheWorker` for the third
case if it is needed. As a result, this brings down the number of
`ProjectCacheWorker` jobs from N to 2.

Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/52046
2019-08-16 19:53:56 +00:00
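The scheduling rule described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: `plan_cache_jobs`, the `Job` struct, the shape of the `changes` array, and the `CACHED_FILES` list are all hypothetical names chosen for the example.

```ruby
# Hypothetical sketch: decide which cache jobs one push needs.
# None of these names come from the GitLab code base.
Job = Struct.new(:project_id, :files, :statistics)

# Assumed list of files whose contents back cached project methods.
CACHED_FILES = %w[README CHANGELOG LICENSE].freeze

# changes: array of hashes such as
#   { ref: "refs/heads/master", changed_files: ["README.md"] }
def plan_cache_jobs(project_id, changes, default_branch:)
  default_ref = "refs/heads/#{default_branch}"
  default_changes = changes.select { |change| change[:ref] == default_ref }

  # Cases 1 and 2: one job per push, regardless of how many refs it contains.
  statistics = [:repository_size]
  statistics << :commit_count unless default_changes.empty?
  jobs = [Job.new(project_id, [], statistics)]

  # Case 3: a second job only if the default branch touched a cached file.
  # Matching on the base name alone is a simplification for this sketch.
  touched = default_changes.flat_map { |change| change[:changed_files] }
  cached = touched.select do |path|
    CACHED_FILES.any? { |name| File.basename(path, ".*").casecmp?(name) }
  end.uniq
  jobs << Job.new(project_id, cached, []) unless cached.empty?

  jobs
end
```

Under these assumptions, a push of 50 tags yields a single job that refreshes only the repository size, and a push that also changes `README.md` on the default branch yields two jobs.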
Mayra Cabrera
ec39a1d63d Schedule namespace aggregation in other contexts
Schedules a Namespace::AggregationSchedule worker if some of the project
statistics are refreshed.

The worker is only executed if the feature flag is enabled.
2019-07-08 15:06:05 +00:00
Peter Marko
40490cc492 Add wiki size to project statistics 2019-05-29 16:08:25 +02:00
Hiroyuki Sato
770f721962 Refactor: extract duplicate steps to a service class 2019-04-05 22:47:20 +09:00
Hiroyuki Sato
074a1797fe Update the project statistics immediately 2019-04-05 00:22:56 +09:00
Hiroyuki Sato
e6780501cb Refactor project_cache_worker_key 2019-04-05 00:22:56 +09:00
Hiroyuki Sato
0adedbb482 Fix the bug that the project statistics is not updated 2019-04-05 00:22:56 +09:00
Thong Kuah
d6b952ad3e Add frozen_string_literal to spec/workers
Adds `# frozen_string_literal: true` to spec/workers ruby files
2019-04-01 13:35:22 -03:00
Sean McGivern
14d2b52b00 Revert "Merge branch '44726-cancel_lease_upon_completion_in_project_cache_worker' into 'master'"
This reverts merge request !20103
2018-07-04 11:04:58 +00:00
Imre Farkas
ae86fd96ae Cancel ExclusiveLease upon completion in ProjectCacheWorker 2018-06-30 11:59:48 +02:00
Douglas Barbosa Alexandre
34dbccb24b Add helper methods to stub Gitlab::ExclusiveLease 2018-06-28 19:24:40 -03:00
Grzegorz Bizon
0430b76441 Enable Style/DotPosition Rubocop 👮 2017-06-21 13:48:12 +00:00
Toon Claes
abc82a2508 Fix ProjectCacheWorker for plain READMEs
The ProjectCacheWorker refreshes cache periodically, but it runs outside Rails
context. So include the ActionView helpers so the `content_tag` method is
available.
2017-05-18 21:10:10 +02:00
Robert Speicher
68e6718932 Use :empty_project where possible in worker specs 2017-03-27 18:45:37 -04:00
Markus Koller
3ef4f74b1a Add more storage statistics
This adds counters for build artifacts and LFS objects, and moves
the preexisting repository_size and commit_count from the projects
table into a new project_statistics table.

The counters are displayed in the administration area for projects
and groups, and are also available through the API for admins (on */all)
and normal users (on */owned).

The statistics are updated through ProjectCacheWorker, which can now
do more granular updates with the new :statistics argument.
2016-12-21 16:39:49 +01:00
Yorick Peterse
ffb9b3ef18 Refactor cache refreshing/expiring
This refactors repository caching so it's possible to selectively
refresh certain caches, instead of just expiring and refreshing
everything.

To allow this, the various methods that were cached (e.g. "tag_count" and
"readme") use a similar pattern that makes expiring and refreshing
their data much easier.

In this new setup caches are refreshed as follows:

1. After a commit (but before running ProjectCacheWorker) we expire some
   basic caches such as the commit count and repository size.

2. ProjectCacheWorker will recalculate the commit count, repository
   size, then refresh a specific set of caches based on the list of
   files changed in a push payload.

This requires a bunch of changes to the various methods that may be
cached. For one, data should not be cached if the branch being used or
the entire repository does not exist. To prevent all these methods from
handling this manually, this is taken care of in
Repository#cache_method_output. Some methods still manually check for
the existence of a repository, but this result is also cached.

With selective flushing implemented, ProjectCacheWorker no longer uses an
exclusive lease for all of its work. Instead, this worker only uses a
lease to limit the number of times the repository size is updated, as
this is a fairly expensive operation.
2016-11-21 15:05:13 +01:00
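The selective refresh described in this commit message can be sketched as follows. This is a toy model rather than the real `Repository` class: `SketchRepository`, `FILE_CACHES`, `refresh_caches_for`, and the in-memory hash standing in for the cache store are assumptions made for illustration. Only the name `cache_method_output` is taken from the message, and its behaviour here is a guess at the idea, not the actual implementation.

```ruby
# Hypothetical sketch of selective cache refreshing.
class SketchRepository
  # Assumed mapping from a changed file's base name to the cached method.
  FILE_CACHES = {
    "README" => :readme,
    "CHANGELOG" => :changelog,
    "LICENSE" => :license
  }.freeze

  attr_reader :computations

  def initialize(exists: true)
    @exists = exists
    @cache = {}                     # stands in for the real cache store
    @computations = Hash.new(0)     # counts how often each value is rebuilt
  end

  def exists?
    @exists
  end

  # Every cached method follows the same pattern.
  FILE_CACHES.each_value do |key|
    define_method(key) { cache_method_output(key) { compute(key) } }
  end

  # Expire and recompute only the caches affected by the changed files.
  def refresh_caches_for(changed_files)
    keys = changed_files
      .map { |path| FILE_CACHES[File.basename(path, ".*").upcase] }
      .compact
      .uniq
    keys.each do |key|
      @cache.delete(key)
      public_send(key)
    end
    keys
  end

  private

  # Central guard: nothing is computed or cached for a missing repository.
  def cache_method_output(key)
    return nil unless exists?

    @cache.fetch(key) { @cache[key] = yield }
  end

  def compute(key)
    @computations[key] += 1
    "#{key} contents"
  end
end
```

In this sketch, calling `refresh_caches_for(["README.md", "app/models/user.rb"])` recomputes only `readme` and leaves the other cached values untouched.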
Yorick Peterse
3b4af59a5f Don't schedule ProjectCacheWorker unless needed
This changes ProjectCacheWorker.perform_async so it only schedules a job
when no lease for the given project is present. This ensures we don't
end up scheduling hundreds of jobs when they won't be executed anyway.
2016-10-25 16:02:36 +02:00
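The guard described in this commit message can be sketched as follows. This is an illustration under simplifying assumptions: `SketchCacheWorker`, `lease_exists?`, `take_lease`, and the in-memory queue are hypothetical stand-ins, and the real worker relies on Sidekiq and an exclusive lease rather than plain Ruby hashes and arrays.

```ruby
# Hypothetical sketch: skip scheduling while a lease for the project exists.
class SketchCacheWorker
  @leases = {}   # project_id => true while a lease is held
  @queue = []    # stands in for the job queue

  class << self
    attr_reader :queue

    def lease_exists?(project_id)
      @leases.key?(project_id)
    end

    # Called when a job starts running for the project.
    def take_lease(project_id)
      @leases[project_id] = true
    end

    # Returns true if a job was enqueued, false if it was skipped.
    def perform_async(project_id)
      return false if lease_exists?(project_id)

      @queue << project_id
      true
    end
  end
end
```

With this guard, a hundred pushes arriving while the lease is held enqueue nothing, instead of enqueuing a hundred jobs that would each exit immediately.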
Yorick Peterse
bc31a489dd Restrict ProjectCacheWorker jobs to one per 15 min
This ensures ProjectCacheWorker jobs for a given project are performed
at most once per 15 minutes. This should reduce disk load a bit in cases
where there are multiple pushes happening (which should schedule
multiple ProjectCacheWorker jobs).
2016-10-20 13:20:47 +02:00
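The 15-minute restriction described in this commit message can be sketched as follows. This is a minimal in-memory model, not the lease used by the commit: `SketchLease`, `try_obtain`, `perform_with_lease`, the lease key format, and the injectable clock are all assumptions made so the example runs on its own.

```ruby
# Hypothetical sketch of a time-limited lease that throttles work per project.
class SketchLease
  def initialize(timeout:, clock: -> { Time.now })
    @timeout = timeout     # lease duration in seconds
    @clock = clock         # injectable so the behaviour can be tested
    @expires_at = {}
  end

  # Returns true if the lease was obtained, false if it is still held.
  def try_obtain(key)
    now = @clock.call
    return false if @expires_at.key?(key) && @expires_at[key] > now

    @expires_at[key] = now + @timeout
    true
  end
end

LEASE_TIMEOUT = 15 * 60 # seconds

# Runs the block only if no job for this project ran within the timeout.
def perform_with_lease(lease, project_id)
  return :skipped unless lease.try_obtain("project_cache_worker:#{project_id}")

  yield
  :performed
end
```

A second call for the same project within 15 minutes returns `:skipped` without running the block. Once the timeout has elapsed, the next call obtains a fresh lease and runs again.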
Grzegorz Bizon
9e211091a8 Enable Style/EmptyLines cop, remove redundant ones 2016-07-01 21:56:17 +02:00
Stan Hu
720ef51bd9 Check if repo exists before attempting to update cache info
Closes #14361
2016-03-27 06:17:49 -07:00