f14647fdae
Previously `ProjectCacheWorker` would be scheduled once per ref, which would generate unnecessary I/O and load on Sidekiq, especially if many tags or branches were pushed at once. `ProjectCacheWorker` would expire three items: 1. Repository size: This only needs to be updated once per push. 2. Commit count: This only needs to be updated if the default branch is updated. 3. Project method caches: This only needs to be updated if the default branch changes, but only if certain files change (e.g. README, CHANGELOG, etc.). Because the third item requires looking at the actual changes in the commit deltas, we schedule one `ProjectCacheWorker` to handle the first two cases, and schedule a separate `ProjectCacheWorker` for the third case if it is needed. As a result, this brings down the number of `ProjectCacheWorker` jobs from N to 2. Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/52046
57 lines
2 KiB
Ruby
57 lines
2 KiB
Ruby
# frozen_string_literal: true
|
|
|
|
# Worker for updating any project specific caches.
|
|
class ProjectCacheWorker
|
|
include ApplicationWorker
|
|
|
|
LEASE_TIMEOUT = 15.minutes.to_i
|
|
|
|
# project_id - The ID of the project for which to flush the cache.
|
|
# files - An Array containing extra types of files to refresh such as
|
|
# `:readme` to flush the README and `:changelog` to flush the
|
|
# CHANGELOG.
|
|
# statistics - An Array containing columns from ProjectStatistics to
|
|
# refresh, if empty all columns will be refreshed
|
|
# refresh_statistics - A boolean that determines whether project statistics should
|
|
# be updated.
|
|
# rubocop: disable CodeReuse/ActiveRecord
|
|
def perform(project_id, files = [], statistics = [], refresh_statistics = true)
|
|
project = Project.find_by(id: project_id)
|
|
|
|
return unless project
|
|
|
|
update_statistics(project, statistics) if refresh_statistics
|
|
|
|
return unless project.repository.exists?
|
|
|
|
project.repository.refresh_method_caches(files.map(&:to_sym))
|
|
|
|
project.cleanup
|
|
end
|
|
# rubocop: enable CodeReuse/ActiveRecord
|
|
|
|
# NOTE: triggering both an immediate update and one in 15 minutes if we
|
|
# successfully obtain the lease. That way, we only need to wait for the
|
|
# statistics to become accurate if they were already updated once in the
|
|
# last 15 minutes.
|
|
def update_statistics(project, statistics = [])
|
|
return if Gitlab::Database.read_only?
|
|
return unless try_obtain_lease_for(project.id, statistics)
|
|
|
|
Projects::UpdateStatisticsService.new(project, nil, statistics: statistics).execute
|
|
|
|
UpdateProjectStatisticsWorker.perform_in(LEASE_TIMEOUT, project.id, statistics)
|
|
end
|
|
|
|
private
|
|
|
|
def try_obtain_lease_for(project_id, statistics)
|
|
Gitlab::ExclusiveLease
|
|
.new(project_cache_worker_key(project_id, statistics), timeout: LEASE_TIMEOUT)
|
|
.try_obtain
|
|
end
|
|
|
|
def project_cache_worker_key(project_id, statistics)
|
|
["project_cache_worker", project_id, *statistics.sort].join(":")
|
|
end
|
|
end
|