When a pipeline fails to create in `PostReceive`, the error is silently
discarded, making it difficult to understand why a pipeline was not
created. We now add a Sidekiq warning message for this. Adding a Sentry
exception when this happens would generate a lot of noise for invalid CI
files.
Relates to https://gitlab.com/gitlab-org/gitlab-ee/issues/14720
Previously `ProjectCacheWorker` would be scheduled once per ref, which
would generate unnecessary I/O and load on Sidekiq, especially if many
tags or branches were pushed at once. `ProjectCacheWorker` would expire
three items:
1. Repository size: This only needs to be updated once per push.
2. Commit count: This only needs to be updated if the default branch
is updated.
3. Project method caches: This only needs to be updated if the default
branch changes, but only if certain files change (e.g. README,
CHANGELOG, etc.).
Because the third item requires looking at the actual changes in the
commit deltas, we schedule one `ProjectCacheWorker` to handle the first
two cases, and schedule a separate `ProjectCacheWorker` for the third
case if it is needed. As a result, this brings down the number of
`ProjectCacheWorker` jobs from N to 2.
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/52046
Instead of checking if a commit already exists in the upstream project
in its ProcessCommitWorker and bailing out if it does, we check the
existence of all commits in bulk in Git::BranchHooksService, so that we
can skip scheduling ProcessCommitWorker jobs for those commits
that already exist upstream entirely.
Previously each tag in a push would invoke the Gitaly `FindAllTags` RPC
since the tag cache would be invalidated with every tag.
We can eliminate those extraneous calls by expiring the tag cache once
in `PostReceive` and taking advantage of the cached tags.
Relates to https://gitlab.com/gitlab-org/gitlab-ce/issues/65795
This commit reduces I/O load and memory utilization during PostReceive
for the common case when no project hooks or services are set up.
We saw a Gitaly N+1 issue in `CommitDelta` when many tags or branches
are pushed. We can reduce this overhead in the common case because we
observe that most new projects do not have any Web hooks or services,
especially when they are first created. Previously, `BaseHooksService`
unconditionally iterated through the last 20 commits of each ref to
build the `push_data` structure. The `push_data` structured was used in
numerous places:
1. Building the push payload in `EventCreateService`
2. Creating a CI pipeline
3. Executing project Web or system hooks
4. Executing project services
5. As the return value of `BaseHooksService#execute`
6. `BranchHooksService#invalidated_file_types`
We only need to generate the full `push_data` for items 3, 4, and 6.
Item 1: `EventCreateService` only needs the last commit and doesn't
actually need the commit deltas.
Item 2: In addition, `Ci::CreatePipelineService` only needed a subset of
the parameters.
Item 5: The return value of `BaseHooksService#execute` also wasn't being
used anywhere.
Item 6: This is only used when pushing to the default branch, so if
many tags are pushed we can save significant I/O here.
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/65878
Fic
In certain cases, GitLab can miss a PostReceive invocation the first
time a branch is pushed. When this happens, the "branch created" hooks
are not run, which means various features don't work until the branch
is deleted and pushed again.
This MR changes the `Git::BranchPushService` so it checks the cache of
existing branches in addition to the `oldrev` reported for the branch.
If the branch name isn't in the cache, chances are we haven't run the
service yet (it's what refreshes the cache), so we can go ahead and
run it, even through `oldrev` is set.
If the cache has been cleared by some other means in the meantime, then
we'll still fail to run the hooks when we should. Fixing that in the
general case is a larger problem, and we'd need to devote significant
engineering effort to it.
There's a chance that we'll run the relevant hooks *multiple times*
with this change, if there's a race between the branch being created,
and the `PostReceive` worker being run multiple times, but this can
already happen, since Sidekiq is "at-least-once" execution of jobs. So,
this should be safe.
Commit messages are not processed for references to issues when
creating the default branch on push. This was expected
behavior (probably to avoid performance problems when first pushing a
repository with thousands of commits). However, this is not an issue
because we always limit the number of commits to process to 100
regardless of whether we are creating the default branch or not.
Remote mirrors were only being updated after pushes to branches, not
tags. This change consolidates the functionality into
Git::BaseHooksService so that both tags and branches will now update
remote mirrors.
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/51240