# frozen_string_literal: true

module Projects
  class UpdateRemoteMirrorService < BaseService
    include Gitlab::Utils::StrongMemoize

    # Retry strategy for remote mirrors:
    #
    # Preventing two simultaneous updates: instead of using
    # `RemoteMirror#update_status` and raising an error when an update is
    # already running, we take a `Gitlab::ExclusiveLease`. When we fail to
    # obtain the lease after 3 tries, 30 seconds apart, we bail and
    # reschedule (faster for protected branches). If the mirror has already
    # been updated since the job was scheduled, the job is skipped.
    #
    # Errors on the remote side: when an update fails with a
    # `Gitlab::Git::CommandError` (for example, because branches have
    # diverged), we retry up to 3 times, scheduled 1 or 5 minutes apart.
    # In between, the mirror is marked as "to_retry" and the error is
    # visible to the user on the settings page. After 3 tries we mark the
    # mirror as failed and notify the user; the next relevant event
    # triggers a new refresh. These errors are not tracked in Sentry, as
    # it's unlikely we can do anything about them.
    #
    # Errors on our side: an unexpected error marks the mirror as failed,
    # but the job is still retried through the regular Sidekiq retries with
    # backoff. The error is reported to Sentry, since it likely needs our
    # attention.
    MAX_TRIES = 3

    def execute(remote_mirror, tries)
      return success unless remote_mirror.enabled?

      # Blocked URLs are a hard failure, no need to attempt to retry
      if Gitlab::UrlBlocker.blocked_url?(normalized_url(remote_mirror.url))
        hard_retry_or_fail(remote_mirror, _('The remote mirror URL is invalid.'), tries)

        return error(remote_mirror.last_error)
      end

      update_mirror(remote_mirror)

      success
    rescue Gitlab::Git::CommandError => e
      # This happens if one of the gitaly calls above fails, for example when
      # branches have diverged, or the pre-receive hook fails.
      hard_retry_or_fail(remote_mirror, e.message, tries)

      error(e.message)
    rescue StandardError => e
      remote_mirror.hard_fail!(e.message)

      raise e
    end

    private

    def normalized_url(url)
      strong_memoize(:normalized_url) do
        CGI.unescape(Gitlab::UrlSanitizer.sanitize(url))
      end
    end
    def update_mirror(remote_mirror)
      remote_mirror.update_start!

      # LFS objects must be sent first, or the push has dangling pointers
      lfs_status = send_lfs_objects!(remote_mirror)

      response = remote_mirror.update_repository
      failed, failure_message = failure_status(lfs_status, response, remote_mirror)

      # When the issue https://gitlab.com/gitlab-org/gitlab/-/issues/349262 is closed,
      # we can move this block within failure_status.
      if failed
        remote_mirror.mark_as_failed!(failure_message)
      else
        remote_mirror.update_finish!
      end
    end
    def failure_status(lfs_status, response, remote_mirror)
      message = ''
      failed = false
      lfs_sync_failed = false

      if lfs_status&.dig(:status) == :error
        lfs_sync_failed = true
        message += "Error synchronizing LFS files:"
        message += "\n\n#{lfs_status[:message]}\n\n"

        failed = Feature.enabled?(:remote_mirror_fail_on_lfs, project)
      end

      if response.divergent_refs.any?
        message += "Some refs have diverged and have not been updated on the remote:"
        message += "\n\n#{response.divergent_refs.join("\n")}"
        failed = true
      end

      if message.present?
        Gitlab::AppJsonLogger.info(message: "Error syncing remote mirror",
                                   project_id: project.id,
                                   project_path: project.full_path,
                                   remote_mirror_id: remote_mirror.id,
                                   lfs_sync_failed: lfs_sync_failed,
                                   divergent_ref_list: response.divergent_refs)
      end

      [failed, message]
    end
    def send_lfs_objects!(remote_mirror)
      return unless project.lfs_enabled?

      # TODO: Support LFS sync over SSH
      # https://gitlab.com/gitlab-org/gitlab/-/issues/249587
      return unless remote_mirror.url =~ %r{\Ahttps?://}i
      return unless remote_mirror.password_auth?

      Lfs::PushService.new(
        project,
        current_user,
        url: remote_mirror.bare_url,
        credentials: remote_mirror.credentials
      ).execute
    end
    def hard_retry_or_fail(mirror, message, tries)
      if tries < MAX_TRIES
        mirror.hard_retry!(message)
      else
        # It's not likely we'll be able to recover from this ourselves, so we
        # notify the users of the problem and don't trigger any Sidekiq retries.
        # Instead, we'll wait for the next change to try the push again, or
        # until a user manually retries.
        mirror.hard_fail!(message)
      end
    end
  end
end