Port cleanup tasks to use Gitaly
Rake tasks cleaning up the Git storage were still using direct disk
access, which won't work if the disks aren't attached. To mitigate
this, a migration issue was created.
To port gitlab:cleanup:dirs and gitlab:cleanup:repos, a new RPC,
ListDirectories, was required. This was implemented in Gitaly, through
https://gitlab.com/gitlab-org/gitaly/merge_requests/868.
To be able to use the new RPC, the Gitaly server was bumped to v0.120.
This RPC will not use feature gates: it doesn't scale on .com, so there
is no way to test it at scale. Furthermore, we _know_ it doesn't scale,
but this might be a useful task for smaller instances.
Lastly, the tests are slightly updated to also work when the disk isn't
attached. Even though this is not planned, it was very little effort and
thus I applied the boy scout rule.
Closes https://gitlab.com/gitlab-org/gitaly/issues/954
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/40529
2018-09-07 05:16:34 -04:00
|
|
|
# frozen_string_literal: true
|
|
|
|
require 'set'
|
|
|
|
|
2012-12-24 23:14:05 -05:00
|
|
|
namespace :gitlab do
|
|
|
|
namespace :cleanup do
|
2015-06-23 10:52:40 -04:00
|
|
|
desc "GitLab | Cleanup | Clean namespaces"
|
2018-01-24 03:12:33 -05:00
|
|
|
task dirs: :gitlab_environment do
|
2018-09-07 05:16:34 -04:00
|
|
|
namespaces = Set.new(Namespace.pluck(:path))
|
2018-11-20 06:48:18 -05:00
|
|
|
namespaces << Storage::HashedProject::REPOSITORY_PATH_PREFIX
|
2012-12-24 23:14:05 -05:00
|
|
|
|
2018-09-07 05:16:34 -04:00
|
|
|
Gitaly::Server.all.each do |server|
|
|
|
|
all_dirs = Gitlab::GitalyClient::StorageService
|
|
|
|
.new(server.storage)
|
|
|
|
.list_directories(depth: 0)
|
|
|
|
.reject { |dir| dir.ends_with?('.git') || namespaces.include?(File.basename(dir)) }
|
2012-12-24 23:14:05 -05:00
|
|
|
|
2016-06-22 17:04:51 -04:00
|
|
|
puts "Looking for directories to remove... "
|
|
|
|
all_dirs.each do |dir_path|
|
2018-07-26 17:23:33 -04:00
|
|
|
if remove?
|
2018-09-07 05:16:34 -04:00
|
|
|
begin
|
|
|
|
Gitlab::GitalyClient::NamespaceService.new(server.storage)
|
|
|
|
.remove(dir_path)
|
|
|
|
|
|
|
|
puts "Removed...#{dir_path}"
|
|
|
|
rescue StandardError => e
|
|
|
|
puts "Cannot remove #{dir_path}: #{e.message}".color(:red)
|
2016-06-22 17:04:51 -04:00
|
|
|
end
|
2012-12-24 23:14:05 -05:00
|
|
|
else
|
2016-06-22 17:04:51 -04:00
|
|
|
puts "Can be removed: #{dir_path}".color(:red)
|
2012-12-24 23:14:05 -05:00
|
|
|
end
|
|
|
|
end
|
|
|
|
end
|
|
|
|
|
2018-07-26 17:23:33 -04:00
|
|
|
unless remove?
|
2016-06-01 18:37:15 -04:00
|
|
|
puts "To clean up these directories run this command with REMOVE=true".color(:yellow)
|
2012-12-24 23:14:05 -05:00
|
|
|
end
|
|
|
|
end
|
|
|
|
|
2015-06-23 10:52:40 -04:00
|
|
|
desc "GitLab | Cleanup | Clean repositories"
|
2018-01-24 03:12:33 -05:00
|
|
|
task repos: :gitlab_environment do
|
2015-09-15 10:10:29 -04:00
|
|
|
move_suffix = "+orphaned+#{Time.now.to_i}"
|
2018-06-05 11:51:14 -04:00
|
|
|
|
2018-09-07 05:16:34 -04:00
|
|
|
Gitaly::Server.all.each do |server|
|
|
|
|
Gitlab::GitalyClient::StorageService
|
|
|
|
.new(server.storage)
|
|
|
|
.list_directories
|
|
|
|
.each do |path|
|
|
|
|
repo_with_namespace = path.chomp('.git').chomp('.wiki')
|
|
|
|
|
|
|
|
# TODO ignoring hashed repositories for now. But revisit to fully support
|
|
|
|
# possible orphaned hashed repos
|
2018-11-20 06:48:18 -05:00
|
|
|
next if repo_with_namespace.start_with?(Storage::HashedProject::REPOSITORY_PATH_PREFIX)
|
2018-09-07 05:16:34 -04:00
|
|
|
next if Project.find_by_full_path(repo_with_namespace)
|
2017-11-21 15:26:53 -05:00
|
|
|
|
2018-09-07 05:16:34 -04:00
|
|
|
new_path = path + move_suffix
|
|
|
|
puts path.inspect + ' -> ' + new_path.inspect
|
2017-11-14 04:02:39 -05:00
|
|
|
|
2018-09-07 05:16:34 -04:00
|
|
|
begin
|
|
|
|
Gitlab::GitalyClient::NamespaceService
|
|
|
|
.new(server.storage)
|
|
|
|
.rename(path, new_path)
|
|
|
|
rescue StandardError => e
|
2019-03-07 18:37:16 -05:00
|
|
|
puts "Error occurred while moving the repository: #{e.message}".color(:red)
|
2016-06-22 17:04:51 -04:00
|
|
|
end
|
2012-12-24 23:14:05 -05:00
|
|
|
end
|
|
|
|
end
|
|
|
|
end
|
2014-06-26 09:38:11 -04:00
|
|
|
|
2015-06-23 10:52:40 -04:00
|
|
|
desc "GitLab | Cleanup | Block users that have been removed in LDAP"
|
2018-01-24 03:12:33 -05:00
|
|
|
task block_removed_ldap_users: :gitlab_environment do
|
2014-06-26 09:38:11 -04:00
|
|
|
warn_user_is_not_gitlab
|
|
|
|
block_flag = ENV['BLOCK']
|
|
|
|
|
2015-02-16 04:00:25 -05:00
|
|
|
User.find_each do |user|
|
|
|
|
next unless user.ldap_user?
|
2017-11-14 04:02:39 -05:00
|
|
|
|
2015-02-16 04:00:25 -05:00
|
|
|
print "#{user.name} (#{user.ldap_identity.extern_uid}) ..."
|
2018-01-11 11:34:01 -05:00
|
|
|
|
2018-02-23 07:10:39 -05:00
|
|
|
if Gitlab::Auth::LDAP::Access.allowed?(user)
|
2016-06-01 18:37:15 -04:00
|
|
|
puts " [OK]".color(:green)
|
2014-06-26 09:38:11 -04:00
|
|
|
else
|
|
|
|
if block_flag
|
2015-02-16 04:00:25 -05:00
|
|
|
user.block! unless user.blocked?
|
2016-06-01 18:37:15 -04:00
|
|
|
puts " [BLOCKED]".color(:red)
|
2014-06-26 09:38:11 -04:00
|
|
|
else
|
2016-06-01 18:37:15 -04:00
|
|
|
puts " [NOT IN LDAP]".color(:yellow)
|
2014-06-26 09:38:11 -04:00
|
|
|
end
|
|
|
|
end
|
|
|
|
end
|
|
|
|
|
|
|
|
unless block_flag
|
2016-06-01 18:37:15 -04:00
|
|
|
puts "To block these users run this command with BLOCK=true".color(:yellow)
|
2014-06-26 09:38:11 -04:00
|
|
|
end
|
|
|
|
end
|
2018-07-26 17:23:33 -04:00
|
|
|
|
|
|
|
desc "GitLab | Cleanup | Clean orphaned project uploads"
|
|
|
|
task project_uploads: :gitlab_environment do
|
|
|
|
warn_user_is_not_gitlab
|
|
|
|
|
|
|
|
cleaner = Gitlab::Cleanup::ProjectUploads.new(logger: logger)
|
|
|
|
cleaner.run!(dry_run: dry_run?)
|
|
|
|
|
|
|
|
if dry_run?
|
|
|
|
logger.info "To clean up these files run this command with DRY_RUN=false".color(:yellow)
|
|
|
|
end
|
|
|
|
end
|
|
|
|
|
2018-07-30 14:14:38 -04:00
|
|
|
desc 'GitLab | Cleanup | Clean orphan remote upload files that do not exist in the db'
|
|
|
|
task remote_upload_files: :environment do
|
|
|
|
cleaner = Gitlab::Cleanup::RemoteUploads.new(logger: logger)
|
|
|
|
cleaner.run!(dry_run: dry_run?)
|
|
|
|
|
|
|
|
if dry_run?
|
|
|
|
logger.info "To clean up these files run this command with DRY_RUN=false".color(:yellow)
|
|
|
|
end
|
|
|
|
end
|
|
|
|
|
2019-06-13 17:07:59 -04:00
|
|
|
desc 'GitLab | Cleanup | Clean orphan job artifact files'
|
|
|
|
task orphan_job_artifact_files: :gitlab_environment do
|
|
|
|
warn_user_is_not_gitlab
|
|
|
|
|
|
|
|
cleaner = Gitlab::Cleanup::OrphanJobArtifactFiles.new(limit: limit, dry_run: dry_run?, niceness: niceness, logger: logger)
|
|
|
|
cleaner.run!
|
|
|
|
|
|
|
|
if dry_run?
|
|
|
|
logger.info "To clean up these files run this command with DRY_RUN=false".color(:yellow)
|
|
|
|
end
|
|
|
|
end
|
|
|
|
|
2018-07-26 17:23:33 -04:00
|
|
|
def remove?
|
|
|
|
ENV['REMOVE'] == 'true'
|
|
|
|
end
|
|
|
|
|
|
|
|
def dry_run?
|
|
|
|
ENV['DRY_RUN'] != 'false'
|
|
|
|
end
|
|
|
|
|
2019-06-13 17:07:59 -04:00
|
|
|
def debug?
|
|
|
|
ENV['DEBUG'].present?
|
|
|
|
end
|
|
|
|
|
|
|
|
def limit
|
|
|
|
ENV['LIMIT']&.to_i
|
|
|
|
end
|
|
|
|
|
|
|
|
def niceness
|
|
|
|
ENV['NICENESS'].presence
|
|
|
|
end
|
|
|
|
|
2019-07-10 15:26:47 -04:00
|
|
|
# rubocop:disable Gitlab/RailsLogger
|
2018-07-26 17:23:33 -04:00
|
|
|
def logger
|
|
|
|
return @logger if defined?(@logger)
|
|
|
|
|
|
|
|
@logger = if Rails.env.development? || Rails.env.production?
|
|
|
|
Logger.new(STDOUT).tap do |stdout_logger|
|
|
|
|
stdout_logger.extend(ActiveSupport::Logger.broadcast(Rails.logger))
|
2019-06-13 17:07:59 -04:00
|
|
|
stdout_logger.level = debug? ? Logger::DEBUG : Logger::INFO
|
2018-07-26 17:23:33 -04:00
|
|
|
end
|
|
|
|
else
|
|
|
|
Rails.logger
|
|
|
|
end
|
|
|
|
end
|
2019-07-10 15:26:47 -04:00
|
|
|
# rubocop:enable Gitlab/RailsLogger
|
2012-12-24 23:14:05 -05:00
|
|
|
end
|
|
|
|
end
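The filtering step in the gitlab:cleanup:dirs task can be tried in isolation from Gitaly. The following is a minimal sketch, not the task itself: `orphaned_dirs` and `HASHED_PREFIX` are hypothetical names, the prefix value is an assumption standing in for Storage::HashedProject::REPOSITORY_PATH_PREFIX, and the directory list is hard-coded where the task would call list_directories(depth: 0). It uses plain Ruby `end_with?` rather than the ActiveSupport `ends_with?` used in the task.

```ruby
require 'set'

# Assumed stand-in for Storage::HashedProject::REPOSITORY_PATH_PREFIX;
# the real value is not shown in this file.
HASHED_PREFIX = '@hashed'

# Returns the directories that are neither bare repositories (*.git)
# nor named after a known namespace path or the hashed-storage prefix.
def orphaned_dirs(listed_dirs, namespace_paths)
  known = Set.new(namespace_paths) << HASHED_PREFIX

  listed_dirs.reject do |dir|
    dir.end_with?('.git') || known.include?(File.basename(dir))
  end
end

# Sample data only; the task would receive this list from Gitaly.
listed = ['group-a', 'stale-group', 'legacy.git', '@hashed']
puts orphaned_dirs(listed, ['group-a']).inspect
```

Keeping the known names in a Set makes each membership check constant time, which matters when an instance has many namespaces.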
|
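The environment helpers near the end of the file have deliberately different defaults: removal is opt-in, while a dry run is opt-out. The sketch below restates them with an injectable `env` argument so the defaults can be checked without touching the real environment. The `env` parameter is an assumption added for illustration; the helpers in the file read ENV directly.

```ruby
# Destructive removal is opt-in: only the exact string 'true' enables it.
def remove?(env = ENV)
  env['REMOVE'] == 'true'
end

# Dry run is opt-out: any value other than the exact string 'false'
# (including an unset variable) keeps the dry run enabled.
def dry_run?(env = ENV)
  env['DRY_RUN'] != 'false'
end

# LIMIT is optional: nil when unset, otherwise parsed as an integer.
def limit(env = ENV)
  env['LIMIT']&.to_i
end

puts remove?({})                       # unset, so nothing is removed
puts dry_run?({})                      # unset, so the run stays dry
puts dry_run?('DRY_RUN' => 'false')    # explicitly disabled
```

With these defaults, a mistyped value such as REMOVE=yes or DRY_RUN=no falls back to the non-destructive behaviour.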