Prune web hook logs older than 90 days

This adds a recurring Sidekiq job that removes up to 50 000 old web hook
logs per hour, if they are older than 90 days. This will prevent the
web_hook_logs table from growing indefinitely.

Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/46120
This commit is contained in:
Yorick Peterse 2018-06-26 18:24:42 +02:00
parent f25cdea64d
commit 75316348c5
No known key found for this signature in database
GPG key ID: EDD30D2BEB691AC9
6 changed files with 68 additions and 6 deletions

View file

@ -20,6 +20,7 @@
- cronjob:ci_archive_traces_cron - cronjob:ci_archive_traces_cron
- cronjob:trending_projects - cronjob:trending_projects
- cronjob:issue_due_scheduler - cronjob:issue_due_scheduler
- cronjob:prune_web_hook_logs
- gcp_cluster:cluster_install_app - gcp_cluster:cluster_install_app
- gcp_cluster:cluster_provision - gcp_cluster:cluster_provision

View file

@ -0,0 +1,26 @@
# frozen_string_literal: true
# Worker that deletes a fixed number of outdated rows from the "web_hook_logs"
# table.
class PruneWebHookLogsWorker
include ApplicationWorker
include CronjobQueue
# The maximum number of rows to remove in a single job.
DELETE_LIMIT = 50_000
def perform
# MySQL doesn't allow "DELETE FROM ... WHERE id IN ( ... )" if the inner
# query refers to the same table. To work around this we wrap the IN body in
# another sub query.
WebHookLog
.where(
'id IN (SELECT id FROM (?) ids_to_remove)',
WebHookLog
.select(:id)
.where('created_at < ?', 90.days.ago.beginning_of_day)
.limit(DELETE_LIMIT)
)
.delete_all
end
end

View file

@ -0,0 +1,5 @@
---
title: Prune web hook logs older than 90 days
merge_request:
author:
type: added

View file

@ -338,6 +338,10 @@ Settings.cron_jobs['issue_due_scheduler_worker'] ||= Settingslogic.new({})
Settings.cron_jobs['issue_due_scheduler_worker']['cron'] ||= '50 00 * * *' Settings.cron_jobs['issue_due_scheduler_worker']['cron'] ||= '50 00 * * *'
Settings.cron_jobs['issue_due_scheduler_worker']['job_class'] = 'IssueDueSchedulerWorker' Settings.cron_jobs['issue_due_scheduler_worker']['job_class'] = 'IssueDueSchedulerWorker'
Settings.cron_jobs['prune_web_hook_logs_worker'] ||= Settingslogic.new({})
Settings.cron_jobs['prune_web_hook_logs_worker']['cron'] ||= '0 */1 * * *'
Settings.cron_jobs['prune_web_hook_logs_worker']['job_class'] = 'PruneWebHookLogsWorker'
# #
# Sidekiq # Sidekiq
# #

View file

@ -6,6 +6,10 @@ Starting from GitLab 8.5:
- the `project.ssh_url` key is deprecated in favor of the `project.git_ssh_url` key - the `project.ssh_url` key is deprecated in favor of the `project.git_ssh_url` key
- the `project.http_url` key is deprecated in favor of the `project.git_http_url` key - the `project.http_url` key is deprecated in favor of the `project.git_http_url` key
>**Note:**
Starting from GitLab 11.1, the logs of web hooks are automatically removed after
one month.
Project webhooks allow you to trigger a URL if for example new code is pushed or Project webhooks allow you to trigger a URL if for example new code is pushed or
a new issue is created. You can configure webhooks to listen for specific events a new issue is created. You can configure webhooks to listen for specific events
like pushes, issues or merge requests. GitLab will send a POST request with data like pushes, issues or merge requests. GitLab will send a POST request with data

View file

@ -0,0 +1,22 @@
require 'spec_helper'
describe PruneWebHookLogsWorker do
describe '#perform' do
before do
hook = create(:project_hook)
5.times do
create(:web_hook_log, web_hook: hook, created_at: 5.months.ago)
end
create(:web_hook_log, web_hook: hook, response_status: '404')
end
it 'removes all web hook logs older than one month' do
described_class.new.perform
expect(WebHookLog.count).to eq(1)
expect(WebHookLog.first.response_status).to eq('404')
end
end
end