mirror of https://github.com/mperham/sidekiq.git (synced 2022-11-09 13:52:34 -05:00)
Better scheduling for large clusters, fixes #3889
Today we add 50% to the sleep time so that processes cluster around the target time:

50% --> target <-- 150%

This works well for small clusters, e.g. less than 10 processes. The problem is that, for large clusters, the processes will never sleep less than 50% of (process_count * poll average) which breaks the average and delays job scheduling for several minutes.

Instead, beyond 10 processes, don't add that 50% buffer. Allow the processes to sleep anywhere within the timespan:

0% --> target <-- 200%

With many processes, the average sleep within a 200% time period should work out close enough to 100%.
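The two ranges described in the message can be checked numerically. The following is a minimal sketch, not Sidekiq code: the helper names `buffered_interval` and `full_range_interval` and the sample `target` value are assumptions made for illustration; only the two formulas are taken from the commit.

```ruby
# Hypothetical helpers; only the two formulas come from the commit itself.

# Small clusters: sleep between 50% and 150% of the target.
def buffered_interval(target, rng)
  target * rng.rand + target.to_f / 2
end

# Large clusters: sleep anywhere between 0% and 200% of the target.
def full_range_interval(target, rng)
  target * rng.rand * 2
end

rng = Random.new(42)
target = 100.0 # assumed example: 20 processes * 5 second poll average

buffered = Array.new(10_000) { buffered_interval(target, rng) }
full = Array.new(10_000) { full_range_interval(target, rng) }

[["buffered", buffered], ["full range", full]].each do |label, samples|
  mean = samples.sum / samples.size
  puts format("%-10s min=%6.1f max=%6.1f mean=%6.1f", label, samples.min, samples.max, mean)
end
```

Both strategies average out near the target, but only the full-range one can produce a sleep shorter than half the target, which is the property the message relies on for large clusters.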
parent 2483e9d56d
commit 8a589d68fe
1 changed file with 32 additions and 3 deletions
@@ -97,9 +97,34 @@ module Sidekiq
         sleep 5
       end
 
-      # Calculates a random interval that is ±50% the desired average.
       def random_poll_interval
-        poll_interval_average * rand + poll_interval_average.to_f / 2
+        # We want one Sidekiq process to schedule jobs every N seconds. We have M processes
+        # and **don't** want to coordinate.
+        #
+        # So in N*M second timespan, we want each process to schedule once. The basic loop is:
+        #
+        # * sleep # a random amount within that N*M timespan
+        # * wake up, schedule
+        #
+        # There are pathological edge cases:
+        #
+        # Imagine a set of 4 processes, scheduling every 5 seconds, so N*M = 20. Each process
+        # decides to randomly sleep 18 seconds, now we've failed to meet that 5 second average.
+        # Thankfully each schedule cycle will sleep randomly so the next iteration could see each
+        # process sleep for 1 second, undercutting our average.
+        #
+        # So below 10 processes, we special case and ensure the processes sleep closer to the average.
+        # As we run more processes, the scheduling interval average should approach the desired
+        # amount.
+        #
+        if process_count < 10
+          # For small clusters, # calculates a random interval that is ±50% the desired average.
+          poll_interval_average * rand + poll_interval_average.to_f / 2
+        else
+          # With 10+ processes, we should have enough randomness to get decent polling
+          # across the entire timespan
+          poll_interval_average * rand * 2
+        end
       end
 
       # We do our best to tune the poll interval to the size of the active Sidekiq
@@ -123,9 +148,13 @@ module Sidekiq
       # This minimizes a single point of failure by dispersing check-ins but without taxing
       # Redis if you run many Sidekiq processes.
       def scaled_poll_interval
+        process_count * Sidekiq.options[:average_scheduled_poll_interval]
+      end
+
+      def process_count
         pcount = Sidekiq::ProcessSet.new.size
         pcount = 1 if pcount == 0
-        pcount * Sidekiq.options[:average_scheduled_poll_interval]
+        pcount
       end
 
       def initial_wait
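The comment in `random_poll_interval` argues that, with enough processes, uncoordinated random sleeps still yield roughly one scheduling pass every N seconds across the cluster. A rough simulation can illustrate that claim. This is a sketch under stated assumptions: `simulate_wakeups` is a hypothetical helper (not part of Sidekiq), processes are modelled as independent timelines that all start at time zero, and the cluster size, poll average, and horizon are arbitrary example values.

```ruby
# Sketch only: simulate_wakeups is a hypothetical helper, not Sidekiq code.
# Each simulated process repeats "sleep a random interval, wake up, schedule"
# and records the time of every wake-up until the horizon is reached.
def simulate_wakeups(process_count, average, horizon, rng)
  target = process_count * average # the N*M timespan from the comment
  wakeups = []
  process_count.times do
    clock = 0.0
    loop do
      interval =
        if process_count < 10
          target * rng.rand + target.to_f / 2 # 50%..150% of the target
        else
          target * rng.rand * 2               # 0%..200% of the target
        end
      clock += interval
      break if clock > horizon
      wakeups << clock
    end
  end
  wakeups.sort
end

rng = Random.new(1)
# Assumed example: M = 20 processes, N = 5 seconds, 100,000 simulated seconds.
wakeups = simulate_wakeups(20, 5, 100_000, rng)
gaps = wakeups.each_cons(2).map { |earlier, later| later - earlier }
mean_gap = gaps.sum / gaps.size

puts format("wake-ups: %d, mean gap between scheduling passes: %.2fs", wakeups.size, mean_gap)
```

In this model the mean gap between consecutive wake-ups anywhere in the cluster should land close to N, even though no process coordinates with any other. Individual gaps still vary widely, which is the trade-off the commit accepts.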