mperham--sidekiq/lib/sidekiq/manager.rb

# frozen_string_literal: true
require 'sidekiq/util'
require 'sidekiq/processor'
require 'sidekiq/fetch'
require 'thread'
require 'set'

module Sidekiq

  ##
  # The Manager is the central coordination point in Sidekiq, controlling
  # the lifecycle of the Processors.
  #
  # Tasks:
  #
  # 1. start: Spin up Processors.
  # 2. processor_died: Handle job failure, throw away Processor, create new one.
  # 3. quiet: shut down idle Processors.
  # 4. stop: hard stop the Processors by deadline.
  #
  # Note that only the last task requires its own Thread since it has to monitor
  # the shutdown process.  The other tasks are performed by other threads.
  #
  class Manager
    include Util

    attr_reader :workers
    attr_reader :options

    def initialize(options={})
      logger.debug { options.inspect }
      @options = options
      @count = options[:concurrency] || 10
      raise ArgumentError, "Concurrency of #{@count} is not supported" if @count < 1

      @done = false
      @workers = Set.new
      @count.times do
        @workers << Processor.new(self)
      end
      @plock = Mutex.new
    end

    def start
      @workers.each do |x|
        x.start
      end
    end

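    # Mark the Manager as done and ask every Processor to terminate.
    # Safe to call more than once; later calls return immediately.
    def quiet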
      return if @done
      @done = true

      logger.info { "Terminating quiet workers" }
      @workers.each { |x| x.terminate }
      fire_event(:quiet, reverse: true)
    end

    # hack for quicker development / testing environment #2774
    PAUSE_TIME = STDOUT.tty? ? 0.1 : 0.5

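    # Quiet the Processors, then wait until +deadline+ (a Time) for them to
    # finish. Any Processor still busy at the deadline goes to hard_shutdown.
    def stop(deadline)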
      quiet
      fire_event(:shutdown, reverse: true)

      # some of the shutdown events can be async,
      # we don't have any way to know when they're done but
      # give them a little time to take effect
      sleep PAUSE_TIME
      return if @workers.empty?

      logger.info { "Pausing to allow workers to finish..." }
      remaining = deadline - Time.now
      while remaining > PAUSE_TIME
        return if @workers.empty?
        sleep PAUSE_TIME
        remaining = deadline - Time.now
      end
      return if @workers.empty?

      hard_shutdown
    end

    # Callback: the given Processor has stopped, so stop tracking it.
    def processor_stopped(processor)
      @plock.synchronize do
        @workers.delete(processor)
      end
    end

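    # Callback: the given Processor died. Drop it and, unless the Manager is
    # already shutting down, start a replacement so concurrency stays constant.
    def processor_died(processor, reason)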
      @plock.synchronize do
        @workers.delete(processor)
        unless @done
          p = Processor.new(self)
          @workers << p
          p.start
        end
      end
    end

    def stopped?
      @done
    end

    private

    def hard_shutdown
      # We've reached the timeout and we still have busy workers.
      # They must die but their jobs shall live on.
      cleanup = nil
      @plock.synchronize do
        cleanup = @workers.dup
      end

      if cleanup.size > 0
        jobs = cleanup.map {|p| p.job }.compact

        logger.warn { "Terminating #{cleanup.size} busy worker threads" }
        logger.warn { "Work still in progress #{jobs.inspect}" }

        # Re-enqueue unfinished jobs
        # NOTE: You may notice that we may push a job back to redis before
        # the worker thread is terminated. This is ok because Sidekiq's
        # contract says that jobs are run AT LEAST once. Process termination
        # is delayed until we're certain the jobs are back in Redis because
        # it is worse to lose a job than to run it twice.
        strategy = (@options[:fetch] || Sidekiq::BasicFetch)
        strategy.bulk_requeue(jobs, @options)
      end

      cleanup.each do |processor|
        processor.kill
      end
    end

  end
end
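
# The deadline-based shutdown in Manager#stop can be hard to follow inside the
# full class, so here is a minimal standalone sketch of the same pattern. It
# is only an illustration: MiniManager, PAUSE, worker_stopped and killed are
# hypothetical names, not part of Sidekiq, and "killing" is simulated by
# recording which workers were left over when the deadline passed.

```ruby
require 'set'

# Hypothetical stand-in for Sidekiq::Manager; not the library's API.
class MiniManager
  PAUSE = 0.01 # polling interval in seconds (illustrative value)

  attr_reader :killed

  def initialize(names)
    @lock = Mutex.new
    @workers = Set.new(names)
    @killed = []
  end

  # Called when a worker finishes on its own.
  def worker_stopped(name)
    @lock.synchronize { @workers.delete(name) }
  end

  # Poll until every worker has stopped or the deadline (a Time) passes.
  # Returns :clean if all workers stopped in time, :forced otherwise.
  def stop(deadline)
    while deadline - Time.now > PAUSE
      return :clean if idle?
      sleep PAUSE
    end
    return :clean if idle?

    # Deadline reached: snapshot the set under the lock, then record the
    # stragglers that a real manager would requeue and kill.
    leftovers = @lock.synchronize { @workers.dup }
    @killed = leftovers.to_a
    :forced
  end

  private

  def idle?
    @lock.synchronize { @workers.empty? }
  end
end

# Worker 'b' never reports back, so the deadline expires.
manager = MiniManager.new(%w[a b])
Thread.new { manager.worker_stopped('a') }.join
result = manager.stop(Time.now + 0.05)

# Every worker has already stopped, so stop returns on its first check.
m2 = MiniManager.new(%w[x])
m2.worker_stopped('x')
clean = m2.stop(Time.now + 0.05)
```

# In this sketch the emptiness check also takes the lock, which is a design
# choice of the example rather than something the Manager above does.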