require 'sidekiq/scheduled'
module Sidekiq
module Middleware
module Server
##
# Automatically retry jobs that fail in Sidekiq.
#
# Sidekiq's retry support assumes a typical development lifecycle:
#
#   0. push some code changes with a bug in it
#   1. bug causes message processing to fail, sidekiq's middleware captures
#      the message and pushes it onto a retry queue
#   2. sidekiq retries messages in the retry queue multiple times with
#      an exponential delay, the message continues to fail
#   3. after a few days, a developer deploys a fix.  the message is
#      reprocessed successfully.
#   4. if 3 never happens, sidekiq will eventually give up and throw the
#      message away.  If the worker defines a method called 'retries_exhausted',
#      this will be called before throwing the message away.  If the
#      'retries_exhausted' method throws an exception, it's dropped and logged.
#
# A message looks like:
#
#     { 'class' => 'HardWorker', 'args' => [1, 2, 'foo'] }
#
# The 'retry' option also accepts a number (in place of 'true'):
#
#     { 'class' => 'HardWorker', 'args' => [1, 2, 'foo'], 'retry' => 5 }
#
# The job will be retried this number of times before giving up. (If simply
# 'true', Sidekiq retries 25 times)
#
# We'll add a bit more data to the message to support retries:
#
#   * 'queue' - the queue to use
#   * 'retry_count' - number of times we've retried so far.
#   * 'error_message' - the message from the exception
#   * 'error_class' - the exception class
#   * 'failed_at' - the first time it failed
#   * 'retried_at' - the last time it was retried
#
# We don't store the backtrace as that can add a lot of overhead
# to the message and everyone is using Airbrake, right?
class RetryJobs
  include Sidekiq::Util

  # Retry limit used when a job sets 'retry' => true rather than a number.
  DEFAULT_MAX_RETRY_ATTEMPTS = 25

  # Middleware entry point.  Runs the job (via yield); on failure, records
  # retry metadata on +msg+ and schedules it in the 'retry' sorted set, or
  # invokes the retries-exhausted handling once the attempt limit is hit.
  # The original error is always re-raised so the processor can report it.
  def call(worker, msg, queue)
    yield
  rescue Sidekiq::Shutdown
    # ignore, will be pushed back onto queue during hard_shutdown
    raise
  rescue Exception => e
    # Deliberately broad rescue: any job failure must be captured so the
    # retry metadata can be recorded.  The error is re-raised below.
    raise e unless msg['retry']
    max_retry_attempts = retry_attempts_from(msg['retry'], DEFAULT_MAX_RETRY_ATTEMPTS)

    msg['queue'] = if msg['retry_queue']
      msg['retry_queue']
    else
      queue
    end
    msg['error_message'] = e.message
    msg['error_class'] = e.class.name
    count = if msg['retry_count']
      msg['retried_at'] = Time.now.utc
      msg['retry_count'] += 1
    else
      msg['failed_at'] = Time.now.utc
      msg['retry_count'] = 0
    end

    if msg['backtrace'] == true
      msg['error_backtrace'] = e.backtrace
    elsif msg['backtrace'] == false
      # do nothing
    elsif msg['backtrace'].to_i != 0
      # store exactly the requested number of backtrace lines
      # (0...n, not 0..n, which would keep n+1 entries)
      msg['error_backtrace'] = e.backtrace[0...msg['backtrace'].to_i]
    end

    if count < max_retry_attempts
      delay = delay_for(worker, count)
      logger.debug { "Failure! Retry #{count} in #{delay} seconds" }
      retry_at = Time.now.to_f + delay
      payload = Sidekiq.dump_json(msg)
      Sidekiq.redis do |conn|
        conn.zadd('retry', retry_at.to_s, payload)
      end
    else
      # Goodbye dear message, you (re)tried your best I'm sure.
      retries_exhausted(worker, msg)
    end

    raise e
  end

  # Called once a job has used up all its retries.  Invokes the worker's
  # deprecated #retries_exhausted method or its sidekiq_retries_exhausted
  # block, when present.  Any error raised by the callback is logged and
  # swallowed so it cannot mask the original job failure.
  def retries_exhausted(worker, msg)
    logger.debug { "Dropping message after hitting the retry maximum: #{msg}" }
    if worker.respond_to?(:retries_exhausted)
      logger.warn { "Defining #{worker.class.name}#retries_exhausted as a method is deprecated, use `sidekiq_retries_exhausted` callback instead http://git.io/Ijju8g" }
      worker.retries_exhausted(*msg['args'])
    elsif worker.sidekiq_retries_exhausted_block?
      worker.sidekiq_retries_exhausted_block.call(*msg['args'])
    end
  rescue Exception => e
    # best-effort callback: never let it escape
    handle_exception(e, "Error calling retries_exhausted")
  end

  # Returns msg_retry when the job specified a numeric retry limit,
  # otherwise the given default.  (Integer, not the removed Fixnum.)
  def retry_attempts_from(msg_retry, default)
    if msg_retry.is_a?(Integer)
      msg_retry
    else
      default
    end
  end

  # Delay before the next attempt: the worker's sidekiq_retry_in block
  # when defined and returning a truthy value, else the default backoff.
  def delay_for(worker, count)
    worker.sidekiq_retry_in_block? && retry_in(worker, count) || seconds_to_delay(count)
  end

  # delayed_job uses the same basic formula
  def seconds_to_delay(count)
    (count ** 4) + 15 + (rand(30) * (count + 1))
  end

  # Asks the worker's sidekiq_retry_in block for a custom delay.
  # Returns nil (so delay_for falls back to the default) if it raises.
  def retry_in(worker, count)
    begin
      worker.sidekiq_retry_in_block.call(count)
    rescue Exception => e
      logger.error { "Failure scheduling retry using the defined `sidekiq_retry_in` in #{worker.class.name}, falling back to default: #{e.message}" }
      nil
    end
  end
end
end
end
end