mirror of https://github.com/puma/puma.git synced 2022-11-09 13:48:40 -05:00
puma--puma/test/rackup
Richard Schneeman 184e1510a9 [close #1802] Close listeners on SIGTERM
Currently, when a SIGTERM is sent to a puma cluster, the signal is trapped and then sent to all children. The parent waits for the children to exit and then exits itself. The socket that accepts connections is only closed when the parent process calls `exit 0`. The problem with this flow is that there is a period of time when no child processes are available to work on an incoming connection, yet the socket is still open, so clients can connect to it. When this happens, the client connects, but the connection is closed with no response. The desired behavior is for the connection from the client to be rejected instead. This allows the client to reconnect or, if there is a load balancer between the client and the puma server, allows the request to be routed to another node.

This PR fixes the existing behavior by manually closing the socket when SIGTERM is received, before shutting down the worker (child) processes. Once the socket is closed, any incoming requests will fail to connect and be rejected, which is our desired behavior. Existing requests that are in flight can still respond.
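
To make the difference concrete, here is a minimal sketch using only Ruby's standard `socket` library. It is not Puma's code, and the variable names are illustrative assumptions. It shows that an open listener lets a client connect even when nothing calls `accept` (the kernel queues the connection), whereas a closed listener causes the connection attempt to be refused.

```ruby
require "socket"

# Open a listener on an OS-chosen port (port 0) on the loopback interface.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Nobody calls `accept`, yet the connect succeeds because the kernel
# queues the connection in the listen backlog. This mirrors the window
# in which no worker is left to serve the request.
client = TCPSocket.new("127.0.0.1", port)
connected_while_open = !client.closed?
client.close

# Closing the listener is the step this change performs on SIGTERM.
server.close

# With the listener closed, a new connection attempt is rejected.
refused_after_close =
  begin
    TCPSocket.new("127.0.0.1", port).close
    false
  rescue Errno::ECONNREFUSED
    true
  end

puts "connected while open: #{connected_while_open}"
puts "refused after close:  #{refused_after_close}"
```

A refused connection is visible to the client (or load balancer) immediately, which is what makes a retry or re-route possible.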

 ## Test


This behavior is quite difficult to test; you'll notice that the test is far longer than the code change. In this test we send an initial request to an endpoint that sleeps for 1 second. We then signal to other threads that they can continue. We send the parent process a SIGTERM while simultaneously sending other requests. Some of these will arrive after the SIGTERM is received by the server. When that happens, we want none of the requests to get an `ECONNRESET` error, which would indicate that the request was accepted but then closed. Instead we want `ECONNREFUSED`.

I ran this test in a loop for a few hours and it passes with my patch. It fails immediately if you remove the call to close the listeners.

```
$ while m test/test_integration.rb:235; do :; done
```
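
The way each request's outcome is classified can be sketched as follows. This is a simplified illustration, not the actual integration test: `probe` is a hypothetical helper, and a plain `TCPServer` stands in for the puma cluster. The exact error class a client sees may vary by platform.

```ruby
require "socket"

# Hypothetical helper: connect, send a minimal request, and report how
# the attempt ended.
def probe(host, port)
  socket = TCPSocket.new(host, port)
  socket.write("GET / HTTP/1.0\r\n\r\n")
  socket.read.to_s.empty? ? :closed_without_response : :ok
rescue Errno::ECONNREFUSED
  :refused # listener already closed: the desired outcome after SIGTERM
rescue Errno::ECONNRESET, Errno::EPIPE
  :reset   # connection was taken and then dropped: the bug
ensure
  socket&.close
end

# Stand-in server (an assumption for this sketch) that answers one request.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
responder = Thread.new do
  conn = server.accept
  conn.readpartial(1024) # consume the request before replying
  conn.write("HTTP/1.0 200 OK\r\n\r\nhello")
  conn.close
end

before_close = probe("127.0.0.1", port)
responder.join

server.close # what the patch does when SIGTERM arrives
after_close = probe("127.0.0.1", port)

puts "before close: #{before_close}"
puts "after close:  #{after_close}"
```

In these terms, the test passes only if every request sent around the SIGTERM ends as `:ok` or `:refused`, never as `:reset` or `:closed_without_response`.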

 ## Considerations

This PR only fixes the problem for "cluster" (i.e. multi-worker) mode. When trying to reproduce the test in single mode (removing the `-w 2` config), it already passes. This leads us to believe that either the bug does not exist in single mode or, at the very least, reproducing the bug via a test in single mode requires a different approach.

Co-authored-by: Danny Fallon <Danny.fallon.ie+github@gmail.com>
Co-authored-by:  Richard Schneeman <richard.schneeman+foo@gmail.com>
2019-05-30 15:16:51 -05:00

| File | Last commit | Date |
| --- | --- | --- |
| 1second.ru | [close #1802] Close listeners on SIGTERM | 2019-05-30 15:16:51 -05:00 |
| 10seconds.ru | Add integration test | 2018-12-10 16:49:29 +03:00 |
| hello-bind.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hello-delay.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hello-env.ru | make test_helper no longer be loaded as a test (#1283) | 2017-05-12 12:16:55 -07:00 |
| hello-map.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hello-post.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hello-stuck-ci.ru | Update test skips, use next_port | 2019-02-20 14:15:55 -06:00 |
| hello-stuck.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hello-tcp.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hello.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hijack.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| hijack2.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| lobster.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |
| slow.ru | sort configs/rackups/tests (#1268) | 2017-04-11 14:08:18 -07:00 |