This specification aims to formalize the Rack protocol. You
can (and should) use Rack::Lint to enforce it.

When you develop middleware, be sure to add a Lint before and
after to catch all mistakes.

= Rack applications

A Rack application is a Ruby object (not a class) that
responds to +call+.
It takes exactly one argument, the *environment*,
and returns a non-frozen Array of exactly three values:
The *status*,
the *headers*,
and the *body*.

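As a minimal sketch of the contract described above, the following lambda acts as a Rack application. The name +hello_app+ and the two-key environment are illustrative assumptions; a real server passes a much fuller environment.

```ruby
# A hypothetical "hello" application: any object responding to #call works.
hello_app = lambda do |env|
  text = "Hello from #{env['PATH_INFO']}\n"
  headers = { 'content-type' => 'text/plain', 'content-length' => text.bytesize.to_s }
  [200, headers, [text]] # status, headers, body
end

# Simulate a server call with a deliberately tiny, illustrative environment.
status, headers, body = hello_app.call({ 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/demo' })
```

The returned Array is unfrozen and has exactly three elements, so a caller can destructure it as shown.
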
== The Environment

The environment must be an unfrozen instance of Hash that includes
CGI-like headers. The Rack application is free to modify the
environment.

The environment is required to include these variables
(adopted from {PEP 333}[https://peps.python.org/pep-0333/]), except when they'd be empty, but see
below.
<tt>REQUEST_METHOD</tt>:: The HTTP request method, such as
                          "GET" or "POST". This cannot ever
                          be an empty string, and so is
                          always required.
<tt>SCRIPT_NAME</tt>:: The initial portion of the request
                       URL's "path" that corresponds to the
                       application object, so that the
                       application knows its virtual
                       "location". This may be an empty
                       string, if the application corresponds
                       to the "root" of the server.
<tt>PATH_INFO</tt>:: The remainder of the request URL's
                     "path", designating the virtual
                     "location" of the request's target
                     within the application. This may be an
                     empty string, if the request URL targets
                     the application root and does not have a
                     trailing slash. This value may be
                     percent-encoded when originating from
                     a URL.
<tt>QUERY_STRING</tt>:: The portion of the request URL that
                        follows the <tt>?</tt>, if any. May be
                        empty, but is always required!
<tt>SERVER_NAME</tt>:: When combined with <tt>SCRIPT_NAME</tt> and
                       <tt>PATH_INFO</tt>, this variable can be
                       used to complete the URL. Note, however,
                       that <tt>HTTP_HOST</tt>, if present,
                       should be used in preference to
                       <tt>SERVER_NAME</tt> for reconstructing
                       the request URL.
                       <tt>SERVER_NAME</tt> can never be an empty
                       string, and so is always required.
<tt>SERVER_PORT</tt>:: An optional +Integer+ which is the port the
                       server is running on. Should be specified if
                       the server is running on a non-standard port.
<tt>SERVER_PROTOCOL</tt>:: A string representing the HTTP version used
                           for the request.
<tt>HTTP_</tt> Variables:: Variables corresponding to the
                           client-supplied HTTP request
                           headers (i.e., variables whose
                           names begin with <tt>HTTP_</tt>). The
                           presence or absence of these
                           variables should correspond with
                           the presence or absence of the
                           appropriate HTTP header in the
                           request. See
                           {RFC3875 section 4.1.18}[https://tools.ietf.org/html/rfc3875#section-4.1.18]
                           for specific behavior.
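
The variables above are enough to rebuild the request URL. The helper below is a sketch only: +request_url+ is a hypothetical name, the default-port handling is an assumption not spelled out in this specification, and +rack.url_scheme+ is the Rack-specific key described later in this section.

```ruby
# Hypothetical helper: rebuild the request URL from a Rack-style env Hash.
def request_url(env)
  scheme = env.fetch('rack.url_scheme', 'http')
  # Prefer HTTP_HOST; fall back to SERVER_NAME plus an optional SERVER_PORT.
  authority = env['HTTP_HOST'] || begin
    port = env['SERVER_PORT']
    default_port = scheme == 'https' ? 443 : 80
    # to_i tolerates either an Integer or a String of digits.
    port && port.to_i != default_port ? "#{env['SERVER_NAME']}:#{port}" : env['SERVER_NAME']
  end
  url = "#{scheme}://#{authority}#{env['SCRIPT_NAME']}#{env['PATH_INFO']}"
  query = env['QUERY_STRING'].to_s
  query.empty? ? url : "#{url}?#{query}"
end

# Illustrative environment; the host name and port are made up.
env = {
  'REQUEST_METHOD' => 'GET', 'SCRIPT_NAME' => '/app', 'PATH_INFO' => '/items',
  'QUERY_STRING' => 'page=2', 'SERVER_NAME' => 'example.test', 'SERVER_PORT' => '8080',
  'SERVER_PROTOCOL' => 'HTTP/1.1', 'rack.url_scheme' => 'http'
}
url = request_url(env)
```
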
In addition to this, the Rack environment must include these
Rack-specific variables:
<tt>rack.url_scheme</tt>:: +http+ or +https+, depending on the
                           request URL.
<tt>rack.input</tt>:: See below, the input stream.
<tt>rack.errors</tt>:: See below, the error stream.
<tt>rack.hijack?</tt>:: See below, if present and true, indicates
                        that the server supports partial hijacking.
<tt>rack.hijack</tt>:: See below, if present, an object responding
                       to +call+ that is used to perform a full
                       hijack.
Additional environment specifications have been approved to
standardize middleware APIs. None of these are required to
be implemented by the server.
<tt>rack.session</tt>:: A hash-like interface for storing
                        request session data.
                        The store must implement:
                        store(key, value)         (aliased as []=);
                        fetch(key, default = nil) (aliased as []);
                        delete(key);
                        clear;
                        to_hash (returning unfrozen Hash instance);
<tt>rack.logger</tt>:: A common object interface for logging messages.
                       The object must implement:
                        info(message, &block)
                        debug(message, &block)
                        warn(message, &block)
                        error(message, &block)
                        fatal(message, &block)
<tt>rack.multipart.buffer_size</tt>:: An Integer hint to the multipart parser as to what chunk size to use for reads and writes.
<tt>rack.multipart.tempfile_factory</tt>:: An object responding to #call with two arguments, the filename and content_type given for the multipart form field, and returning an IO-like object that responds to #<< and optionally #rewind. This factory will be used to instantiate the tempfile for each multipart form file upload field, rather than the default class of Tempfile.
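
To make the <tt>rack.session</tt> interface listed above concrete, here is a sketch of an in-memory store. +MemorySession+ is a hypothetical class, and coercing keys to Strings is an assumption of this example rather than a requirement stated here.

```ruby
# Hypothetical in-memory store exposing the methods listed for rack.session.
class MemorySession
  def initialize
    @data = {}
  end

  def store(key, value)
    @data[key.to_s] = value # String keys are an assumption of this sketch
  end
  alias []= store

  def fetch(key, default = nil)
    @data.fetch(key.to_s, default)
  end
  alias [] fetch

  def delete(key)
    @data.delete(key.to_s)
  end

  def clear
    @data.clear
    self
  end

  # Return a copy so the result is always an unfrozen Hash.
  def to_hash
    @data.dup
  end
end

session = MemorySession.new
session['user_id'] = 42
```

Middleware would typically place such an object in <tt>env['rack.session']</tt> before calling the application.
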
The server or the application can store their own data in the
environment, too. The keys must contain at least one dot,
and should be prefixed uniquely. The prefix <tt>rack.</tt>
is reserved for use with the Rack core distribution and other
accepted specifications and must not be used otherwise.

The <tt>SERVER_PORT</tt> must be an Integer if set.
The <tt>SERVER_NAME</tt> must be a valid authority as defined by RFC7540.
The <tt>HTTP_HOST</tt> must be a valid authority as defined by RFC7540.
The <tt>SERVER_PROTOCOL</tt> must match the regexp <tt>HTTP/\d(\.\d)?</tt>.
If the <tt>HTTP_VERSION</tt> is present, it must equal the <tt>SERVER_PROTOCOL</tt>.
The environment must not contain the keys
<tt>HTTP_CONTENT_TYPE</tt> or <tt>HTTP_CONTENT_LENGTH</tt>
(use the versions without <tt>HTTP_</tt>).
The CGI keys (named without a period) must have String values.
If the string values for CGI keys contain non-ASCII characters,
they should use ASCII-8BIT encoding.
There are the following restrictions:
* <tt>rack.url_scheme</tt> must either be +http+ or +https+.
* There must be a valid input stream in <tt>rack.input</tt>.
* There must be a valid error stream in <tt>rack.errors</tt>.
* There may be a valid hijack callback in <tt>rack.hijack</tt>.
* The <tt>REQUEST_METHOD</tt> must be a valid token.
* The <tt>SCRIPT_NAME</tt>, if non-empty, must start with <tt>/</tt>.
* The <tt>PATH_INFO</tt>, if non-empty, must start with <tt>/</tt>.
* The <tt>CONTENT_LENGTH</tt>, if given, must consist of digits only.
* One of <tt>SCRIPT_NAME</tt> or <tt>PATH_INFO</tt> must be
  set. <tt>PATH_INFO</tt> should be <tt>/</tt> if
  <tt>SCRIPT_NAME</tt> is empty.
  <tt>SCRIPT_NAME</tt> should never be <tt>/</tt>, but instead be empty.
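
A few of the restrictions above can be checked mechanically. The following is a sketch, not a replacement for Rack::Lint: +env_problems+ is a hypothetical helper that covers only a subset of the rules and returns messages instead of raising.

```ruby
# Hypothetical checker for some of the restrictions above; returns a list of
# human-readable problems. Rack::Lint remains the tool to enforce the spec.
def env_problems(env)
  problems = []
  unless %w[http https].include?(env['rack.url_scheme'])
    problems << 'rack.url_scheme must be http or https'
  end
  %w[SCRIPT_NAME PATH_INFO].each do |key|
    value = env[key].to_s
    problems << "#{key} must start with /" unless value.empty? || value.start_with?('/')
  end
  problems << 'SCRIPT_NAME must be empty instead of /' if env['SCRIPT_NAME'] == '/'
  if env.key?('CONTENT_LENGTH') && !env['CONTENT_LENGTH'].to_s.match?(/\A\d+\z/)
    problems << 'CONTENT_LENGTH must consist of digits only'
  end
  unless env['SERVER_PROTOCOL'].to_s.match?(%r{\AHTTP/\d(\.\d)?\z})
    problems << 'SERVER_PROTOCOL must match HTTP/\d(\.\d)?'
  end
  %w[HTTP_CONTENT_TYPE HTTP_CONTENT_LENGTH].each do |key|
    problems << "#{key} must not be present" if env.key?(key)
  end
  problems
end

good_env = {
  'rack.url_scheme' => 'https', 'SCRIPT_NAME' => '', 'PATH_INFO' => '/',
  'CONTENT_LENGTH' => '12', 'SERVER_PROTOCOL' => 'HTTP/2'
}
bad_env = good_env.merge('SCRIPT_NAME' => '/', 'CONTENT_LENGTH' => '12a', 'HTTP_CONTENT_TYPE' => 'text/plain')
```
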
<tt>rack.response_finished</tt>:: An array of callables run by the server after the response has been
                                  processed. This would typically be invoked after sending the response to the client, but it could also be
                                  invoked if an error occurs while generating the response or sending the response; in that case, the error
                                  argument will be a subclass of +Exception+.
                                  The callables are invoked with +env, status, headers, error+ arguments and should not raise any
                                  exceptions. They should be invoked in reverse order of registration.

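The invocation order for <tt>rack.response_finished</tt> can be sketched from the server's side as follows. +run_response_finished+ is a hypothetical helper, and swallowing exceptions from a misbehaving callback is an assumption of this example.

```ruby
# Hypothetical server-side helper: run rack.response_finished callbacks in
# reverse order of registration, passing env, status, headers and error.
def run_response_finished(env, status, headers, error = nil)
  callbacks = env['rack.response_finished'] || []
  callbacks.reverse_each do |callback|
    begin
      callback.call(env, status, headers, error)
    rescue StandardError
      # Callbacks should not raise; this sketch simply moves on if one does.
    end
  end
end

order = []
env = { 'rack.response_finished' => [] }
env['rack.response_finished'] << ->(_env, _status, _headers, _error) { order << :first }
env['rack.response_finished'] << ->(_env, _status, _headers, error) { order << (error ? :failed : :second) }
run_response_finished(env, 200, {}, nil)
```
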
=== The Input Stream

The input stream is an IO-like object which contains the raw HTTP
POST data.
When applicable, its external encoding must be "ASCII-8BIT" and it
must be opened in binary mode, for Ruby 1.9 compatibility.
The input stream must respond to +gets+, +each+, and +read+.
* +gets+ must be called without arguments and return a string,
  or +nil+ on EOF.
* +read+ behaves like IO#read.
  Its signature is <tt>read([length, [buffer]])</tt>.

  If given, +length+ must be a non-negative Integer (>= 0) or +nil+,
  and +buffer+ must be a String and may not be nil.

  If +length+ is given and not nil, then this method reads at most
  +length+ bytes from the input stream.

  If +length+ is not given or nil, then this method reads
  all data until EOF.

  When EOF is reached, this method returns nil if +length+ is given
  and not nil, or "" if +length+ is not given or is nil.

  If +buffer+ is given, then the read data will be placed
  into +buffer+ instead of a newly created String object.
* +each+ must be called without arguments and only yield Strings.
* +close+ can be called on the input stream to indicate that
  any remaining input is not needed.

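The +read+ rules above match the behavior of Ruby's own IO classes, so they can be illustrated with a +StringIO+ standing in for <tt>rack.input</tt>. The request body used here is made up for the example.

```ruby
require 'stringio'

# StringIO stands in for a server-provided rack.input; the payload is made up.
input = StringIO.new('name=rack&lang=ruby'.b)
input.binmode

first  = input.read(4)   # reads at most 4 bytes
buffer = String.new
input.read(5, buffer)    # the next 5 bytes are placed into buffer
rest   = input.read      # no length: reads everything up to EOF

at_eof_with_length    = input.read(1) # nil at EOF when length is given
at_eof_without_length = input.read    # "" at EOF when length is omitted
```
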
=== The Error Stream

The error stream must respond to +puts+, +write+ and +flush+.
* +puts+ must be called with a single argument that responds to +to_s+.
* +write+ must be called with a single argument that is a String.
* +flush+ must be called without arguments and must be called
  in order to make the error appear for sure.
* +close+ must never be called on the error stream.

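A short sketch of an application-side use of the error stream follows. A +StringIO+ stands in for <tt>rack.errors</tt> so the example is self-contained; the messages are invented.

```ruby
require 'stringio'

# StringIO stands in for rack.errors; a real server supplies its own stream.
errors = StringIO.new
env = { 'rack.errors' => errors }

env['rack.errors'].puts(:warning)                 # any object responding to to_s
env['rack.errors'].write("upstream timed out\n")  # write takes a String
env['rack.errors'].flush                          # flush so the output is visible
# close is never called on the error stream.
```
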
=== Hijacking

The hijacking interfaces provide a means for an application to take
control of the HTTP connection. There are two distinct hijack
interfaces: full hijacking where the application takes over the raw
connection, and partial hijacking where the application takes over
just the response body stream. In both cases, the application is
responsible for closing the hijacked stream.

Full hijacking only works with HTTP/1. Partial hijacking is functionally
equivalent to streaming bodies, and is still optionally supported for
backwards compatibility with older Rack versions.

==== Full Hijack

Full hijack is used to completely take over an HTTP/1 connection. It
occurs before any headers are written and causes the server to
ignore any response generated by the application.

It is intended to be used when applications need access to the raw HTTP/1
connection.

If +rack.hijack+ is present in +env+, it must respond to +call+
and return an +IO+ instance which can be used to read and write
to the underlying connection using HTTP/1 semantics and
formatting.

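The sketch below shows one way an application might use full hijack. It rests on several assumptions: +hijacking_app+ is a hypothetical application, the 501 fallback is this example's choice, and a +UNIXSocket+ pair (Unix-like systems only) simulates the client connection that a real server would hand over.

```ruby
require 'socket'

# Hypothetical application that takes over the connection when the server
# offers rack.hijack, and falls back to a normal response otherwise.
hijacking_app = lambda do |env|
  hijack = env['rack.hijack']
  return [501, { 'content-type' => 'text/plain' }, ["hijack unsupported\n"]] unless hijack

  io = hijack.call
  # The application now speaks raw HTTP/1 and must close the IO itself.
  io.write("HTTP/1.1 200 OK\r\ncontent-length: 3\r\nconnection: close\r\n\r\nhi\n")
  io.close
  [200, {}, []] # ignored by the server after a full hijack
end

# Simulated server side: a socket pair stands in for the client connection.
client, server_side = UNIXSocket.pair
hijacking_app.call({ 'rack.hijack' => -> { server_side } })
raw_response = client.read
client.close
```
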
==== Partial Hijack

Partial hijack is used for bi-directional streaming of the request and
response body. It occurs after the status and headers are written by
the server and causes the server to ignore the Body of the response.

It is intended to be used when applications need bi-directional
streaming.

If +rack.hijack?+ is present in +env+ and truthy,
an application may set the special response header +rack.hijack+
to an object that responds to +call+,
accepting a +stream+ argument.

After the response status and headers have been sent, this hijack
callback will be invoked with a +stream+ argument which follows the
same interface as outlined in "Streaming Body". Servers must
ignore the +body+ part of the response tuple when the
+rack.hijack+ response header is present. Using an empty +Array+
instance is recommended.

The special response header +rack.hijack+ must only be set
if the request +env+ has a truthy +rack.hijack?+.
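
As a sketch of the partial hijack flow just described, the application below sets the +rack.hijack+ response header only when the server advertises support. +partial_app+ is hypothetical, and a +StringIO+ simulates the stream a server would pass to the callback.

```ruby
require 'stringio'

# Hypothetical application using partial hijack when the server allows it.
partial_app = lambda do |env|
  headers = { 'content-type' => 'text/plain' }
  if env['rack.hijack?']
    headers['rack.hijack'] = lambda do |stream|
      stream.write("tick\n")
      stream.close # the application closes the hijacked stream
    end
  end
  [200, headers, []] # empty Array body, as recommended
end

# Simulated server: after sending status and headers it invokes the callback.
status, headers, _body = partial_app.call({ 'rack.hijack?' => true })
stream = StringIO.new
callback = headers.delete('rack.hijack') # rack.* headers are not sent to the client
callback.call(stream)
```
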
== The Response

=== The Status

This is an HTTP status. It must be an Integer greater than or equal to
100.

=== The Headers

The headers must be an unfrozen Hash.
The header keys must be Strings.
Special headers starting "rack." are for communicating with the
server, and must not be sent back to the client.
The header must not contain a +Status+ key.
Header keys must conform to RFC7230 token specification, i.e. cannot
contain non-printable ASCII, DQUOTE or "(),/:;<=>?@[\]{}".
Header keys must not contain uppercase ASCII characters (A-Z).
Header values must be either a String instance,
or an Array of String instances,
such that each String instance must not contain characters below 037.

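The header rules above can be approximated with a small checker. This is a sketch under stated assumptions: +header_problems+ is a hypothetical helper, the key pattern is this example's reading of the RFC7230 token rule restricted to lowercase, and "below 037" is read literally as bytes 0x00 through 0x1e (037 is octal).

```ruby
# Hypothetical checker for the header rules above; returns a list of problems.
def header_problems(headers)
  return ['headers must be an unfrozen Hash'] unless headers.is_a?(Hash) && !headers.frozen?

  headers.flat_map do |key, value|
    next ["header key #{key.inspect} must be a String"] unless key.is_a?(String)
    next [] if key.start_with?('rack.') # server-directed headers, skipped here

    problems = []
    problems << 'Status key is not allowed' if key.casecmp?('status')
    # Lowercase RFC7230 token characters only (assumed reading of the rule).
    unless key.match?(/\A[a-z0-9!\#$%&'*+\-.^_`|~]+\z/)
      problems << "#{key}: invalid or uppercase characters in key"
    end
    Array(value).each do |item|
      unless item.is_a?(String) && !item.match?(/[\x00-\x1e]/)
        problems << "#{key}: values must be Strings without characters below 037"
      end
    end
    problems
  end
end

ok = header_problems({ 'content-type' => 'text/plain', 'set-cookie' => ['a=1', 'b=2'] })
```
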
=== The content-type

There must not be a <tt>content-type</tt> header key when the +Status+ is 1xx,
204, or 304.

=== The content-length

There must not be a <tt>content-length</tt> header key when the
+Status+ is 1xx, 204, or 304.

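The two status rules above share one condition, which a sketch can capture in a single predicate. Both helper names are hypothetical.

```ruby
# Hypothetical predicate: may a response with this status carry
# content-type and content-length headers?
def entity_headers_allowed?(status)
  !((100..199).cover?(status) || status == 204 || status == 304)
end

# Hypothetical helper that returns headers without the two keys when the
# status does not allow them; the input Hash is left untouched.
def strip_entity_headers(status, headers)
  return headers if entity_headers_allowed?(status)

  headers.reject { |key, _value| %w[content-type content-length].include?(key) }
end

cleaned = strip_entity_headers(304, { 'content-type' => 'text/html', 'etag' => '"abc"' })
```
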
=== The Body

The Body is typically an +Array+ of +String+ instances, an enumerable
that yields +String+ instances, a +Proc+ instance, or a File-like
object.

The Body must respond to +each+ or +call+. It may optionally respond
to +to_path+ or +to_ary+. A Body that responds to +each+ is considered
to be an Enumerable Body. A Body that responds to +call+ is considered
to be a Streaming Body.

A Body that responds to both +each+ and +call+ must be treated as an
Enumerable Body, not a Streaming Body. If it responds to +each+, you
must call +each+ and not +call+. If the Body doesn't respond to
+each+, then you can assume it responds to +call+.

The Body must either be consumed or returned. The Body is consumed by
optionally calling either +each+ or +call+.
Then, if the Body responds to +close+, it must be called to release
any resources associated with the generation of the body.
In other words, +close+ must always be called at least once; typically
after the web server has sent the response to the client, but also in
cases where the Rack application makes internal/virtual requests and
discards the response.

After calling +close+, the Body is considered closed and should not
be consumed again.
If the original Body is replaced by a new Body, the new Body must
also consume the original Body by calling +close+ if possible.

If the Body responds to +to_path+, it must return a +String+
path for the local file system whose contents are identical
to that produced by calling +each+; this may be used by the
server as an alternative, possibly more efficient way to
transport the response. The +to_path+ method does not consume
the body.

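From the server's point of view, the consumption rules above might look like the sketch below. +send_body+ is a hypothetical helper and +StringIO+ stands in for the client connection; +to_path+ handling is left out.

```ruby
require 'stringio'

# Hypothetical server-side helper: write a response Body to +out+ and
# always close the Body afterwards if it supports close.
def send_body(body, out)
  if body.respond_to?(:each)
    body.each { |chunk| out.write(chunk) } # Enumerable Body takes precedence
  else
    body.call(out)                         # otherwise treat it as a Streaming Body
  end
ensure
  body.close if body.respond_to?(:close)
end

enumerable_out = StringIO.new
send_body(%w[Hello World], enumerable_out)

streaming_out = StringIO.new
send_body(->(stream) { stream.write('streamed') }, streaming_out)
```

The +ensure+ clause reflects the rule that +close+ is called even when writing fails partway through.
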
==== Enumerable Body

The Enumerable Body must respond to +each+.
It must only be called once.
It must not be called after being closed,
and must only yield String values.

The Body itself should not be an instance of String, as this will
break in Ruby 1.9.

Middleware must not call +each+ directly on the Body.
Instead, middleware can return a new Body that calls +each+ on the
original Body, yielding at least once per iteration.

If the Body responds to +to_ary+, it must return an +Array+ whose
contents are identical to that produced by calling +each+.
Middleware may call +to_ary+ directly on the Body and return a new
Body in its place. In other words, middleware can only process the
Body directly if it responds to +to_ary+. If the Body responds to both
+to_ary+ and +close+, its implementation of +to_ary+ must call
+close+.

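The middleware rule above, returning a new Body that wraps the original instead of calling +each+ directly, can be sketched as follows. +UpcaseBody+ and +Upcase+ are hypothetical classes written for this illustration.

```ruby
# Hypothetical wrapper Body: upcases each chunk lazily, so the original
# Body's each is only invoked when the server iterates the wrapper.
class UpcaseBody
  def initialize(original)
    @original = original
  end

  def each
    @original.each { |chunk| yield chunk.upcase }
  end

  # Closing the wrapper also closes the Body it replaced.
  def close
    @original.close if @original.respond_to?(:close)
  end
end

# Hypothetical middleware that swaps in the wrapper.
class Upcase
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    [status, headers, UpcaseBody.new(body)]
  end
end

app = Upcase.new(->(_env) { [200, { 'content-type' => 'text/plain' }, %w[hello rack]] })
_status, _headers, body = app.call({})
chunks = []
body.each { |chunk| chunks << chunk }
body.close
```
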
==== Streaming Body

The Streaming Body must respond to +call+.
It must only be called once.
It must not be called after being closed.
It takes a +stream+ argument.

The +stream+ argument must implement:
<tt>read, write, <<, flush, close, close_read, close_write, closed?</tt>

The semantics of these IO methods must be a best effort match to
those of a normal Ruby IO or Socket object, using standard arguments
and raising standard exceptions. Servers are encouraged to simply
pass on real IO objects, although it is recognized that this approach
is not directly compatible with HTTP/2.

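A Streaming Body can be as small as a lambda, as in the sketch below. The chunk contents are invented, and a +StringIO+ (which provides the listed methods) stands in for the stream a server would supply.

```ruby
require 'stringio'

# Hypothetical Streaming Body: writes a few chunks, then closes the stream.
streaming_body = lambda do |stream|
  3.times do |i|
    stream.write("chunk #{i}\n")
    stream.flush
  end
ensure
  stream.close
end

response = [200, { 'content-type' => 'text/plain' }, streaming_body]

# Simulated server: StringIO stands in for the connection-backed stream.
stream = StringIO.new
response[2].call(stream)
```
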
== Thanks
Some parts of this specification are adopted from {PEP 333 – Python Web Server Gateway Interface v1.0}[https://peps.python.org/pep-0333/].
I'd like to thank everyone involved in that effort.