Title: Waiting for File Descriptor Events
Author: Christopher Stawarz <firstname.lastname@example.org>
Discussions-To: Python Web-SIG <email@example.com>
This specification defines a set of extensions that allow a WSGI application to suspend its execution until an event occurs on a specified file descriptor.
The architecture of asynchronous (aka event driven) servers requires all I/O operations, including both interprocess and network communication, to be non-blocking. For a WSGI-compliant server, this requirement extends to all applications run on the server. However, the WSGI specification does not provide sufficient facilities for an application to ensure that its I/O is non-blocking. Specifically, it lacks a mechanism by which an application can suspend its execution until an arbitrary file descriptor (such as one belonging to a socket or pipe opened by the application) is ready for reading or writing. This specification defines a standard interface by which servers can provide such a mechanism to applications.
This specification introduces three new variables to the WSGI environment: x-wsgiorg.fdevent.readable, x-wsgiorg.fdevent.writable, and x-wsgiorg.fdevent.timeout.
x-wsgiorg.fdevent.readable and x-wsgiorg.fdevent.writable are callable objects that accept two positional arguments, one required and one optional. In the following description, these arguments are given the names fd and timeout, but they are not required to have these names, and the application must invoke the callables using positional arguments.
The first argument, fd, is either an integer representing a file descriptor or an object with a fileno method that returns such an integer. The set of acceptable file descriptors is defined to be those accepted by select.select. (Note that this set is platform dependent: only sockets are allowed on Windows, whereas sockets, pipes, and files are acceptable on Unix-like systems.) The second, optional argument, timeout, is either None or a floating-point value in seconds. If omitted, it defaults to None.
When called, x-wsgiorg.fdevent.readable and x-wsgiorg.fdevent.writable return the empty string (''), which must be yielded by the application iterable to the server.
(The result of calling x-wsgiorg.fdevent.readable or x-wsgiorg.fdevent.writable and yielding a non-empty string, or making multiple calls to x-wsgiorg.fdevent.readable and/or x-wsgiorg.fdevent.writable before yielding the empty string, is undefined.) The server then suspends execution of the application until one of the following conditions is met:
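- The specified file descriptor is ready for reading (if the application called x-wsgiorg.fdevent.readable) or writing (if the application called x-wsgiorg.fdevent.writable).
- timeout seconds have elapsed without the desired file descriptor becoming readable (if the application called x-wsgiorg.fdevent.readable) or writable (if the application called x-wsgiorg.fdevent.writable), unless the value of timeout is None, in which case the wait will never time out.
- The server detects an error or "exceptional" condition on the file descriptor.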
Put another way, if the application calls x-wsgiorg.fdevent.readable and yields the empty string, it will be suspended until select.select([fd], [], [fd], timeout) would return. If the application calls x-wsgiorg.fdevent.writable and yields the empty string, it will be suspended until select.select([], [fd], [fd], timeout) would return.
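The equivalence with select.select can be sketched as a small blocking helper. This is only an illustration of the wait semantics, not how an asynchronous server would implement them: the function name wait_for_event and its mode argument are assumptions made for this sketch, and a socket pair stands in for a descriptor opened by an application.

```python
import select
import socket


def wait_for_event(fd, mode, timeout=None):
    """Block the way a suspended application would, then report a timeout.

    `mode` is 'readable' or 'writable' (hypothetical names for this sketch).
    Returns True if `timeout` seconds passed with no event, else False.
    """
    if mode == 'readable':
        ready = select.select([fd], [], [fd], timeout)
    elif mode == 'writable':
        ready = select.select([], [fd], [fd], timeout)
    else:
        raise ValueError('mode must be "readable" or "writable"')
    # select returns three empty lists only when the timeout expired.
    return ready == ([], [], [])


# Usage: a connected socket pair stands in for an application's descriptor.
left, right = socket.socketpair()
timed_out_before = wait_for_event(left, 'readable', 0.05)  # nothing sent yet
right.send(b'ping')
timed_out_after = wait_for_event(left, 'readable', 0.05)   # data is waiting
writable_timed_out = wait_for_event(left, 'writable', 0.05)
left.close()
right.close()
```

The fd appears in the third (exceptional) list in both cases, so an error condition on the descriptor also ends the wait.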
x-wsgiorg.fdevent.timeout is an object whose truth value can be changed by the server. (For example, it could be a list instance, whose truth value is false when empty, true otherwise.) If timeout seconds elapse without the desired file descriptor event occurring, x-wsgiorg.fdevent.timeout will be true when the application resumes; otherwise, it will be false. The truth value of x-wsgiorg.fdevent.timeout when the application is first started or after it yields each response-body string is undefined.
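As a minimal sketch of such an object, a server might use a small class whose truth value it updates before resuming the application. The names TimeoutFlag, update, and describe_wakeup are hypothetical and chosen for this example; the plain-list variant mentioned in the text is shown at the end.

```python
class TimeoutFlag(object):
    """Hypothetical flag object; the server flips it before each resumption."""

    def __init__(self):
        self._timed_out = False

    def update(self, timed_out):
        # Called by the server, never by the application.
        self._timed_out = bool(timed_out)

    def __bool__(self):
        return self._timed_out

    __nonzero__ = __bool__  # Python 2 name for the same hook


def describe_wakeup(environ):
    """What an application might do right after it resumes."""
    if environ['x-wsgiorg.fdevent.timeout']:
        return 'timed out'
    return 'descriptor ready'


flag = TimeoutFlag()
environ = {'x-wsgiorg.fdevent.timeout': flag}

flag.update(True)
after_timeout = describe_wakeup(environ)
flag.update(False)
after_event = describe_wakeup(environ)

# The list variant: an empty list is false, a non-empty list is true.
list_flag = []
list_flag.append(True)
list_is_true = bool(list_flag)
del list_flag[:]
list_is_false = not list_flag
```

Because the application holds a reference to the same object for the whole request, the server has to mutate it in place rather than rebind the environ key to a new value.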
The server may use any technique it desires to detect events on an application’s file descriptors. (Most likely, it will add them to the same event loop that it uses for accepting new client connections, receiving requests, and sending responses.)
While technically outside the scope of this specification, the
application’s input stream (
wsgi.input) is another
source of potentially blocking I/O that deserves mention.
The methods provided by the input stream follow the semantics of the
corresponding methods of the
file class. In particular, each of
these methods can invoke the underlying I/O function (in this case,
recv on the socket connected to the client) more than once,
without giving the application the opportunity to check whether each
invocation will block. Although authors of asynchronous servers may
be tempted to provide a non-standard input stream that supports
on-demand, non-blocking reads, such an input stream would be
incompatible with WSGI middleware.
In order to avoid these problems, it is strongly recommended that asynchronous servers pre-read the entire request body (to an in-memory buffer or temporary file) before invoking the application, either by default or as a configurable option. Doing so will ensure that the input stream is compatible with middleware and that reads from it will not block waiting for data from the client.
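One way a server could follow this recommendation is sketched below, assuming it has already collected the request body as a sequence of byte chunks from its event loop. The helper name preread_body and the 64 KiB threshold are assumptions for illustration. tempfile.SpooledTemporaryFile keeps small bodies in memory and moves larger ones to a temporary file on disk.

```python
import tempfile


def preread_body(chunks, max_in_memory=64 * 1024):
    """Collect already-received body chunks into a seekable file object.

    `chunks` is any iterable of byte strings; the threshold is a
    hypothetical default, not something this specification defines.
    """
    buffered = tempfile.SpooledTemporaryFile(max_size=max_in_memory)
    for chunk in chunks:
        buffered.write(chunk)
    buffered.seek(0)  # rewind so the application reads from the start
    return buffered


# Usage: the server installs the buffer as wsgi.input before calling the app.
received = [b'name=fd', b'event&mode=', b'readable']
environ = {'wsgi.input': preread_body(received)}
body = environ['wsgi.input'].read()
```

Reads from this object touch only local memory or disk, so they never wait on the client connection.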
The following application acts as a proxy to python.org. It uses PycURL to perform the outgoing HTTP request in a non-blocking fashion. When the CurlMulti.perform() method detects that its next I/O operation would block, it returns control to the application, which then yields until the file descriptor of interest becomes readable or writable as required. If the descriptor is not ready after one second, the application sends a 504 Gateway Timeout response to the client and terminates:
    import pycurl
    from StringIO import StringIO  # Python 2 standard library module

    def pyorg_proxy(environ, start_response):
        result = StringIO()

        # Configure the outgoing request; the response body is
        # accumulated in `result`.
        c = pycurl.Curl()
        c.setopt(pycurl.URL, 'http://python.org' + environ['PATH_INFO'])
        c.setopt(pycurl.WRITEFUNCTION, result.write)

        m = pycurl.CurlMulti()
        m.add_handle(c)

        while True:
            # Run the transfer until it would block or has finished.
            while True:
                ret, num_handles = m.perform()
                if ret != pycurl.E_CALL_MULTI_PERFORM:
                    break
            if not num_handles:
                break

            # fdset() returns lists of descriptors. Only one descriptor
            # can be waited on at a time, so use the first one.
            read, write, exc = m.fdset()
            if read:
                yield environ['x-wsgiorg.fdevent.readable'](read[0], 1.0)
            else:
                yield environ['x-wsgiorg.fdevent.writable'](write[0], 1.0)

            if environ['x-wsgiorg.fdevent.timeout']:
                msg = 'The request to python.org timed out.'
                start_response('504 Gateway Timeout',
                               [('Content-Type', 'text/plain'),
                                ('Content-Length', str(len(msg)))])
                yield msg
                return

        start_response('200 OK',
                       [('Content-Type', 'application/octet-stream'),
                        ('Content-Length', str(result.len))])
        yield result.getvalue()
The following adapter allows an application that uses the
x-wsgiorg.fdevent extensions to run on a server that does not
support them, without any modification to the application’s code:
    import select

    def with_fdevent(application):
        def wrapper(environ, start_response):
            # One-element lists let the nested functions rebind shared state.
            select_args = [None]

            def readable(fd, timeout=None):
                assert not select_args[0]
                select_args[0] = ([fd], [], [fd], timeout)
                return ''

            def writable(fd, timeout=None):
                assert not select_args[0]
                select_args[0] = ([], [fd], [fd], timeout)
                return ''

            environ['x-wsgiorg.fdevent.readable'] = readable
            environ['x-wsgiorg.fdevent.writable'] = writable

            timeout = [False]

            class TimeoutWrapper(object):
                def __nonzero__(self):
                    return timeout[0]
                __bool__ = __nonzero__  # Python 3 name for the same hook

            environ['x-wsgiorg.fdevent.timeout'] = TimeoutWrapper()

            for result in application(environ, start_response):
                assert not (result and select_args[0])
                if result or (not select_args[0]):
                    # Ordinary response data, or an empty string that is
                    # not a wait request: pass it through unchanged.
                    yield result
                else:
                    # Block in select() in place of the server's event loop.
                    ready = select.select(*select_args[0])
                    timeout[0] = (ready == ([], [], []))
                    select_args[0] = None

        return wrapper
The empty string yielded by an application after calling x-wsgiorg.fdevent.readable or x-wsgiorg.fdevent.writable must pass through any intervening middleware and be detected by the server. Although WSGI explicitly requires middleware to relay such strings to the server (see Middleware Handling of Block Boundaries), some components may not, making them incompatible with this specification.
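The following sketch shows a middleware component that stays compatible by relaying every empty block as soon as it receives it. The names uppercase_middleware and demo_app are made up for this example, and the demo application yields plain empty strings in place of real wait requests.

```python
def uppercase_middleware(application):
    """Transform response data but relay empty blocks immediately."""
    def wrapper(environ, start_response):
        iterable = application(environ, start_response)
        try:
            for block in iterable:
                if block:
                    yield block.upper()
                else:
                    # An empty block may be a wait request; hand it to the
                    # server right away instead of buffering or dropping it.
                    yield block
        finally:
            close = getattr(iterable, 'close', None)
            if close is not None:
                close()
    return wrapper


def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield ''        # stands in for the result of a readable()/writable() call
    yield 'hello'
    yield ''
    yield 'world'


relayed = list(uppercase_middleware(demo_app)({}, lambda status, headers: None))
```

A component that instead joined all blocks into a single string before yielding would swallow the empty strings, and the server would never learn that the application wanted to wait.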
Some third-party libraries (such as PycURL) provide non-blocking interfaces that may need to monitor multiple file descriptors for events simultaneously. Since this specification allows an application to wait on only one file descriptor at a time, application authors may find it difficult or impossible to use such libraries, or they may be limited to a subset of the libraries’ capabilities.
Although this specification could be extended to include an interface for waiting on multiple file descriptors, it is unclear whether it would be easy (or even possible) for all servers to implement it. Also, the appropriate behavior for a multi-descriptor wait is not obvious. (Should the application be resumed when a single descriptor is ready? All of them? Some minimum number?)
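To make the open question concrete, the sketch below applies three candidate resume policies to the result of a single select.select call. Nothing here is part of this specification: the function names, the policy names ('any', 'all', 'at_least'), and the use of socket pairs are all assumptions made for illustration.

```python
import select
import socket


def ready_set(read_fds, write_fds, timeout=None):
    """Return the set of descriptors on which select reports any event."""
    r, w, x = select.select(read_fds, write_fds, read_fds + write_fds, timeout)
    return set(r) | set(w) | set(x)


def should_resume(wanted, ready, policy, minimum=1):
    """Decide whether to resume under a hypothetical policy."""
    if policy == 'any':
        return len(ready) >= 1
    if policy == 'all':
        return set(wanted) <= ready
    if policy == 'at_least':
        return len(ready) >= minimum
    raise ValueError('unknown policy: %r' % (policy,))


# Usage: two socket pairs; data arrives on only one of them.
a1, a2 = socket.socketpair()
b1, b2 = socket.socketpair()
wanted = [a1.fileno(), b1.fileno()]
a2.send(b'x')
ready = ready_set(wanted, [], 0.05)
resume_any = should_resume(wanted, ready, 'any')
resume_all = should_resume(wanted, ready, 'all')
for sock in (a1, a2, b1, b2):
    sock.close()
```

With one of two descriptors ready, the policies already disagree, which is the ambiguity an extended interface would have to resolve.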