How To Use Linux epoll with Python

Posted on May 5, 2009. Filed under: Linux, Python | Tags: , , , , |

Contents

Introduction

As of version 2.6, Python includes an API for accessing the Linux epoll library. This article uses Python 3 examples to briefly demonstrate the API. Questions and feedback are welcome.

Blocking Socket Programming Examples

Example 1 is a simple Python 3.0 server that listens on port 8080 for an HTTP request message, prints it to the console, and sends an HTTP response message back to the client.

  • Line 9: Create the server socket.
  • Line 10: Permits the bind() in line 11 even if another program was recently listening on the same port. Otherwise this program could not run until a minute or two after the previous program using that port had finished.
  • Line 11: Bind the server socket to port 8080 of all available IPv4 addresses on this machine.
  • Line 12: Tell the server socket to start accepting incoming connections from clients.
  • Line 14: The program will stop here until a connection is received. When this happens, the server socket will create a new socket on this machine that is used to talk to the client. This new socket is represented by the clientconnection object returned from the accept() call. The address object indicates the IP address and port number at the other end of the connection.
  • Lines 15-17: Assemble the data being transmitted by the client until a complete HTTP request has been transmitted. The HTTP protocol is described at HTTP Made Easy.
  • Line 18: Print the request to the console, in order to verify correct operation.
  • Line 19: Send the response to the client.
  • Lines 20-22: Close the connection to the client as well as the listening server socket.

The official HOWTO has a more detailed description of socket programming with Python.

Example 1 (All examples use Python 3)

 1  import socket
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13
14  connectiontoclient, address = serversocket.accept()
15  request = b''
16  while EOL1 not in request and EOL2 not in request:
17     request += connectiontoclient.recv(1024)
18  print(request.decode())
19  connectiontoclient.send(response)
20  connectiontoclient.close()
21
22  serversocket.close()

Example 2 adds a loop in line 15 to repeatedly processes client connections until interrupted by the user (e.g. with a keyboard interrupt). This illustrates more clearly that the server socket is never used to exchange data with the client. Rather, it accepts a connection from a client, and then creates a new socket on the server machine that is used to communicate with the client.

The finally statement block in lines 23-24 ensures that the listening server socket is always closed, even if an exception occurs.

Example 2

 1  import socket
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13
14  try:
15     while True:
16        connectiontoclient, address = serversocket.accept()
17        request = b''
18        while EOL1 not in request and EOL2 not in request:
19            request += connectiontoclient.recv(1024)
20        print('-'*40 + '\n' + request.decode()[:-2])
21        connectiontoclient.send(response)
22        connectiontoclient.close()
23  finally:
24     serversocket.close()

Benefits of Asynchronous Sockets and Linux epoll

The sockets shown in Example 2 are called blocking sockets, because the Python program stops running until an event occurs. The accept() call in line 16 blocks until a connection has been received from a client. The recv() call in line 19 blocks until data has been received from the client (or until there is no more data to receive). The send() call in line 21 blocks until all of the data being returned to the client has been queued by Linux in preparation for transmission.

When a program uses blocking sockets it often uses one thread (or even a dedicated process) to carry out the communication on each of those sockets. The main program thread will contain the listening server socket which accepts incoming connections from clients. It will accept these connections one at a time, passing the newly created socket off to a separate thread which will then interact with the client. Because each of these threads only communicates with one client, it is ok if it is blocked from proceeding at certain points. This blockage does not prohibit any of the other threads from carrying out their respective tasks.

The use of blocking sockets with multiple threads results in straightforward code, but comes with a number of drawbacks. It can be difficult to ensure the threads cooperate appropriately when sharing resources. And this style of programming can be less efficient on computers with only one CPU.

The C10K Problem discusses some of the alternatives for handling multiple concurrent sockets. One is the use of asynchronous sockets. These sockets don’t block until some event occurs. Instead, the program performs an action on an asynchronous socket and is immediately notified as to whether that action succeeded or failed. This information allows the program to decide how to proceed. Since asynchronous sockets are non-blocking, there is no need for multiple threads of execution. All work may be done in a single thread. This single-threaded approach comes with its own challenges, but can be a good choice for many programs. It can also be combined with the multi-threaded approach: asynchronous sockets using a single thread can be used for the networking component of a server, and threads can be used to access other blocking resources, e.g. databases.

Linux 2.6 has a number of mechanisms for managing asynchronous sockets, three of which are exposed by the Python API’s select, poll and epoll.  epoll and poll are better than select because the Python program does not have to inspect each socket for events of interest. Instead it can rely on the operating system to tell it which sockets may have these events. And epoll is better than poll because it does not require the operating system to inspect all sockets for events of interest each time it is queried by the Python program. Rather Linux tracks these events as they occur, and returns a list when queried by Python. So epoll is a more efficient and scalable mechanism for large numbers (thousands) of concurrent socket connections, as shown in these graphs.

Asynchronous Socket Programming Examples with epoll

Programs using epoll often perform actions in this sequence:

  1. Create an epoll object
  2. Tell the epoll object to monitor specific events on specific sockets
  3. Ask the epoll object which sockets may have had the specified event since the last query
  4. Perform some action on those sockets
  5. Tell the epoll object to modify the list of sockets and/or events to monitor
  6. Repeat steps 3 through 5 until finished
  7. Destroy the epoll object

Example 3 duplicates the functionality of Example 2 while using asynchronous sockets. The program is more complex because a single thread is interleaving the communication with multiple clients.

  • Line 1: The select module contains the epoll functionality.
  • Line 13: Since sockets are blocking by default, this is necessary to use non-blocking (asynchronous) mode.
  • Line 15: Create an epoll object.
  • Line 16: Register interest in read events on the server socket. A read event will occur any time the server socket accepts a socket connection.
  • Line 19: The connection dictionary maps file descriptors (integers) to their corresponding network connection objects.
  • Line 21: Query the epoll object to find out if any events of interest may have occurred. The parameter “1” signifies that we are willing to wait up to one second for such an event to occur. If any events of interest occurred prior to this query, the query will return immediately with a list of those events.
  • Line 22: Events are returned as a sequence of (fileno, event code) tuples. fileno is a synonym for file descriptor and is always an integer.
  • Line 23: If a read event occurred on the socket server, then a new socket connection may have been created.
  • Line 25: Set new socket to non-blocking mode.
  • Line 26: Register interest in read (EPOLLIN) events for the new socket.
  • Line 31: If a read event occurred then read new data sent from the client.
  • Line 33: Once the complete request has been received, then unregister interest in read events and register interest in write (EPOLLOUT) events. Write events will occur when it is possible to send response data back to the client.
  • Line 34: Print the complete request, demonstrating that although communication with clients is interleaved this data can be assembled and processed as a whole message.
  • Line 35: If a write event occurred on a client socket, it’s able to accept new data to send to the client.
  • Lines 36-38: Send the response data a bit at a time until the complete response has been delivered to the operating system for transmission.
  • Line 39: Once the complete response has been sent, disable interest in further read or write events.
  • Line 40: A socket shutdown is optional if a connection is closed explicitly. This example program uses it in order to cause the client to shutdown first. The shutdown call informs the client socket that no more data should be sent or received and will cause a well-behaved client to close the socket connection from it’s end.
  • Line 41: The HUP (hang-up) event indicates that the client socket has been disconnected (i.e. closed), so this end is closed as well. There is no need to register interest in HUP events. They are always indicated on sockets that are registered with the epoll object.
  • Line 42: Unregister interest in this socket connection.
  • Line 43: Close the socket connection.
  • Lines 18-45: The try-catch block is included because the example program will most likely be interrupted by a KeyboardInterrupt exception
  • Lines 46-48: Open socket connections don’t need to be closed since Python will close them when the program terminates. They’re included as a matter of good form.
Example 3

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14
15  epoll = select.epoll()
16  epoll.register(serversocket.fileno(), select.EPOLLIN)
17
18  try:
19     connections = {}; requests = {}; responses = {}
20     while True:
21        events = epoll.poll(1)
22        for fileno, event in events:
23           if fileno == serversocket.fileno():
24              connection, address = serversocket.accept()
25              connection.setblocking(0)
26              epoll.register(connection.fileno(), select.EPOLLIN)
27              connections[connection.fileno()] = connection
28              requests[connection.fileno()] = b''
29              responses[connection.fileno()] = response
30           elif event & select.EPOLLIN:
31              requests[fileno] += connections[fileno].recv(1024)
32              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
33                 epoll.modify(fileno, select.EPOLLOUT)
34                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
35           elif event & select.EPOLLOUT:
36              byteswritten = connections[fileno].send(responses[fileno])
37              responses[fileno] = responses[fileno][byteswritten:]
38              if len(responses[fileno]) == 0:
39                 epoll.modify(fileno, 0)
40                 connections[fileno].shutdown(socket.SHUT_RDWR)
41           elif event & select.EPOLLHUP:
42              epoll.unregister(fileno)
43              connections[fileno].close()
44              del connections[fileno]
45  finally:
46     epoll.unregister(serversocket.fileno())
47     epoll.close()
48     serversocket.close()

epoll has two modes of operation, called edge-triggered and level-triggered. In the edge-triggered mode of operation a call to epoll.poll() will return an event on a socket only once after the read or write event occurred on that socket. The calling program must process all of the data associated with that event without further notifications on subsequent calls to epoll.poll(). When the data from a particular event is exhausted, additional attempts to operate on the socket will cause an exception. Conversely, in the level-triggered mode of operation, repeated calls to epoll.poll() will result in repeated notifications of the event of interest, until all data associated with that event has been processed. No exceptions normally occur in level-triggered mode.

For example, suppose a server socket has been registered with an epoll object for read events. In edge-triggered mode the program would need to accept() new socket connections until a socket.error exception occurs. Whereas in the level-triggered mode of operation a single accept() call can be made and then the epoll object can be queried again for new events on the server socket indicating that additional calls to accept() should be made.

Example 3 used level-triggered mode, which is the default mode of operation. Example 4 demonstrates how to use edge-triggered mode. In Example 4, lines 25, 36 and 45 introduce loops that run until an exception occurs (or all data is otherwise known to be handled). Lines 32, 38 and 48 catch the expected socket exceptions. Finally, lines 16, 28, 41 and 51 add the EPOLLET mask which is used to set edge-triggered mode.

Example 4

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14
15  epoll = select.epoll()
16  epoll.register(serversocket.fileno(), select.EPOLLIN | select.EPOLLET)
17
18  try:
19     connections = {}; requests = {}; responses = {}
20     while True:
21        events = epoll.poll(1)
22        for fileno, event in events:
23           if fileno == serversocket.fileno():
24              try:
25                 while True:
26                    connection, address = serversocket.accept()
27                    connection.setblocking(0)
28                    epoll.register(connection.fileno(), select.EPOLLIN | select.EPOLLET)
29                    connections[connection.fileno()] = connection
30                    requests[connection.fileno()] = b''
31                    responses[connection.fileno()] = response
32              except socket.error:
33                 pass
34           elif event & select.EPOLLIN:
35              try:
36                 while True:
37                    requests[fileno] += connections[fileno].recv(1024)
38              except socket.error:
39                 pass
40              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
41                 epoll.modify(fileno, select.EPOLLOUT | select.EPOLLET)
42                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
43           elif event & select.EPOLLOUT:
44              try:
45                 while len(responses[fileno]) > 0:
46                    byteswritten = connections[fileno].send(responses[fileno])
47                    responses[fileno] = responses[fileno][byteswritten:]
48              except socket.error:
49                 pass
50              if len(responses[fileno]) == 0:
51                 epoll.modify(fileno, select.EPOLLET)
52                 connections[fileno].shutdown(socket.SHUT_RDWR)
53           elif event & select.EPOLLHUP:
54              epoll.unregister(fileno)
55              connections[fileno].close()
56              del connections[fileno]
57  finally:
58     epoll.unregister(serversocket.fileno())
59     epoll.close()
60     serversocket.close()

Since they’re similar, level-triggered mode is often used when porting an application that was using the select or poll mechanisms, while edge-triggered mode may be used when the programmer doesn’t need or want as much assistance from the operating system in managing event state.

In addition to these two modes of operation, sockets may also be registered with the epoll object using the EPOLLONESHOT event mask. When this option is used, the registered event is only valid for one call to epoll.poll(), after which time it is automatically removed from the list of registered sockets being monitored.

Performance Considerations

Listen Backlog Queue Size

In Examples 1-4, line 12 has shown a call to the serversocket.listen() method. The parameter for this method is the listen backlog queue size. It tells the operating system how many TCP/IP connections to accept and place on the backlog queue before they are accepted by the Python program. Each time the Python program calls accept() on the server socket, one of the connections is removed from the queue and that slot can be used for another incoming connection. If the queue is full, new incoming connections are silently ignored causing unnecessary delays on the client side of the network connection. A production server usually handles tens or hundreds of simultaneous connections, so a value of 1 will usually be inadequate. For example, when using ab to perform load testing against these sample programs with 100 concurrent HTTP 1.0 clients, any backlog value less than 50 would often produce performance degradation.

TCP Options

The TCP_CORK option can be used to “bottle up” messages until they are ready to send. This option, illustrated in lines 34 and 40 of Examples 5, might be a good option to use for an HTTP server using HTTP/1.1 pipelining.

Example 5

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14
15  epoll = select.epoll()
16  epoll.register(serversocket.fileno(), select.EPOLLIN)
17
18 try:
19     connections = {}; requests = {}; responses = {}
20     while True:
21        events = epoll.poll(1)
22        for fileno, event in events:
23           if fileno == serversocket.fileno():
24              connection, address = serversocket.accept()
25              connection.setblocking(0)
26              epoll.register(connection.fileno(), select.EPOLLIN)
27              connections[connection.fileno()] = connection
28              requests[connection.fileno()] = b''
29              responses[connection.fileno()] = response
30           elif event & select.EPOLLIN:
31              requests[fileno] += connections[fileno].recv(1024)
32              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
33                 epoll.modify(fileno, select.EPOLLOUT)
34                 connections[fileno].setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
35                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
36           elif event & select.EPOLLOUT:
37              byteswritten = connections[fileno].send(responses[fileno])
38              responses[fileno] = responses[fileno][byteswritten:]
39              if len(responses[fileno]) == 0:
40                 connections[fileno].setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)
41                 epoll.modify(fileno, 0)
42                 connections[fileno].shutdown(socket.SHUT_RDWR)
43           elif event & select.EPOLLHUP:
44              epoll.unregister(fileno)
45              connections[fileno].close()
46              del connections[fileno]
47  finally:
48     epoll.unregister(serversocket.fileno())
49     epoll.close()
50     serversocket.close()

On the other hand, the TCP_NODELAY option can be used to tell the operating system that any data passed to socket.send() should immediately be sent to the client without being buffered by the operating system. This option, illustrated in line 14 of Example 6, might be a good option to use for an SSH client or other “real-time” application.

Example 6

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14  serversocket.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
15
16  epoll = select.epoll()
17  epoll.register(serversocket.fileno(), select.EPOLLIN)
18
19 try:
20     connections = {}; requests = {}; responses = {}
21     while True:
22        events = epoll.poll(1)
23        for fileno, event in events:
24           if fileno == serversocket.fileno():
25              connection, address = serversocket.accept()
26              connection.setblocking(0)
27              epoll.register(connection.fileno(), select.EPOLLIN)
28              connections[connection.fileno()] = connection
29              requests[connection.fileno()] = b''
30              responses[connection.fileno()] = response
31           elif event & select.EPOLLIN:
32              requests[fileno] += connections[fileno].recv(1024)
33              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
34                 epoll.modify(fileno, select.EPOLLOUT)
35                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
36           elif event & select.EPOLLOUT:
37              byteswritten = connections[fileno].send(responses[fileno])
38              responses[fileno] = responses[fileno][byteswritten:]
39              if len(responses[fileno]) == 0:
40                 epoll.modify(fileno, 0)
41                 connections[fileno].shutdown(socket.SHUT_RDWR)
42           elif event & select.EPOLLHUP:
43              epoll.unregister(fileno)
44              connections[fileno].close()
45              del connections[fileno]
46  finally:
47     epoll.unregister(serversocket.fileno())
48     epoll.close()
49     serversocket.close()

Source Code

The examples on this page are in the public domain and available for download.

References:

http://scotdoyle.com/python-epoll-howto.html

http://mail.python.org/pipermail/python-dev/2003-August/037614.html

http://ionelmc.wordpress.com/2008/04/01/curious-difference-between-epoll-poll-kqueue-and-select/

http://ilab.cs.byu.edu/python/select/echoserver.html

http://docs.python.org/library/select.html

Advertisements

Make a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Liked it here?
Why not try sites on the blogroll...

%d bloggers like this: