Python

Python based web servers

Posted on August 28, 2013. Filed under: Python

1. tornado web server: Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.


python thread.error: can’t start new thread

Posted on March 5, 2010. Filed under: Python

Python raises a thread.error such as:

File "/usr/lib/python2.5/threading.py", line 440, in start
_start_new_thread(self.__bootstrap, ())
thread.error: can't start new thread

The “can’t start new thread” error is almost certainly due to the fact that you already have too many threads running within your Python process, and because of a resource limit of some kind the request to create a new thread is refused.

You should probably look at the number of threads you’re creating (on Linux, for example, in /proc/<pid>/status); the maximum number you will be able to create is determined by your environment, but it should be on the order of hundreds at least. (You can also try raising the relevant limits with ulimit.)

It would probably be a good idea to re-think your architecture here; seeing as this is running asynchronously anyhow, perhaps you could use a pool of threads to fetch resources from another site instead of always starting up a thread for every request.

Another improvement to consider is your use of Thread.join and Thread.stop; this would probably be better accomplished by providing a timeout value to the constructor.
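The thread-pool suggestion above can be sketched as follows. This is a minimal illustration in Python 3 using the standard concurrent.futures module, not the code from the original question; process_all, fake_fetch and the example URLs are hypothetical names introduced only for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def process_all(items, worker, max_threads=8):
    """Run worker(item) for every item using at most max_threads threads."""
    # The pool reuses a fixed set of threads instead of starting one per item,
    # so the process never approaches the per-process thread limit.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(worker, items))


def fake_fetch(url):
    # Hypothetical stand-in for a real network fetch.
    return len(url)


results = process_all(["http://a.example", "http://bb.example"],
                      fake_fetch, max_threads=2)
```

However many items are submitted, no more than max_threads worker threads exist at any one time.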

Reference:

http://stackoverflow.com/questions/1834919/error-cant-start-new-thread

http://adywicaksono.wordpress.com/2007/07/10/i-can-not-create-more-than-255-threads-on-linux-what-is-the-solutions/

http://www.afnog.org/archives/2008-September/004535.html

http://rcsg.rice.edu/rcsg/shared/ulimit.html

http://answers.google.com/answers/threadview/id/311442.html

http://ubuntuforums.org/archive/index.php/t-114071.html


Interesting videos

Posted on February 27, 2010. Filed under: Python

1. Google I/O 2008 – Painless Python by Alex Martelli (Google)

http://www.youtube.com/watch?v=bDgD9whDfEY

http://www.youtube.com/watch?v=y7vwZ20SDzc


staticmethod vs classmethod in Python

Posted on January 19, 2010. Filed under: Python

Python’s static methods are similar to those in Java and C++. Static methods were not introduced into Python until version 2.2.

Example of version 2.2 and higher implementation:


>>> class Foo:
...     def bar(arg):
...         Foo.arg = arg
...     bar = staticmethod(bar)
...
>>> Foo.bar('Hello World')
>>> Foo.arg
'Hello World'
>>> Foo().bar('Hello')
>>> Foo.arg
'Hello'

Static methods can be called either on the class (such as Foo.bar()) or on an instance (such as Foo().bar()). The instance is ignored except for its class.

In version 2.4, function decorator syntax was added, which allows another way to define a static method. If you are using 2.4 or above, this is the recommended way of creating a static method.

Example of version 2.4 and higher implementation:


>>> class Foo:
...     @staticmethod
...     def bar(arg):
...         Foo.arg = arg
...
>>> Foo.bar('Hello World')
>>> Foo.arg
'Hello World'
>>> Foo().bar('Hello')
>>> Foo.arg
'Hello'

If you are looking to do more advanced static methods, look into using classmethod instead of staticmethod. One of the differences between the two is that a class method receives the class as an implicit first argument, just as an instance method receives the instance. For further reading on class methods, refer to the built-in functions page.

———————————————————-

classmethod(function)
Return a class method for function.

A class method receives the class as implicit first argument, just like an instance method receives the instance. To declare a class method, use this idiom:

class C:
    @classmethod
    def f(cls, arg1, arg2, ...): ...

The @classmethod form is a function decorator – see the description of function definitions in Function definitions for details.

It can be called either on the class (such as C.f()) or on an instance (such as C().f()). The instance is ignored except for its class. If a class method is called for a derived class, the derived class object is passed as the implied first argument.

Class methods are different than C++ or Java static methods. If you want those, see staticmethod() in this section.

For more information on class methods, consult the documentation on the standard type hierarchy in The standard type hierarchy.

New in version 2.2.

Changed in version 2.4: Function decorator syntax added.

staticmethod(function)
Return a static method for function.

A static method does not receive an implicit first argument. To declare a static method, use this idiom:

class C:
    @staticmethod
    def f(arg1, arg2, ...): ...

The @staticmethod form is a function decorator – see the description of function definitions in Function definitions for details.

It can be called either on the class (such as C.f()) or on an instance (such as C().f()). The instance is ignored except for its class.

Static methods in Python are similar to those found in Java or C++. For a more advanced concept, see classmethod() in this section.

For more information on static methods, consult the documentation on the standard type hierarchy in The standard type hierarchy.

New in version 2.2.

Changed in version 2.4: Function decorator syntax added.

—————————————————————————

Coming from a Java background, I assumed that a static method and a class method were the same thing.

In Python they are not; there is a subtle difference.

Say a function a() is defined in a Parent class, and a Sub class extends Parent.

If a() has the @staticmethod decorator, Sub.a() runs the definition from Parent with no implicit argument, so it cannot tell that it was called through Sub. Whereas,

if a() has the @classmethod decorator, Sub.a() runs the same definition but receives Sub as its implicit first argument, so it can behave according to the Sub class.

Let’s talk about some definitions here:

A @staticmethod function is nothing more than a function defined inside a class. It is callable without instantiating the class first. It receives no implicit argument, so its behavior does not change when it is inherited (unless a subclass overrides it).

A @classmethod function is also callable without instantiating the class, but its behavior follows the Sub class, not the Parent class, under inheritance. That’s because the first argument of a @classmethod function is always the class it was called on (cls by convention).
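A small sketch of the difference described above. The class and method names here are made up for illustration.

```python
class Parent:
    name = "Parent"

    @staticmethod
    def static_name():
        # No implicit argument: the only class this can refer to is the one
        # hard-coded in the body.
        return Parent.name

    @classmethod
    def class_name(cls):
        # cls is whichever class the call was made on.
        return cls.name


class Sub(Parent):
    name = "Sub"


from_static = Sub.static_name()
from_class = Sub.class_name()
```

Calling through Sub, the static method still reports the Parent value, while the class method sees Sub because it is handed Sub as cls.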

————————————————-

Usually, static methods would be used for Singleton or Factory Patterns (http://en.wikipedia.org/wiki/Factory_method_pattern), but if you review this page, you will see Python doesn’t require a static method to be used to implement the Factory pattern.

Coming up with an example for static methods can be difficult. I usually use static methods in Python for organizational purposes. For instance, if I made a DB abstraction layer and had a few functions that were related to the DB layer but didn’t need any class or instance state, I would add them as static methods. This keeps those functions organized in one place, and I find them easier to remember.

For example:

>>> data = db_convertAsciiToHtml("Foo & Bar")

OR

>>> data = DB.convertAsciiToHtml("Foo & Bar")
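A possible shape for the second form, as a sketch only. The original post names DB.convertAsciiToHtml but does not show its body, so the implementation below (using the standard library’s html.escape) is an assumption made for illustration.

```python
import html


class DB:
    """Hypothetical DB abstraction layer that groups related helpers."""

    @staticmethod
    def convertAsciiToHtml(text):
        # Assumed behavior: escape the characters that are special in HTML.
        return html.escape(text)


data = DB.convertAsciiToHtml("Foo & Bar")
```

The helper needs neither an instance nor the class, so a static method serves here purely as a namespace.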

References:

http://www.techexperiment.com/2008/08/21/creating-static-methods-in-python/

http://docs.python.org/library/functions.html

http://rapd.wordpress.com/2008/07/02/python-staticmethod-vs-classmethod/


Strip attachments from an email message

Posted on December 7, 2009. Filed under: Linux, Python

This recipe shows a simple approach to using the Python email package to strip out attachments and file types from an email message that might be considered dangerous. This is particularly relevant in Python 2.4, as the email Parser is now much more robust in handling malformed messages (which are typical for virus and worm emails).

ReplaceString = """

This message contained an attachment that was stripped out. 

The original type was: %(content_type)s
The filename was: %(filename)s, 
(and it had additional parameters of:
%(params)s)

"""

import re
BAD_CONTENT_RE = re.compile('application/(msword|msexcel)', re.I)
BAD_FILEEXT_RE = re.compile(r'(\.exe|\.zip|\.pif|\.scr|\.ps)$')

def sanitise(msg):
    # Strip out all payloads of a particular type
    ct = msg.get_content_type()
    # We also want to check for bad filename extensions
    fn = msg.get_filename()
    # get_filename() returns None if there's no filename
    if BAD_CONTENT_RE.search(ct) or (fn and BAD_FILEEXT_RE.search(fn)):
        # Ok. This part of the message is bad, and we're going to stomp
        # on it. First, though, we pull out the information we're about to
        # destroy so we can tell the user about it.

        # This returns the parameters to the content-type. The first entry
        # is the content-type itself, which we already have.
        params = msg.get_params()[1:] 
        # The parameters are a list of (key, value) pairs - join the
        # key-value with '=', and the parameter list with ', '
        params = ', '.join([ '='.join(p) for p in params ])
        # Format up the replacement text, telling the user we ate their
        # email attachment.
        replace = ReplaceString % dict(content_type=ct, 
                                       filename=fn, 
                                       params=params)
        # Install the text body as the new payload.
        msg.set_payload(replace)
        # Now we manually strip away any parameters to the content-type
        # header. Again, we skip the first parameter, as it's the 
        # content-type itself, and we'll stomp that next.
        for k, v in msg.get_params()[1:]:
            msg.del_param(k)
        # And set the content-type appropriately.
        msg.set_type('text/plain')
        # Since we've just stomped the content-type, we also kill these
        # headers - they make no sense otherwise.
        del msg['Content-Transfer-Encoding']
        del msg['Content-Disposition']
    else:
        # Now we check for any sub-parts to the message
        if msg.is_multipart():
            # Call the sanitise routine on any subparts
            payload = [ sanitise(x) for x in msg.get_payload() ]
            # We replace the payload with our list of sanitised parts
            msg.set_payload(payload)
    # Return the sanitised message
    return msg

# And a simple driver to show how to use this
import email, sys
m = email.message_from_file(open(sys.argv[1]))
print(sanitise(m))

Discussion

I’ve seen this come up a few times on comp.lang.python, so here’s a cookbook entry for it. This recipe shows how to read in an email message, strip out any dangerous or suspicious attachments, and replace them with a harmless text message informing the user of this.

This is particularly important if the end-users are using something like Outlook, which is targeted by unpleasant virus and worm messages on a daily basis.

The email parser in Python 2.4 has been completely rewritten to be robust first, correct second – prior to this, the parser was written for correctness first. This was a problem, because many virus/worm messages would send email messages that were broken and non-conformant – this made the old email parser choke and die. The new parser is designed to never actually break when reading a message – instead it tries its best to fix up whatever it can in the message. (If you have a message that causes the parser to crash, please let us know – that’s a bug, and we’ll fix it).

The code itself is heavily commented, and should be easy enough to follow. A mail message consists of one or more parts – these can each contain nested parts. We call the ‘sanitise()’ function on the top level Message object, and it calls itself recursively on the sub-objects. The sanitise() function checks the Content-Type of the part, and if there’s a filename, also checks that, against a known-to-be-bad list.

If the message part is bad, we replace the part’s payload with a short text description of the now-removed attachment, and clean out the headers that are relevant. We set this message part’s Content-Type to ‘text/plain’, and remove other headers that relate to the now-removed attachment.

Finally, we check if the message is a multipart message. This means it has sub-parts, so we recursively call the sanitise function on each of those. We then replace the payload with our list of sanitised sub-parts.
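The recursive walk over message parts can be illustrated with a smaller sketch that only lists the parts instead of rewriting them. It assumes Python 3 and the standard email.message.EmailMessage API; list_parts and the sample message are hypothetical and not part of the recipe.

```python
from email.message import EmailMessage


def list_parts(msg, depth=0):
    """Return (depth, content_type, filename) for msg and every nested part."""
    rows = [(depth, msg.get_content_type(), msg.get_filename())]
    if msg.is_multipart():
        # For a multipart message the payload is a list of sub-messages.
        for part in msg.get_payload():
            rows.extend(list_parts(part, depth + 1))
    return rows


# Build a sample message with one text body and one suspicious attachment.
msg = EmailMessage()
msg.set_content("See the attached file.")
msg.add_attachment(b"not really an executable",
                   maintype="application", subtype="octet-stream",
                   filename="setup.exe")
parts = list_parts(msg)
```

The same is_multipart() and get_payload() recursion is what sanitise() relies on. A filter would examine the content type and filename of each row.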

Extensions, further work, etc:

Instead of destroying the attachment, it would be a small amount of work to instead store the attachment away in a directory, and supply the user with a link to the file.

You could add other filters into the sanitise() code – for instance, checking other headers for known signs of worm or virus messages. Or removing all large powerpoint files sent to you by your marketing department, if that’s what you want to do.

Reference:

http://code.activestate.com/recipes/302086/


when “import twitter” report “ImportError: No module named simplejson”

Posted on August 27, 2009. Filed under: Python |

When I use python-twitter with Python 2.6 on Windows, “import twitter” fails with “ImportError: No module named simplejson”.

The simplest fix:

1. Check that the json package is available: import json

2. Find “Python26\Lib\site-packages\twitter.py” and change the line “import simplejson” near the beginning to “import json as simplejson”.

3. Run “import twitter” again; the error should be gone.
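A slightly more defensive variant of the same fix, as a sketch: try the third-party package first and fall back to the standard library module, which has been available since Python 2.6. This is a common idiom and not necessarily what python-twitter itself does.

```python
try:
    import simplejson  # third-party package, used if it is installed
except ImportError:
    import json as simplejson  # standard library fallback (Python 2.6+)

parsed = simplejson.loads('{"status": "ok", "count": 2}')
```

Either way, the rest of the module can keep referring to the name simplejson.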

References:

http://code.google.com/p/python-twitter/


Skype Auto Answer launch!

Posted on August 18, 2009. Filed under: Programming, Python |

http://code.google.com/p/skypeautoanswer/


Python: ImportError: No module named _md5

Posted on June 21, 2009. Filed under: Python |

Python 2.5.1 (r251:54863, Sep 3 2007, 17:35:15)
[GCC 3.3.3 20040412 (Red Hat Linux 3.3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import md5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/md5.py", line 6, in <module>
    from hashlib import md5
  File "/usr/lib/python2.5/hashlib.py", line 133, in <module>
    md5 = __get_builtin_constructor('md5')
  File "/usr/lib/python2.5/hashlib.py", line 60, in __get_builtin_constructor
    import _md5
ImportError: No module named _md5

Searching on the Internet suggests that this is caused by an incompatibility between Python 2.5.1 and openssl-0.9.8a: Python 2.5.1 needs the OpenSSL libraries under the names libssl.so.4 and libcrypto.so.4 (and also libc.so.6, which is supplied by libc), but openssl-0.9.8a only supplies libssl.so.6 and libcrypto.so.6 in /lib/. The solution is as follows:
1. Log in as user “root”
2. cd /lib/
3. ln -s libssl.so.0.9.8e libssl.so.4
4. ln -s libcrypto.so.0.9.8e libcrypto.so.4
5. Check in Python: run python, then enter “import md5”. If there is no output, the bug is fixed.
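The final check can also be scripted. The sketch below assumes a Python with hashlib (2.5 or later; it is written for Python 3) and uses a hypothetical helper name, md5_available. It goes one step beyond the import by computing a digest.

```python
import hashlib


def md5_available():
    """Return True if this interpreter can construct an MD5 hash object."""
    try:
        hashlib.md5(b"")
        return True
    except (AttributeError, ValueError):
        # AttributeError: the constructor is missing from this build.
        # ValueError: the algorithm is disabled on this system.
        return False


ok = md5_available()
digest = hashlib.md5(b"abc").hexdigest() if ok else None
```

If the libraries are linked correctly, ok is True and digest is the well-known MD5 of "abc", 900150983cd24fb0d6963f7d28e17f72.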


Howto run a sub thread/process to monitor the system status in Python?

Posted on May 22, 2009. Filed under: Linux, Python |


 1. popen  or popen2

This starts the other process through /bin/sh, so the pid you get back may be the pid of /bin/sh rather than of the command itself.

Examples:

1.1 Without log

>>> import popen2, os
>>> cur = popen2.Popen4("vmstat -n 100")
>>> cur.pid
123

 

ps aux|grep vmstat

root       123    0.0  0.1   1808   564 pts/9    S+   10:57   0:00 vmstat -n 100

 

1.2 With log

>>> import popen2, os
>>> cur = popen2.Popen4("vmstat -n 100 > vmstat.log")
>>> cur.pid
124

 

ps aux|grep vmstat

root       124  0.1  0.2   4488   996 pts/9    S+   11:01   0:00 /bin/sh -c vmstat -n 100 > vmstat.log

root       125  0.0  0.1   1804   560 pts/9    S+   11:01   0:00 vmstat -n 100

 

>>> os.kill(cur.pid, 9)
>>> os.waitpid(cur.pid, 0)
(124, 9)

 

ps aux|grep vmstat

root       125  0.0  0.1   1804   560 pts/9    S+   11:01   0:00 vmstat -n 100

 

From these examples you can see that if you want the child process to log the command’s output, it is better not to use popen2: the pid you get belongs to /bin/sh, and killing it leaves vmstat running. Use the following method instead.

 

2. subprocess

 

2.1 Start the subprocess:

>>> import subprocess, os
>>> outlog = open(outputlogname, "w")
>>> errlog = open(errlogname, "w")
>>> try:
...     process = subprocess.Popen(["vmstat", "-n", "100"], stdout=outlog, stderr=errlog)
... except OSError as e:
...     print("The OSError is: %s" % e)
...     print("Maybe this command does not exist")
... except Exception as e2:
...     print(e2)
...
>>> process.pid
134

 

ps aux|grep vmstat

root       134  0.0  0.1   1804   560 pts/9    S+   11:01   0:00 vmstat -n 100

 

You can see here that there is no “/bin/sh” process. Popen has a parameter “shell” (True/False):

On Unix, with shell=False (default): In this case, the Popen class uses os.execvp() to execute the child program. args should normally be a sequence. A string will be treated as a sequence with the string as the only item (the program to execute).

On Unix, with shell=True: If args is a string, it specifies the command string to execute through the shell. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional shell arguments.

 

2.2 Kill the subprocess

>>> os.kill(process.pid, 9)
>>> os.waitpid(process.pid, 0)
(134, 9)
>>> outlog.close()
>>> errlog.close()
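Putting sections 2.1 and 2.2 together, a Python 3 sketch might look like the following. start_monitor and stop_monitor are hypothetical helper names, and so that the sketch runs on machines without vmstat, the demo command is a short Python one-liner standing in for ["vmstat", "-n", "100"].

```python
import os
import subprocess
import sys
import tempfile


def start_monitor(cmd, log_path):
    """Start cmd without a shell, sending stdout and stderr to log_path."""
    log = open(log_path, "w")
    try:
        # A list plus the default shell=False means no /bin/sh wrapper,
        # so proc.pid is the pid of the command itself.
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
    except OSError:
        log.close()  # e.g. the command does not exist
        raise
    return proc, log


def stop_monitor(proc, log):
    """Ask the child to stop, reap it, and close the log file."""
    proc.terminate()
    exit_code = proc.wait()
    log.close()
    return exit_code


# Stand-in for ["vmstat", "-n", "100"].
demo_cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
log_path = os.path.join(tempfile.mkdtemp(), "monitor.log")
proc, log = start_monitor(demo_cmd, log_path)
exit_code = stop_monitor(proc, log)
```

Popen.terminate() and Popen.wait() replace the os.kill/os.waitpid pair; on Unix terminate() sends SIGTERM rather than signal 9, which gives the child a chance to exit cleanly.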

 

 

 

Reference:

http://mail.python.org/pipermail/python-list/1999-May/002214.html

http://docs.python.org/library/subprocess.html

http://docs.python.org/library/popen2.html

http://mail.python.org/pipermail/python-list/2004-May/260937.html

http://code.activestate.com/recipes/496960/

http://www.daniweb.com/forums/thread36752.html#

http://groups.google.com/group/comp.lang.python/msg/9fa3a3c287e8e2a3?hl=en&

http://code.activestate.com/recipes/52296/


Python http upload script

Posted on May 15, 2009. Filed under: Python |


The httplib module and some useful methods

 Note   The httplib module has been renamed to http.client in Python 3.0. The 2to3 tool will automatically adapt imports when converting your sources to 3.0.

class httplib.HTTPConnection(host[, port[, strict[, timeout]]])

 HTTPConnection.request(method, url[, body[, headers]])

 HTTPConnection.getresponse()

Note   You must have read the whole response before you can send a new request to the server.

 HTTPConnection.set_debuglevel(level)

Set the debugging level (the amount of debugging output printed). The default debug level is 0, meaning no debugging output is printed.

HTTPConnection.connect()

Connect to the server specified when the object was created.

HTTPConnection.close()

Close the connection to the server.

As an alternative to using the request() method described above, you can also send your request step by step, by using the four functions below.

HTTPConnection.putrequest(request, selector[, skip_host[, skip_accept_encoding]])

This should be the first call after the connection to the server has been made. It sends a line to the server consisting of the request string, the selector string, and the HTTP version (HTTP/1.1). To disable automatic sending of Host: or Accept-Encoding: headers (for example to accept additional content encodings), specify skip_host or skip_accept_encoding with non-False values.

Changed in version 2.4: skip_accept_encoding argument added.

HTTPConnection.putheader(header, argument[, ...])

Send an RFC 822-style header to the server. It sends a line to the server consisting of the header, a colon and a space, and the first argument. If more arguments are given, continuation lines are sent, each consisting of a tab and an argument.

HTTPConnection.endheaders()

Send a blank line to the server, signalling the end of the headers.

HTTPConnection.send(data)

Send data to the server. This should be used directly only after the endheaders() method has been called and before getresponse() is called.

 

 Example of upload file by PUT

 import httplib

body = "Hello world, I am uploading this text"

conn = httplib.HTTPConnection("192.168.1.1")
conn.set_debuglevel(10)
conn.putrequest('PUT', 'http://192.168.1.1/cgi-bin/upload.php')
# Content-Length must match the number of bytes actually sent. If it is
# larger, the server keeps waiting for more data and the script hangs;
# if it is smaller, upload.php only reads the first Content-Length bytes.
conn.putheader("Content-Length", str(len(body)))
conn.endheaders()
conn.send(body)

resps = conn.getresponse()
data = resps.read()
print "response is the webpage which upload.php shows after PUT upload:", data
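For Python 3, where httplib is named http.client, the same step-by-step PUT might be sketched as below. To keep the sketch self-contained it talks to a tiny local server defined in the same block; EchoPutHandler, put_bytes and the /upload path are hypothetical stand-ins for upload.php, not part of the original setup.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPutHandler(BaseHTTPRequestHandler):
    """Stand-in for upload.php: replies with the number of bytes received."""

    def do_PUT(self):
        length = int(self.headers["Content-Length"])
        received = self.rfile.read(length)
        reply = ("received %d bytes" % len(received)).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # keep the demo quiet


def put_bytes(host, port, path, body):
    """Send body with PUT, step by step, and return (status, response body)."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.putrequest("PUT", path)
        # Derive Content-Length from the body instead of hard-coding it.
        conn.putheader("Content-Length", str(len(body)))
        conn.endheaders()
        conn.send(body)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), EchoPutHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status, data = put_bytes("127.0.0.1", server.server_port, "/upload", b"Hello world")
server.shutdown()
server.server_close()
```

Because the header is computed from the body, the hang and truncation cases caused by a mismatched Content-Length cannot occur.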

 

References:

httplib — HTTP protocol client

http://docs.python.org/library/httplib.html

 urllib — Open arbitrary resources by URL

http://www.python.org/doc/2.6/library/urllib.html

Http client to POST using multipart/form-data

http://code.activestate.com/recipes/146306/

Big File Upload

http://webpython.codepoint.net/cgi_big_file_upload

httplib HTTPConnection request problem

http://bytes.com/groups/python/30065-httplib-httpconnection-request-problem

http://mail.python.org/pipermail/python-list/2004-July/272313.html

PyCURL interface – Uploading large binary files

http://curl.haxx.se/mail/archive-2004-02/0043.html

http://pycurl.sourceforge.net/

urlgrabber

http://linux.duke.edu/projects/urlgrabber/

http://linux.duke.edu/projects/urlgrabber/comparison.ptml

HTTP Upload — An Overview

http://www.chilkatsoft.com/p/p_200.asp

HTTP Upload using a Proxy Server

http://www.example-code.com/python/upload_proxy_server.asp


How To Use Linux epoll with Python

Posted on May 5, 2009. Filed under: Linux, Python

Introduction

As of version 2.6, Python includes an API for accessing the Linux epoll library. This article uses Python 3 examples to briefly demonstrate the API. Questions and feedback are welcome.

Blocking Socket Programming Examples

Example 1 is a simple Python 3.0 server that listens on port 8080 for an HTTP request message, prints it to the console, and sends an HTTP response message back to the client.

  • Line 9: Create the server socket.
  • Line 10: Permits the bind() in line 11 even if another program was recently listening on the same port. Otherwise this program could not run until a minute or two after the previous program using that port had finished.
  • Line 11: Bind the server socket to port 8080 of all available IPv4 addresses on this machine.
  • Line 12: Tell the server socket to start accepting incoming connections from clients.
  • Line 14: The program will stop here until a connection is received. When this happens, the server socket will create a new socket on this machine that is used to talk to the client. This new socket is represented by the connectiontoclient object returned from the accept() call. The address object indicates the IP address and port number at the other end of the connection.
  • Lines 15-17: Assemble the data being transmitted by the client until a complete HTTP request has been transmitted. The HTTP protocol is described at HTTP Made Easy.
  • Line 18: Print the request to the console, in order to verify correct operation.
  • Line 19: Send the response to the client.
  • Lines 20-22: Close the connection to the client as well as the listening server socket.

The official HOWTO has a more detailed description of socket programming with Python.

Example 1 (All examples use Python 3)

 1  import socket
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13
14  connectiontoclient, address = serversocket.accept()
15  request = b''
16  while EOL1 not in request and EOL2 not in request:
17     request += connectiontoclient.recv(1024)
18  print(request.decode())
19  connectiontoclient.send(response)
20  connectiontoclient.close()
21
22  serversocket.close()

Example 2 adds a loop in line 15 to repeatedly process client connections until interrupted by the user (e.g. with a keyboard interrupt). This illustrates more clearly that the server socket is never used to exchange data with the client. Rather, it accepts a connection from a client, and then creates a new socket on the server machine that is used to communicate with the client.

The finally statement block in lines 23-24 ensures that the listening server socket is always closed, even if an exception occurs.

Example 2

 1  import socket
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13
14  try:
15     while True:
16        connectiontoclient, address = serversocket.accept()
17        request = b''
18        while EOL1 not in request and EOL2 not in request:
19            request += connectiontoclient.recv(1024)
20        print('-'*40 + '\n' + request.decode()[:-2])
21        connectiontoclient.send(response)
22        connectiontoclient.close()
23  finally:
24     serversocket.close()

Benefits of Asynchronous Sockets and Linux epoll

The sockets shown in Example 2 are called blocking sockets, because the Python program stops running until an event occurs. The accept() call in line 16 blocks until a connection has been received from a client. The recv() call in line 19 blocks until data has been received from the client (or until there is no more data to receive). The send() call in line 21 blocks until all of the data being returned to the client has been queued by Linux in preparation for transmission.

When a program uses blocking sockets it often uses one thread (or even a dedicated process) to carry out the communication on each of those sockets. The main program thread will contain the listening server socket which accepts incoming connections from clients. It will accept these connections one at a time, passing the newly created socket off to a separate thread which will then interact with the client. Because each of these threads only communicates with one client, it is ok if it is blocked from proceeding at certain points. This blockage does not prohibit any of the other threads from carrying out their respective tasks.
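The thread-per-connection arrangement described above can be sketched as follows. This is an illustrative variation on Example 2 rather than code from the article: handle_client and serve are hypothetical names, the server binds to a free local port instead of 8080, and it accepts a single client so that the sketch terminates on its own.

```python
import socket
import threading

RESPONSE = (b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n"
            b"Content-Length: 13\r\n\r\nHello, world!")


def handle_client(conn):
    """Runs in its own thread; blocking here does not stall other clients."""
    with conn:
        request = b""
        while b"\n\n" not in request and b"\n\r\n" not in request:
            chunk = conn.recv(1024)
            if not chunk:
                return  # client went away before finishing the request
            request += chunk
        conn.sendall(RESPONSE)


def serve(server_socket, max_clients):
    """Accept connections one at a time and hand each to a new thread."""
    for _ in range(max_clients):
        conn, _address = server_socket.accept()
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()


server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_socket.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server_socket.listen(5)
port = server_socket.getsockname()[1]
acceptor = threading.Thread(target=serve, args=(server_socket, 1), daemon=True)
acceptor.start()

# A client, to show the round trip.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
client.sendall(b"GET / HTTP/1.0\r\n\r\n")
reply = b""
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    reply += chunk
client.close()
acceptor.join(timeout=5)
server_socket.close()
```

Only the accepting thread touches the listening socket, and each worker thread owns exactly one client socket.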

The use of blocking sockets with multiple threads results in straightforward code, but comes with a number of drawbacks. It can be difficult to ensure the threads cooperate appropriately when sharing resources. And this style of programming can be less efficient on computers with only one CPU.

The C10K Problem discusses some of the alternatives for handling multiple concurrent sockets. One is the use of asynchronous sockets. These sockets don’t block until some event occurs. Instead, the program performs an action on an asynchronous socket and is immediately notified as to whether that action succeeded or failed. This information allows the program to decide how to proceed. Since asynchronous sockets are non-blocking, there is no need for multiple threads of execution. All work may be done in a single thread. This single-threaded approach comes with its own challenges, but can be a good choice for many programs. It can also be combined with the multi-threaded approach: asynchronous sockets using a single thread can be used for the networking component of a server, and threads can be used to access other blocking resources, e.g. databases.
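A minimal illustration of that “immediately notified” behavior, assuming Python 3. A connected socket pair stands in for a client and server connection, and the variable names are chosen only for this sketch.

```python
import socket

# Two already-connected sockets in the same process.
a, b = socket.socketpair()
b.setblocking(False)

try:
    b.recv(1024)
    first_attempt = "data"
except BlockingIOError:
    # Nothing to read yet: the call returns at once instead of waiting.
    first_attempt = "would block"

a.send(b"hi")
second_attempt = b.recv(1024)  # data is now queued, so this succeeds

a.close()
b.close()
```

With a blocking socket the first recv() would simply wait. In non-blocking mode the program learns straight away that no data is available and can go on to service other sockets.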

Linux 2.6 has a number of mechanisms for managing asynchronous sockets, three of which are exposed by the Python APIs select, poll and epoll. epoll and poll are better than select because the Python program does not have to inspect each socket for events of interest. Instead it can rely on the operating system to tell it which sockets may have these events. And epoll is better than poll because it does not require the operating system to inspect all sockets for events of interest each time it is queried by the Python program. Rather Linux tracks these events as they occur, and returns a list when queried by Python. So epoll is a more efficient and scalable mechanism for large numbers (thousands) of concurrent socket connections, as shown in these graphs.

Asynchronous Socket Programming Examples with epoll

Programs using epoll often perform actions in this sequence:

  1. Create an epoll object
  2. Tell the epoll object to monitor specific events on specific sockets
  3. Ask the epoll object which sockets may have had the specified event since the last query
  4. Perform some action on those sockets
  5. Tell the epoll object to modify the list of sockets and/or events to monitor
  6. Repeat steps 3 through 5 until finished
  7. Destroy the epoll object
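As a rough sketch of that sequence, separate from the article’s own examples, the following uses a connected socket pair in place of real network clients. It assumes Linux, since select.epoll is not available on other platforms, and it performs each step once rather than looping.

```python
import select
import socket

a, b = socket.socketpair()
b.setblocking(False)

epoll = select.epoll()                               # step 1: create
try:
    epoll.register(b.fileno(), select.EPOLLIN)       # step 2: monitor reads on b
    a.send(b"ping")                                  # make b readable
    events = epoll.poll(1)                           # step 3: wait up to 1 second
    received = b""
    for fileno, event in events:
        if fileno == b.fileno() and event & select.EPOLLIN:
            received = b.recv(1024)                  # step 4: act on the socket
    epoll.unregister(b.fileno())                     # step 5: stop monitoring b
finally:
    epoll.close()                                    # step 7: destroy
    a.close()
    b.close()
```

A real server would repeat steps 3 through 5 inside a loop and would switch a connection between EPOLLIN and EPOLLOUT with epoll.modify() as it moves from reading a request to writing the response.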

Example 3 duplicates the functionality of Example 2 while using asynchronous sockets. The program is more complex because a single thread is interleaving the communication with multiple clients.

  • Line 1: The select module contains the epoll functionality.
  • Line 13: Since sockets are blocking by default, this is necessary to use non-blocking (asynchronous) mode.
  • Line 15: Create an epoll object.
  • Line 16: Register interest in read events on the server socket. A read event will occur any time the server socket accepts a socket connection.
  • Line 19: The connection dictionary maps file descriptors (integers) to their corresponding network connection objects.
  • Line 21: Query the epoll object to find out if any events of interest may have occurred. The parameter “1” signifies that we are willing to wait up to one second for such an event to occur. If any events of interest occurred prior to this query, the query will return immediately with a list of those events.
  • Line 22: Events are returned as a sequence of (fileno, event code) tuples. fileno is a synonym for file descriptor and is always an integer.
  • Line 23: If a read event occurred on the server socket, then a new socket connection may have been created.
  • Line 25: Set new socket to non-blocking mode.
  • Line 26: Register interest in read (EPOLLIN) events for the new socket.
  • Line 31: If a read event occurred then read new data sent from the client.
  • Line 33: Once the complete request has been received, then unregister interest in read events and register interest in write (EPOLLOUT) events. Write events will occur when it is possible to send response data back to the client.
  • Line 34: Print the complete request, demonstrating that although communication with clients is interleaved this data can be assembled and processed as a whole message.
  • Line 35: If a write event occurred on a client socket, it’s able to accept new data to send to the client.
  • Lines 36-38: Send the response data a bit at a time until the complete response has been delivered to the operating system for transmission.
  • Line 39: Once the complete response has been sent, disable interest in further read or write events.
  • Line 40: A socket shutdown is optional if a connection is closed explicitly. This example program uses it in order to cause the client to shutdown first. The shutdown call informs the client socket that no more data should be sent or received and will cause a well-behaved client to close the socket connection from its end.
  • Line 41: The HUP (hang-up) event indicates that the client socket has been disconnected (i.e. closed), so this end is closed as well. There is no need to register interest in HUP events. They are always indicated on sockets that are registered with the epoll object.
  • Line 42: Unregister interest in this socket connection.
  • Line 43: Close the socket connection.
  • Lines 18-45: The try-finally block is included because the example program will most likely be interrupted by a KeyboardInterrupt exception.
  • Lines 46-48: Open socket connections don’t need to be closed since Python will close them when the program terminates. They’re included as a matter of good form.
Example 3

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14
15  epoll = select.epoll()
16  epoll.register(serversocket.fileno(), select.EPOLLIN)
17
18  try:
19     connections = {}; requests = {}; responses = {}
20     while True:
21        events = epoll.poll(1)
22        for fileno, event in events:
23           if fileno == serversocket.fileno():
24              connection, address = serversocket.accept()
25              connection.setblocking(0)
26              epoll.register(connection.fileno(), select.EPOLLIN)
27              connections[connection.fileno()] = connection
28              requests[connection.fileno()] = b''
29              responses[connection.fileno()] = response
30           elif event & select.EPOLLIN:
31              requests[fileno] += connections[fileno].recv(1024)
32              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
33                 epoll.modify(fileno, select.EPOLLOUT)
34                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
35           elif event & select.EPOLLOUT:
36              byteswritten = connections[fileno].send(responses[fileno])
37              responses[fileno] = responses[fileno][byteswritten:]
38              if len(responses[fileno]) == 0:
39                 epoll.modify(fileno, 0)
40                 connections[fileno].shutdown(socket.SHUT_RDWR)
41           elif event & select.EPOLLHUP:
42              epoll.unregister(fileno)
43              connections[fileno].close()
44              del connections[fileno]
45  finally:
46     epoll.unregister(serversocket.fileno())
47     epoll.close()
48     serversocket.close()

epoll has two modes of operation, called edge-triggered and level-triggered. In the edge-triggered mode of operation a call to epoll.poll() will return an event on a socket only once after the read or write event occurred on that socket. The calling program must process all of the data associated with that event without further notifications on subsequent calls to epoll.poll(). When the data from a particular event is exhausted, additional attempts to operate on the socket will cause an exception. Conversely, in the level-triggered mode of operation, repeated calls to epoll.poll() will result in repeated notifications of the event of interest, until all data associated with that event has been processed. No exceptions normally occur in level-triggered mode.

For example, suppose a server socket has been registered with an epoll object for read events. In edge-triggered mode the program would need to accept() new socket connections until a socket.error exception occurs. In level-triggered mode, by contrast, a single accept() call can be made and the epoll object queried again; it will keep reporting events on the server socket for as long as additional calls to accept() should be made.
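The same drain-until-exception pattern applies to reading data from a non-blocking client socket. The following is a minimal sketch, not part of the original examples: the helper name drain() is hypothetical, and it catches BlockingIOError, which in Python 3 is a subclass of the socket.error that Example 4 catches. It uses socket.socketpair() so it can run without a network connection.

```python
import socket

def drain(sock, chunk_size=1024):
    """Read from a non-blocking socket until no more data is ready."""
    data = b''
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            # Nothing left to read right now; an edge-triggered program
            # would go back to epoll.poll() and wait for the next event.
            break
        if not chunk:
            # An empty result means the peer closed the connection.
            break
        data += chunk
    return data

# Demonstration with a connected pair of local sockets.
left, right = socket.socketpair()
right.setblocking(False)
left.sendall(b'x' * 3000)
received = drain(right)
print(len(received))
left.close()
right.close()
```

In level-triggered mode a single recv() per event would be enough, because epoll.poll() would report the socket again while unread data remains.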

Example 3 used level-triggered mode, which is the default mode of operation. Example 4 demonstrates how to use edge-triggered mode. In Example 4, lines 25, 36 and 45 introduce loops that run until an exception occurs (or all data is otherwise known to be handled). Lines 32, 38 and 48 catch the expected socket exceptions. Finally, lines 16, 28, 41 and 51 add the EPOLLET mask which is used to set edge-triggered mode.

Example 4

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14
15  epoll = select.epoll()
16  epoll.register(serversocket.fileno(), select.EPOLLIN | select.EPOLLET)
17
18  try:
19     connections = {}; requests = {}; responses = {}
20     while True:
21        events = epoll.poll(1)
22        for fileno, event in events:
23           if fileno == serversocket.fileno():
24              try:
25                 while True:
26                    connection, address = serversocket.accept()
27                    connection.setblocking(0)
28                    epoll.register(connection.fileno(), select.EPOLLIN | select.EPOLLET)
29                    connections[connection.fileno()] = connection
30                    requests[connection.fileno()] = b''
31                    responses[connection.fileno()] = response
32              except socket.error:
33                 pass
34           elif event & select.EPOLLIN:
35              try:
36                 while True:
37                    requests[fileno] += connections[fileno].recv(1024)
38              except socket.error:
39                 pass
40              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
41                 epoll.modify(fileno, select.EPOLLOUT | select.EPOLLET)
42                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
43           elif event & select.EPOLLOUT:
44              try:
45                 while len(responses[fileno]) > 0:
46                    byteswritten = connections[fileno].send(responses[fileno])
47                    responses[fileno] = responses[fileno][byteswritten:]
48              except socket.error:
49                 pass
50              if len(responses[fileno]) == 0:
51                 epoll.modify(fileno, select.EPOLLET)
52                 connections[fileno].shutdown(socket.SHUT_RDWR)
53           elif event & select.EPOLLHUP:
54              epoll.unregister(fileno)
55              connections[fileno].close()
56              del connections[fileno]
57  finally:
58     epoll.unregister(serversocket.fileno())
59     epoll.close()
60     serversocket.close()

Because level-triggered mode behaves much like the select and poll mechanisms, it is often used when porting an application that was using them, while edge-triggered mode may be used when the programmer doesn’t need or want as much assistance from the operating system in managing event state.

In addition to these two modes of operation, sockets may also be registered with the epoll object using the EPOLLONESHOT event mask. When this option is used, the registration is good for only one event: after epoll.poll() reports an event for the socket, the socket is disabled (it stays registered but produces no further events) until it is re-armed with epoll.modify().
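A small sketch of the one-shot behaviour follows. It is an illustration only, not taken from the original examples: the function name oneshot_demo() is hypothetical, it uses socket.socketpair() in place of a real client connection, and it requires Linux because select.epoll is only available there.

```python
import select
import socket

def oneshot_demo():
    """Return how many events three successive polls report."""
    sender, receiver = socket.socketpair()
    epoll = select.epoll()
    try:
        mask = select.EPOLLIN | select.EPOLLONESHOT
        epoll.register(receiver.fileno(), mask)
        sender.send(b'ping')

        first = epoll.poll(1)      # the read event is delivered once
        second = epoll.poll(0.1)   # nothing, although the data is still unread
        epoll.modify(receiver.fileno(), mask)   # re-arm the registration
        third = epoll.poll(1)      # the unread data is reported again
        return len(first), len(second), len(third)
    finally:
        epoll.close()
        sender.close()
        receiver.close()

print(oneshot_demo())
```

One-shot registration can be convenient when events are handed to worker threads, since a socket will not be reported a second time while the first event is still being handled.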

Performance Considerations

Listen Backlog Queue Size

In Examples 1-4, line 12 calls the serversocket.listen() method. The parameter for this method is the listen backlog queue size. It tells the operating system how many TCP/IP connections to hold on the backlog queue before they are accepted by the Python program. Each time the Python program calls accept() on the server socket, one of the connections is removed from the queue and that slot can be used for another incoming connection. If the queue is full, new incoming connections are silently ignored, causing unnecessary delays on the client side of the network connection. A production server usually handles tens or hundreds of simultaneous connections, so a value of 1 will usually be inadequate. For example, when using ab to perform load testing against these sample programs with 100 concurrent HTTP 1.0 clients, any backlog value less than 50 would often produce performance degradation.
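As a rough sketch, the server socket setup from the examples could be wrapped in a helper that takes the backlog as a parameter. The helper name make_listener() and the default of 128 are assumptions chosen for illustration, not values from the load test above; on Linux the kernel may also cap the value (see the net.core.somaxconn setting).

```python
import socket

def make_listener(host='127.0.0.1', port=0, backlog=128):
    """Create a non-blocking listening socket with an explicit backlog.

    port=0 asks the operating system to pick a free port; backlog=128
    is an illustrative default, not a tuned recommendation.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    sock.setblocking(False)
    return sock

listener = make_listener()
print(listener.getsockname())
listener.close()
```

The right value is best found by load testing, as described above, rather than by guessing.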

TCP Options

The TCP_CORK option can be used to “bottle up” messages until they are ready to send. This option, illustrated in lines 34 and 40 of Example 5, might be a good option to use for an HTTP server using HTTP/1.1 pipelining.

Example 5

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14
15  epoll = select.epoll()
16  epoll.register(serversocket.fileno(), select.EPOLLIN)
17
18  try:
19     connections = {}; requests = {}; responses = {}
20     while True:
21        events = epoll.poll(1)
22        for fileno, event in events:
23           if fileno == serversocket.fileno():
24              connection, address = serversocket.accept()
25              connection.setblocking(0)
26              epoll.register(connection.fileno(), select.EPOLLIN)
27              connections[connection.fileno()] = connection
28              requests[connection.fileno()] = b''
29              responses[connection.fileno()] = response
30           elif event & select.EPOLLIN:
31              requests[fileno] += connections[fileno].recv(1024)
32              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
33                 epoll.modify(fileno, select.EPOLLOUT)
34                 connections[fileno].setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
35                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
36           elif event & select.EPOLLOUT:
37              byteswritten = connections[fileno].send(responses[fileno])
38              responses[fileno] = responses[fileno][byteswritten:]
39              if len(responses[fileno]) == 0:
40                 connections[fileno].setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)
41                 epoll.modify(fileno, 0)
42                 connections[fileno].shutdown(socket.SHUT_RDWR)
43           elif event & select.EPOLLHUP:
44              epoll.unregister(fileno)
45              connections[fileno].close()
46              del connections[fileno]
47  finally:
48     epoll.unregister(serversocket.fileno())
49     epoll.close()
50     serversocket.close()

On the other hand, the TCP_NODELAY option can be used to tell the operating system that any data passed to socket.send() should immediately be sent to the client without being buffered by the operating system. This option, illustrated in line 14 of Example 6, might be a good option to use for an SSH client or other “real-time” application.

Example 6

 1  import socket, select
 2
 3  EOL1 = b'\n\n'
 4  EOL2 = b'\n\r\n'
 5  response  = b'HTTP/1.0 200 OK\r\nDate: Mon, 1 Jan 1996 01:01:01 GMT\r\n'
 6  response += b'Content-Type: text/plain\r\nContent-Length: 13\r\n\r\n'
 7  response += b'Hello, world!'
 8
 9  serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
10  serversocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
11  serversocket.bind(('0.0.0.0', 8080))
12  serversocket.listen(1)
13  serversocket.setblocking(0)
14  serversocket.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
15
16  epoll = select.epoll()
17  epoll.register(serversocket.fileno(), select.EPOLLIN)
18
19  try:
20     connections = {}; requests = {}; responses = {}
21     while True:
22        events = epoll.poll(1)
23        for fileno, event in events:
24           if fileno == serversocket.fileno():
25              connection, address = serversocket.accept()
26              connection.setblocking(0)
27              epoll.register(connection.fileno(), select.EPOLLIN)
28              connections[connection.fileno()] = connection
29              requests[connection.fileno()] = b''
30              responses[connection.fileno()] = response
31           elif event & select.EPOLLIN:
32              requests[fileno] += connections[fileno].recv(1024)
33              if EOL1 in requests[fileno] or EOL2 in requests[fileno]:
34                 epoll.modify(fileno, select.EPOLLOUT)
35                 print('-'*40 + '\n' + requests[fileno].decode()[:-2])
36           elif event & select.EPOLLOUT:
37              byteswritten = connections[fileno].send(responses[fileno])
38              responses[fileno] = responses[fileno][byteswritten:]
39              if len(responses[fileno]) == 0:
40                 epoll.modify(fileno, 0)
41                 connections[fileno].shutdown(socket.SHUT_RDWR)
42           elif event & select.EPOLLHUP:
43              epoll.unregister(fileno)
44              connections[fileno].close()
45              del connections[fileno]
46  finally:
47     epoll.unregister(serversocket.fileno())
48     epoll.close()
49     serversocket.close()

Source Code

The examples on this page are in the public domain and available for download.

References:

http://scotdoyle.com/python-epoll-howto.html

http://mail.python.org/pipermail/python-dev/2003-August/037614.html

http://ionelmc.wordpress.com/2008/04/01/curious-difference-between-epoll-poll-kqueue-and-select/

http://ilab.cs.byu.edu/python/select/echoserver.html

http://docs.python.org/library/select.html


wxpython time related tips

Posted on February 23, 2009. Filed under: Linux, Python, Windows |

1. Converting between datetime objects, time tuples, strings and timestamps

from datetime import datetime
import time

# -------------------------------------------------
# conversions to strings
# -------------------------------------------------
# datetime object to string
dt_obj = datetime(2008, 11, 10, 17, 53, 59)
date_str = dt_obj.strftime("%Y-%m-%d %H:%M:%S")
print(date_str)

# time tuple to string
time_tuple = (2008, 11, 12, 13, 51, 18, 2, 317, 0)
date_str = time.strftime("%Y-%m-%d %H:%M:%S", time_tuple)
print(date_str)

# -------------------------------------------------
# conversions to datetime objects
# -------------------------------------------------
# time tuple to datetime object
time_tuple = (2008, 11, 12, 13, 51, 18, 2, 317, 0)
dt_obj = datetime(*time_tuple[0:6])
print(repr(dt_obj))

# date string to datetime object
date_str = "2008-11-10 17:53:59"
dt_obj = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S")
print(repr(dt_obj))

# timestamp to datetime object in local time
timestamp = 1226527167.595983
dt_obj = datetime.fromtimestamp(timestamp)
print(repr(dt_obj))

# timestamp to datetime object in UTC
timestamp = 1226527167.595983
dt_obj = datetime.utcfromtimestamp(timestamp)
print(repr(dt_obj))

# -------------------------------------------------
# conversions to time tuples
# -------------------------------------------------
# datetime object to time tuple
dt_obj = datetime(2008, 11, 10, 17, 53, 59)
time_tuple = dt_obj.timetuple()
print(repr(time_tuple))

# string to time tuple
date_str = "2008-11-10 17:53:59"
time_tuple = time.strptime(date_str, "%Y-%m-%d %H:%M:%S")
print(repr(time_tuple))

# timestamp to time tuple in UTC
timestamp = 1226527167.595983
time_tuple = time.gmtime(timestamp)
print(repr(time_tuple))

# timestamp to time tuple in local time
timestamp = 1226527167.595983
time_tuple = time.localtime(timestamp)
print(repr(time_tuple))

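# -------------------------------------------------
# conversions to timestamps
# -------------------------------------------------
# time tuple in local time to timestamp
time_tuple = (2008, 11, 12, 13, 59, 27, 2, 317, 0)
timestamp = time.mktime(time_tuple)
print(repr(timestamp))

# -------------------------------------------------
# results (local-time values depend on the machine's timezone; these are from UTC-8)
# -------------------------------------------------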
# 2008-11-10 17:53:59
# 2008-11-12 13:51:18
# datetime.datetime(2008, 11, 12, 13, 51, 18)
# datetime.datetime(2008, 11, 10, 17, 53, 59)
# datetime.datetime(2008, 11, 12, 13, 59, 27, 595983)
# datetime.datetime(2008, 11, 12, 21, 59, 27, 595983)
# (2008, 11, 10, 17, 53, 59, 0, 315, -1)
# (2008, 11, 10, 17, 53, 59, 0, 315, -1)
# (2008, 11, 12, 21, 59, 27, 2, 317, 0)
# (2008, 11, 12, 13, 59, 27, 2, 317, 0)
# 1226527167.0

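2. Converting between wx.DateTime and Python datetime

import datetime, time, wx

datetime.datetime.fromtimestamp(wx.DateTime.Now().GetTicks())           # wx.DateTime -> datetime
wx.DateTimeFromTimeT(time.mktime(datetime.datetime.now().timetuple()))  # datetime -> wx.DateTime

Note that GetTicks() has only one-second precision, so any sub-second part is lost in the conversion.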
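The loss of precision can be seen without wxPython by going through a whole-second timestamp in the same way. This is only a sketch of the idea: the helper name to_timestamp() is hypothetical and the datetime value is treated as naive local time.

```python
from datetime import datetime
import time

def to_timestamp(dt_obj):
    """Convert a naive local-time datetime to a whole-second timestamp."""
    # timetuple() carries no microseconds, so they are dropped here.
    return time.mktime(dt_obj.timetuple())

original = datetime(2008, 11, 10, 17, 53, 59, 123456)
restored = datetime.fromtimestamp(to_timestamp(original))

print(repr(original))
print(repr(restored))  # same date and time, microsecond field is 0
```

If sub-second precision matters, the microsecond value has to be carried separately and added back after the conversion.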

References:

http://www.saltycrane.com/blog/2008/11/python-datetime-time-conversions/

http://aspn.activestate.com/ASPN/Mail/Message/wxpython-users/3562592


Python Trouble Shooting

Posted on August 10, 2008. Filed under: MySQL, Python, Windows |

1. Getting data from MySQL

After calling mysql_connection.execute("some sql"), fetch a row with line = mysql_connection.fetchone(). The columns of the row are then accessed by position as line[0], line[1], line[2], ... (not as "key" -> "value" pairs).
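The positional access can be tried without a MySQL server. The sketch below swaps in the standard library's sqlite3 module, which follows the same Python DB-API as the MySQL drivers; the table and column names are made up for the illustration. It also shows one way to build a key/value mapping from cursor.description when that is what is wanted.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL connection.
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('CREATE TABLE users (id INTEGER, name TEXT)')
cursor.execute("INSERT INTO users VALUES (1, 'alice')")

cursor.execute('SELECT id, name FROM users')
line = cursor.fetchone()
print(line[0], line[1])   # columns are addressed by position

# cursor.description holds one entry per column; its first item is the name.
columns = [col[0] for col in cursor.description]
row = dict(zip(columns, line))
print(row['name'])

conn.close()
```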

2. In wxPython, call toolbar.Realize() after adding tools

If Realize() is not called after the tools have been added to the toolbar, the buttons will not be displayed.

toolbar = self.CreateToolBar()
toolbar.AddTool(ID_ABC, wx.Bitmap('icons/icon.png'))
self.Bind(wx.EVT_TOOL, self.OnChangeDepth, id=ID_ABC)
toolbar.Realize()    # without this call the tool buttons never appear

3. pytz.UnknownTimeZoneError: 'US/Central' after py2exe in Python 2.6

I noticed that the old version of pytz I was using compiled each timezone into a .pyc, and these would be included in the resulting library.zip for my programs. When I build against the new pytz, these files are no longer getting compiled to .pyc. Instead, when I check the pytz directory in library.zip, I see these files:
__init__.pyc
reference.pyc
tzfile.pyc
tzinfo.pyc

It appears that the zoneinfo directory is missing.

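Solution:
In the build.py/setup.py script used for py2exe, list pytz under "packages":

from distutils.core import setup
import py2exe

setup(
    console=['test.py'],
    options={
        'py2exe': {
            # include these packages together with their subpackages
            'packages': ['matplotlib', 'pytz'],
        }
    },
)

The "packages" entry under "options" is the important part.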

References:
http://www.py2exe.org/index.cgi/MatPlotLib
http://osdir.com/ml/python.py2exe/2004-10/msg00040.html
http://www.nabble.com/Python-2.6-%2B-Pytz-2009a-%2B–Py2exe-problem-tt22574634.html#a22574634

4. DeprecationWarning: the sets module is deprecated

There are two ways to silence this warning:
4.1. Run Python with the warning filtered out: python -W ignore::DeprecationWarning

4.2. In Python26\Lib\sets.py, comment out lines 83-85:
#import warnings
#warnings.warn("the sets module is deprecated", DeprecationWarning,stacklevel=2)
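A third possibility, not mentioned in the original tip, is to install a filter from within the program using the standard warnings module, which avoids editing the standard library. The sketch below is an assumption about how that might look: the helper name ignore_sets_warning() is hypothetical, and the two warnings.warn() calls only simulate what the deprecated import would trigger.

```python
import warnings

def ignore_sets_warning():
    """Filter out only the 'sets module is deprecated' warning."""
    warnings.filterwarnings(
        'ignore',
        message='the sets module is deprecated',
        category=DeprecationWarning,
    )

# Demonstration: record warnings so the effect of the filter is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    ignore_sets_warning()
    warnings.warn('the sets module is deprecated', DeprecationWarning)  # filtered out
    warnings.warn('some other deprecation', DeprecationWarning)         # still reported

print([str(w.message) for w in caught])
```

In a real program the filter would need to be installed before the import that triggers the warning.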

5. Socket timeout in xmlrpclib

import xmlrpclib
import socket
socket.setdefaulttimeout(10)        #set the timeout to 10 seconds
x = xmlrpclib.ServerProxy('http://1.2.3.4')

x.func_name(args)                   #times out after 10 seconds
socket.setdefaulttimeout(None)      #sets the default back
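Because socket.setdefaulttimeout() changes a process-wide setting, it can be tidier to restore whatever value was in effect before rather than always resetting to None. One way to sketch this is a small context manager; the name default_timeout() is hypothetical and not part of xmlrpclib or the socket module.

```python
import contextlib
import socket

@contextlib.contextmanager
def default_timeout(seconds):
    """Temporarily change the default socket timeout, then restore it."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore the earlier value even if the wrapped call raised.
        socket.setdefaulttimeout(previous)

before = socket.getdefaulttimeout()
with default_timeout(10):
    inside = socket.getdefaulttimeout()
after = socket.getdefaulttimeout()
print(inside, after == before)
```

With this helper, the remote call (x.func_name(args) in the snippet above) would go inside the with block, since the default timeout applies to sockets created while it is set.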

Reference:

http://stackoverflow.com/questions/372365/set-timeout-for-xmlrpclib-serverproxy

http://code.activestate.com/

