Saturday, August 21, 2010

Multiple event loops with asyncore and asynchat.async_chat and threads

I recently needed to support multiple concurrent event loops with asyncore.  I ran into a version-specific obstacle in asynchat and came up with a simple, clean workaround.


I was developing a protocol client library that needed to be asynchronous.  My code used asyncore and asynchat, and worked in a single-threaded environment.  Then my needs evolved, and the library had to support execution from multiple threads.  Unfortunately, the asyncore and asynchat modules in the Python Standard Library are not thread-safe.  When two threads enter the main event loop in asyncore.loop(), they collide spectacularly, causing spurious IO exceptions and corrupting asyncore's internal data structures.

I read the asyncore.py code and saw why it wasn't thread-safe.  The main event loop in asyncore.loop() uses a module-scoped dict() named socket_map to store the sockets that it then passes to select() or poll().  Without getting into the deep details of how asyncore works, it's enough to note that asyncore's use of its own private socket_map dict() is the source of the thread-safety problems.  When two threads both operate on this shared object, they quickly render it corrupt.


In Python 2.4, the asyncore.dispatcher class constructor gained a new optional map parameter.  It overrides the internal socket_map variable, letting you supply your own dict() object.  If each thread creates its own dict() and passes it to every asyncore.dispatcher instance it owns, no two threads ever touch the same map.  The module-level asyncore.loop() function also takes an optional map parameter, so multiple threads can safely run their own individual event loops.  You must still protect any other data your threads share, but that is easily handled with threading.Lock and other tools from the standard threading module.
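As a rough sketch of the per-thread map idea, the snippet below starts two threads, each with its own dict() and its own asyncore.loop() call.  The names worker, Reader and run_two_loops are hypothetical, and socket.socketpair() is only a stand-in for a real network connection.  The channel class is defined inside the function purely so the sketch still loads on interpreters where asyncore is not available.

```python
import socket
import threading

try:
    import asyncore
except ImportError:        # asyncore left the standard library in Python 3.12
    asyncore = None


def worker(name, results):
    """Register one channel in a private map and loop over that map only."""
    private_map = {}                        # owned by this thread alone

    class Reader(asyncore.dispatcher):      # hypothetical minimal channel
        def writable(self):
            return False                    # this sketch never sends

        def handle_read(self):
            results[name] = self.recv(64)
            self.close()                    # unregisters from private_map

    near, far = socket.socketpair()         # stand-in for a real connection
    Reader(near, map=private_map)           # registers in private_map only
    far.sendall(name.encode("ascii"))
    asyncore.loop(timeout=0.1, map=private_map)   # exits when the map is empty
    far.close()


def run_two_loops():
    """Run two independent event loops, one per thread."""
    results = {}
    threads = [threading.Thread(target=worker, args=(n, results))
               for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if asyncore is not None:
    results = run_two_loops()
    # Neither thread registered anything in the shared module-level map.
    shared_entries = len(asyncore.socket_map)
```

Each thread only ever reads and writes its own private_map, so the loops do not interfere with each other.  The results dict is shared, but each thread writes a different key; anything more elaborate would need a lock.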

This works great, even for derived classes of asyncore.dispatcher, like asynchat.async_chat, a class that provides a simplified interface and somewhat flexible input buffering features with its set_terminator() method and found_terminator() callback.


But there is one small problem that only affects Python environments earlier than 2.6.  The asynchat.async_chat class constructor doesn't implement the map parameter.  As a result, the asyncore.dispatcher constructor receives its default value of map=None, which tells it to use the shared module-level socket_map object.  Newer versions of async_chat in Python 2.6+ provide the map parameter, but unfortunately any code intended to run on stock RHEL 5 is stuck with Python 2.4.
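To make the fallback concrete, here is a toy model of the pattern.  It is not asyncore's actual code: Base, OldChat, FixedChat and shared_map are hypothetical stand-ins for asyncore.dispatcher, the pre-2.6 async_chat, a subclass that forwards the parameter, and asyncore.socket_map.

```python
shared_map = {}   # stand-in for the module-level asyncore.socket_map


class Base(object):
    """Stand-in for asyncore.dispatcher: registers itself in a map."""

    def __init__(self, fd, map=None):
        if map is None:
            map = shared_map      # default: fall back to the shared dict
        self._map = map
        map[fd] = self


class OldChat(Base):
    """Stand-in for the pre-2.6 async_chat: offers no way to pass a map."""

    def __init__(self, fd):
        Base.__init__(self, fd)   # map is never forwarded, so it stays None


class FixedChat(Base):
    """Stand-in for a subclass that forwards the map to the base class."""

    def __init__(self, fd, map=None):
        Base.__init__(self, fd, map)


private_map = {}
OldChat(3)                        # lands in shared_map
FixedChat(4, map=private_map)     # lands only in private_map
print("shared=%s private=%s" % (sorted(shared_map), sorted(private_map)))
```

The caller of OldChat has no way to keep the object out of the shared dict, which is exactly the situation on Python 2.4 and 2.5.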

The following code adds support for the map parameter to the asynchat.async_chat class by subclassing it as AsyncChat26:

import asyncore
import asynchat
import sys

class AsyncChat26(asynchat.async_chat):
    '''Work around asynchat.async_chat lacking a 'map' parameter before Python 2.6.'''
    def __init__(self, conn=None, map=None):
        if sys.version_info[:2] < (2, 6):
            # Python 2.4 and 2.5: repeat the buffer setup that the stock
            # async_chat constructor performs...
            self.ac_in_buffer = ''
            self.ac_out_buffer = ''
            self.producer_fifo = asynchat.fifo()
            # ...then call the dispatcher constructor directly so that
            # 'map' actually reaches it
            asyncore.dispatcher.__init__(self, conn, map)
        else:
            # Python 2.6+: async_chat already accepts 'map', so defer to it
            asynchat.async_chat.__init__(self, conn, map)


It works by simple subclassing.  AsyncChat26 overrides the asynchat.async_chat constructor to add the missing map parameter.  On Python 2.4 and 2.5 it repeats the three lines of buffer setup from the stock async_chat constructor, then bypasses that constructor and calls asyncore.dispatcher's constructor directly, passing map along.  On Python 2.6+, it simply delegates all work to the original constructor.

For code that subclasses asynchat.async_chat, you can simply use AsyncChat26 as the new parent class, and your class will support map whether it runs on Python 2.4 or Python 2.6+.  Multiple threads in Python 2.4 may now run separate asyncore.loop() main event loops without colliding, as long as each thread passes its own map.
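A minimal sketch of such a subclass might look like the following.  LineReader, build_line_reader and read_lines are hypothetical names, the socketpair is a stand-in for a real connection, and the AsyncChat26 constructor is repeated from the listing above only to keep the snippet self-contained.  The classes are built inside a function so the sketch still loads where asyncore and asynchat are unavailable.

```python
import socket
import sys

try:
    import asyncore
    import asynchat
except ImportError:      # both modules left the standard library in Python 3.12
    asyncore = asynchat = None

NEWLINE = "\n".encode("ascii")   # bytes on Python 3, plain str on Python 2
EMPTY = NEWLINE[:0]              # empty value of the same type, used for joining


def build_line_reader():
    """Return a hypothetical line-oriented channel built on AsyncChat26."""

    class AsyncChat26(asynchat.async_chat):
        def __init__(self, conn=None, map=None):
            if sys.version_info[:2] < (2, 6):
                self.ac_in_buffer = ''
                self.ac_out_buffer = ''
                self.producer_fifo = asynchat.fifo()
                asyncore.dispatcher.__init__(self, conn, map)
            else:
                asynchat.async_chat.__init__(self, conn, map)

    class LineReader(AsyncChat26):
        def __init__(self, conn, map, lines):
            AsyncChat26.__init__(self, conn, map)   # map works on 2.4 and 2.6+
            self.lines = lines
            self.chunks = []
            self.set_terminator(NEWLINE)

        def collect_incoming_data(self, data):
            self.chunks.append(data)                # partial line so far

        def found_terminator(self):
            self.lines.append(EMPTY.join(self.chunks))
            self.chunks = []

        def handle_close(self):
            self.close()                            # unregisters from its map

    return LineReader


def read_lines(payload):
    """Feed payload through a LineReader that lives in a private map."""
    LineReader = build_line_reader()
    private_map = {}
    lines = []
    near, far = socket.socketpair()                 # stand-in for a connection
    LineReader(near, private_map, lines)
    far.sendall(payload)
    far.close()                                     # EOF closes the channel
    asyncore.loop(timeout=0.1, map=private_map)     # exits when the map is empty
    return lines


if asyncore is not None:
    lines = read_lines("one\ntwo\n".encode("ascii"))
```

Because the map is an ordinary constructor argument, a thread can create one dict(), hand it to every channel it owns, and pass the same dict() to its asyncore.loop() call.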