



I have a Python script that is an HTTP server. When I benchmark it with ApacheBench (ab) at a concurrency level (-c switch) lower than or equal to the value I passed to socket.listen() in the source code, everything works fine. As soon as I raise the concurrency level in ab above the socket.listen() value, performance drops through the floor.

Nothing changes in the code between the two runs, and I can't figure out what is wrong; I've been at this problem for a day now. Also note that the multiplexing version of the same code (which I wrote to compare against the threaded version) works fine no matter what socket.listen() is set to or what the concurrency level (-c switch) in ab is set to.

I've spent a day on IRC and the Python docs, and posted on comp.lang.python and on my blog. I can't find ANYONE who even has an idea what could be wrong. Help me!


rcar: Yes, some performance hit, of course. But dropping from 1400 req/s to 32 req/s because I make ONE more request than the backlog can handle? By that logic the multiplexing server should suffer the same performance hit, since it can only accept one connection at a time. Also, I don't use a separate process to handle the connections in the multiplexing version; it's all done in the main loop. The multiplexing code is very similar to the threaded code.

Thanks for posting your other code too. I'll take a look in a sec. I still suspect it's blocking somewhere in the threaded version, though the retransmitted SYN packets described in the new answer would probably make sense too.
Yes, I've also thought of a block somewhere in the threaded version, but the thing is, I can't find it anywhere. All sockets are non-blocking, all threads are non-blocking (using .acquire(0)), and I even tried making the server socket non-blocking. No luck.

I found this article on the backlog in Tomcat / Java, which gives some interesting insight into the backlog:

for example, if all threads are busy in java handling requests, the kernel will handle SYN and TCP handshakes until its backlog is full. when the backlog is full, it will simply drop future SYN requests. it will not send a RST, ie causing "Connection refused" on the client, instead the client will assume the package was lost and retransmit the SYN. hopefully, the backlog queue will have cleared up by then.

As I interpret it, when you ask ab to create more simultaneous connections than your socket is configured to handle, packets get dropped rather than refused, and I do not know how ab handles that. It may be that the SYN is retransmitted, but possibly only after waiting a while. This may even be specified somewhere (the TCP protocol?).
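To see what happens on a given machine when the backlog overflows, here is a minimal sketch in Python 3 (the snippets elsewhere in this thread are Python 2). The name probe_backlog and its parameters are made up for illustration. It opens a listener that never calls accept() and counts how many client connects succeed before the rest time out or fail. The exact numbers depend on the operating system and kernel settings, so treat the result as an observation, not a rule.

```python
import socket


def probe_backlog(backlog=2, attempts=6, timeout=0.5):
    """Connect `attempts` clients to a listener that never calls accept().

    Returns (connected, failed). Hypothetical helper for illustration only.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(('127.0.0.1', 0))      # port 0: let the OS pick a free port
    server.listen(backlog)
    port = server.getsockname()[1]

    clients, connected, failed = [], 0, 0
    for _ in range(attempts):
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.settimeout(timeout)     # give up instead of waiting for SYN retries
        try:
            client.connect(('127.0.0.1', port))
            connected += 1
        except OSError:                # timeout (likely a dropped SYN) or refusal
            failed += 1
        clients.append(client)         # keep sockets open so the queue stays full

    for client in clients:
        client.close()
    server.close()
    return connected, failed
```

Calling probe_backlog() with different backlog values and comparing the two counts shows whether excess connections are refused immediately or left to time out on your system.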

As said, I do not know but I hope this hints at the cause.

Good luck!

A: Thanks for the article, it was a good read. But wouldn't the problem then show up in both the single-threaded multiplexing version and the multithreaded version of the code? It doesn't. I'm going crazy over this ;(

+7  A: 

I cannot confirm your results, and your server code looks fishy. I whipped up my own server and it does not have this problem. Let's move the discussion to a simpler level:

import thread, socket, Queue

connections = Queue.Queue()
num_threads = 10
backlog = 10

def request():
    while 1:
        conn = connections.get()
        data = ''
        while '\r\n\r\n' not in data:
            data += conn.recv(4048)
        conn.sendall('HTTP/1.1 200 OK\r\n\r\nHello World')
        conn.close()  # no Content-Length is sent, so closing marks the end of the response

if __name__ == '__main__':
    for _ in range(num_threads):
        thread.start_new_thread(request, ())

    acceptor = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    acceptor.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
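    acceptor.bind(('', 1234))
    acceptor.listen(backlog)  # queue up to `backlog` pending connections
    while 1:
        conn, addr = acceptor.accept()
        connections.put(conn)  # hand the connection to a worker thread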

which on my machine does:

ab -n 10000 -c 10 --> 8695.03 [#/sec]
ab -n 10000 -c 11 --> 8529.41 [#/sec]
Florian Bösch
Florian: Awesome. My server is coded so that each thread is itself multiplexing and able to handle an arbitrary number of connections asynchronously. But you did shine some light on how to structure it better, thanks a lot. I'll look into rewriting the code to match your example more closely.
Also, in my version each thread multiplexes within itself instead of reading one request in a blocking fashion as in yours. Could this have something to do with it?
That's an interesting question, I don't know. You could try removing that multiplexing from your version and test.
Florian Bösch

It looks like you're not really getting concurrency. Apparently, when you do socket.accept(), the main thread doesn't go immediately back to waiting for the next connection. Maybe your connection-handling thread is pure Python code, so you're getting serialized by the GIL (global interpreter lock).

If there isn't heavy communication between threads, it's better to use a multi-process scheme (with a pool of pre-spawned processes, of course).

Javier: I tried a multi-process scheme, but I had trouble passing the FD of the accepted connections over to the worker processes (in Python at least).
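One common way to sidestep FD passing, sketched here under some assumptions: create the listening socket first, then fork the workers so each one inherits it and calls accept() itself. The code below is Python 3 and Unix-only (it relies on os.fork); the names worker and prefork, the RESPONSE constant, and the max_requests limit are hypothetical and not taken from anyone's code in this thread.

```python
import os
import socket

RESPONSE = b'HTTP/1.1 200 OK\r\nContent-Length: 11\r\n\r\nHello World'


def worker(listener, max_requests):
    """Accept and answer connections on the inherited listening socket."""
    for _ in range(max_requests):
        conn, _addr = listener.accept()
        data = b''
        while b'\r\n\r\n' not in data:
            chunk = conn.recv(4096)
            if not chunk:              # client went away before finishing
                break
            data += chunk
        if b'\r\n\r\n' in data:        # only answer complete requests
            conn.sendall(RESPONSE)
        conn.close()


def prefork(listener, num_workers=4, max_requests=1000):
    """Fork workers that share `listener`; return the child PIDs."""
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:                   # child: serve, then exit immediately
            try:
                worker(listener, max_requests)
            finally:
                os._exit(0)
        pids.append(pid)               # parent: remember the child
    return pids
```

To use it, bind and listen() on a socket in the parent, call prefork(listener), and os.waitpid() on the returned PIDs. The kernel hands each incoming connection to whichever worker is blocked in accept(), so no descriptor ever has to cross a process boundary.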
+4  A: 

For the heck of it I also implemented an asynchronous version:

import socket, Queue, select

class Request(object):
    def __init__(self, conn):
        self.conn = conn
        self.fileno = conn.fileno
        self.perform = self._perform().next

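    def _perform(self):
        # Generator: every yield hands control back to the select() loop.
        data = self.conn.recv(4048)
        while '\r\n\r\n' not in data:
            yield
            msg = self.conn.recv(4048)
            if not msg:  # peer closed before sending a full request
                reading.remove(self)
                self.conn.close()
                return
            data += msg

        # Request complete: wait for writability instead of readability.
        reading.remove(self)
        writing.append(self)
        yield

        data = 'HTTP/1.1 200 OK\r\n\r\nHello World'
        while data:
            sent = self.conn.send(data)
            data = data[sent:]
            if data:
                yield

        writing.remove(self)
        self.conn.close()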

class Acceptor:
    def __init__(self):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(('', 1234))
        sock.listen(10)  # backlog of 10, matching the threaded example
        self.sock = sock
        self.fileno = sock.fileno

    def perform(self):
        conn, addr = self.sock.accept()
        reading.append(Request(conn))  # let select() watch the new connection

if __name__ == '__main__':
    reading = [Acceptor()]
    writing = list()

    while 1:
        readable, writable, error = select.select(reading, writing, [])
        for action in readable + writable:
            try: action.perform()
            except StopIteration: pass

which performs:

ab -n 10000 -c 10 --> 16822.13 [#/sec]
ab -n 10000 -c 11 --> 15704.41 [#/sec]
Florian Bösch
Nice, similar to my own async/multiplexing version but still different from my threaded one, since there are, well, no threads ;p. I did port the first code example you gave to a threaded and multiplexing one, and I get the same error. I'm thinking there's something weird going on between thread and select.

OK, so I ran the code on a totally different server (a VPS I got at Slicehost) and there was not a single problem; everything works as expected. So honestly I think something is wrong with my laptop now ;p

Thanks for everyone's help though!
