I'm new to Python (though I have been programming in Java for several years), and I am working on a simple socket-based networking application (just for fun). The idea is that my code connects to a remote TCP endpoint, listens for any data pushed from the server to the client, and performs some parsing on it.

The data pushed from server to client is UTF-8 encoded text, and each line is delimited by CRLF (\x0D\x0A). As you probably guessed, the client stays connected to the server (until cancelled by the user) and reads and parses the lines as they come in.

I've managed to get this to work, but I'm not sure I'm doing it quite the right way. Hence my actual questions (code to follow):

  1. Is this the right way to do it in Python (i.e. is it really this simple)?
  2. Any tips/tricks/useful resources (apart from the reference documentation) regarding buffers/asyncore?

Currently, the data is being read and buffered as follows:

def handle_read(self):
    self.ibuffer = b""

    while True:
        self.ibuffer += self.recv(self.buffer_size)
        if ByteUtils.ends_with_crlf(self.ibuffer):
            self.logger.debug("Got full line including CRLF")
            break
        else:
            self.logger.debug("Buffer not full yet (%s)", self.ibuffer)

    self.logger.debug("Filled up the buffer with line")
    print(str(self.ibuffer, encoding="UTF-8"))

The ByteUtils.ends_with_crlf function simply checks whether the last two bytes of the buffer are \x0D\x0A. The first question is the main one (the accepted answer will be based on it), but any other ideas/tips are appreciated. Thanks.

+4  A: 

TCP is a stream, so there is no guarantee that a single read will not contain the end of one message and the beginning of the next. Checking for \r\n only at the end of the buffer will therefore not work as expected in all situations. You have to scan the whole buffer for the terminator, not just its last two bytes.
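
To make that concrete, here is a minimal sketch of buffering across reads and splitting on every CRLF, not just a trailing one. LineBuffer and feed are hypothetical names made up for illustration; they are not part of asyncore or of the code in the question. The sketch assumes UTF-8 text, as described in the question, and decodes only complete lines so that a multi-byte character split across two reads is not decoded early.

```python
class LineBuffer:
    """Accumulates raw bytes and hands back complete CRLF-terminated lines.

    Hypothetical helper for illustration only.
    """

    def __init__(self, terminator=b"\r\n"):
        self._terminator = terminator
        self._pending = b""  # bytes received so far that do not yet form a full line

    def feed(self, data):
        """Add one chunk (e.g. the result of a single recv call).

        Returns a list of decoded lines, which may be empty or hold several lines.
        """
        self._pending += data
        # Everything before the last terminator is complete; the final piece
        # (possibly empty) is kept for the next call.
        *complete, self._pending = self._pending.split(self._terminator)
        return [line.decode("utf-8") for line in complete]
```

In a handler like the one in the question, the idea would be to create one such buffer per connection, call recv once per handle_read, and pass the result to feed, rather than resetting the buffer and looping until it happens to end with CRLF.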

I would also strongly recommend that you use Twisted instead of asyncore. Something like this (from memory, might not work out of the box):

from twisted.internet import reactor, protocol
from twisted.protocols.basic import LineReceiver


class MyHandler(LineReceiver):

    def lineReceived(self, line):
        # Called once per complete line, with the CRLF delimiter already stripped.
        print("Got line:", line)


f = protocol.ClientFactory()
f.protocol = MyHandler
reactor.connectTCP("127.0.0.1", 4711, f)
reactor.run()
truppo
I know that the server sends "lines" which all end with CRLF, though, so I'm sure that at some point the buffer will be terminated (unless something goes wrong at the server end, which would lead to some form of buffer overflow in no time, I guess). I've read about Twisted, but it's not out for Python 3 yet AFAIK, and it would probably be overkill for this.
pHk
With your current code you are still at risk of getting multiple lines instead of one.
truppo
yay! +1 to twisted!
nosklo
I rarely recommend Twisted, but, yes, it is vastly better than the ancient asyncore framework!
Brandon Craig Rhodes
+4  A: 

It's even simpler -- look at asynchat and its set_terminator method (and other helpful tidbits in that module). Twisted is orders of magnitude richer and more powerful, but, for sufficiently simple tasks, asyncore and asynchat (which are designed to interoperate smoothly) are indeed very simple to use, as you've started observing.

Alex Martelli
I tried asynchat first, but ran into a bit of trouble which I couldn't solve straight away (something to do with the buffer), so I reverted to asyncore.
pHk
boo! -1 to asyncore/asynchat
nosklo
+1 on using asyncore/asynchat instead of Twisted for such simple tasks.
Denis Otkidach
@Denis: I think the twisted example above is **very simple** and straightforward, don't you?
nosklo