views: 56

answers: 2

For the first time I'm trying to implement a network protocol over TCP/IP. I've designed one, but I'm not sure if it's efficient.

Design

So here is my idea: after the client opens a TCP/IP connection to the server, every time it wants to make a request it first sends the size of the request, followed by some separator character (newline or space), and after that the actual request (the same principle is used in HTTP, and I think this idea is used in most cases).

For example, if the client wants to send GET ASD, it will actually send 7 GET ASD (assuming that a space is the separator).
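Here is a rough sketch in C of what I mean on the client side (the buffer size and the send_framed name are only for illustration):

    /* Minimal sketch of the proposed framing: "<decimal length> <payload>".
     * Assumes an already-connected TCP socket `fd` and a space separator. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    static int send_framed(int fd, const char *payload)
    {
        char frame[1024];
        int n = snprintf(frame, sizeof(frame), "%zu %s", strlen(payload), payload);
        if (n < 0 || (size_t)n >= sizeof(frame))
            return -1;                       /* payload too large for this sketch */

        /* send() may write fewer bytes than asked, so loop until done. */
        size_t sent = 0;
        while (sent < (size_t)n) {
            ssize_t r = send(fd, frame + sent, (size_t)n - sent, 0);
            if (r < 0)
                return -1;
            sent += (size_t)r;
        }
        return 0;
    }

    /* send_framed(fd, "GET ASD") puts "7 GET ASD" on the wire. */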

On the server side, the server keeps a buffer for every client in which it accumulates incoming data. Whenever it receives a new chunk of bytes from a client, the server appends it to the corresponding client's buffer. It then tries to read the content length of the request (7 in this example) and checks whether the rest of the buffer is at least that long. If it is, the server extracts the actual request content, processes it, and removes it from the buffer.
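A rough sketch of the extraction step I have in mind (try_extract and the buffer layout are illustrative, not a fixed design):

    /* Sketch of pulling one complete request out of a per-client buffer.
     * `buf` holds `*len` accumulated bytes.  Returns the payload length and
     * copies the payload into `out`; returns -1 if the request is not
     * complete yet, or -2 if it is malformed / too large. */
    #include <stdlib.h>
    #include <string.h>

    static long try_extract(char *buf, size_t *len, char *out, size_t out_cap)
    {
        /* Find the separator that terminates the decimal length prefix. */
        char *sep = memchr(buf, ' ', *len);
        if (sep == NULL)
            return -1;                          /* prefix not complete yet */

        long want = strtol(buf, NULL, 10);
        if (want < 0 || (size_t)want > out_cap)
            return -2;                          /* malformed or too large */

        size_t header = (size_t)(sep - buf) + 1;
        if (*len - header < (size_t)want)
            return -1;                          /* payload not complete yet */

        memcpy(out, buf + header, (size_t)want);

        /* Remove the consumed request; the remainder may start the next one. */
        size_t consumed = header + (size_t)want;
        memmove(buf, buf + consumed, *len - consumed);
        *len -= consumed;
        return want;
    }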

Implementation

That was all about protocol design; now some notes about the actual implementation. I think the main problem here is implementing and managing the buffers efficiently.

I think a buffer of size 2 * MAX_SIZE_OF_ONE_REQUEST will be enough to serve one client, because a chunk received by the server can simultaneously contain the end of the first request and the beginning of the second one. This is my assumption; if I'm wrong and more or less space is needed, please let me know why.

I think there are two ways of storing requests in the buffer until they are served:

  1. Whenever the server receives a new chunk of bytes, it appends it to the right end of the buffer. As soon as the buffer contains a complete request, the server processes it and shifts the rest of the buffer back to the beginning of the buffer space.

  2. Some cyclic (ring) buffer which doesn't need to move the data back to the beginning after processing a request (a small sketch follows this list).
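A rough sketch of what I mean by option 2; MAX_REQUEST and the ring layout are just placeholders, and the capacity is the 2x assumption from above:

    /* Sketch of option 2: a fixed-size ring buffer, so consumed requests
     * never need to be moved back to the front. */
    #include <stddef.h>

    #define MAX_REQUEST 4096
    #define RING_CAP    (2 * MAX_REQUEST)

    struct ring {
        char   data[RING_CAP];
        size_t head;   /* next byte to read  */
        size_t tail;   /* next byte to write */
        size_t used;   /* bytes currently stored */
    };

    /* Append `n` bytes, wrapping around the end of the array. */
    static int ring_put(struct ring *r, const char *src, size_t n)
    {
        if (n > RING_CAP - r->used)
            return -1;                 /* would overflow: caller must back off */
        for (size_t i = 0; i < n; i++) {
            r->data[r->tail] = src[i];
            r->tail = (r->tail + 1) % RING_CAP;
        }
        r->used += n;
        return 0;
    }

    /* Copy `n` bytes out; the caller never sees the wrap point. */
    static int ring_get(struct ring *r, char *dst, size_t n)
    {
        if (n > r->used)
            return -1;
        for (size_t i = 0; i < n; i++) {
            dst[i] = r->data[r->head];
            r->head = (r->head + 1) % RING_CAP;
        }
        r->used -= n;
        return 0;
    }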

These are my thoughts about implementing buffers with async I/O in mind (the server will use epoll/kqueue/select to receive requests from clients). I think that if the server doesn't use async I/O for communication with clients, implementing the buffer will be much simpler.
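For example, with epoll (Linux) I imagine the read loop looking roughly like this; handle_data stands in for the extraction logic above, and the accept() wiring and most error handling are left out:

    /* Minimal epoll read loop sketch, showing where per-client buffers fit.
     * Each client is registered with EPOLL_CTL_ADD with data.ptr pointing
     * at its struct client, and its socket is set non-blocking. */
    #include <errno.h>
    #include <stddef.h>
    #include <sys/epoll.h>
    #include <unistd.h>

    struct client {
        int    fd;
        char   buf[8192];
        size_t len;
    };

    void handle_data(struct client *c);   /* hypothetical: runs the framing logic */

    void event_loop(int epfd)
    {
        struct epoll_event events[64];

        for (;;) {
            int n = epoll_wait(epfd, events, 64, -1);
            for (int i = 0; i < n; i++) {
                struct client *c = events[i].data.ptr;   /* set at EPOLL_CTL_ADD */

                ssize_t r = read(c->fd, c->buf + c->len, sizeof(c->buf) - c->len);
                if (r == 0 || (r < 0 && errno != EAGAIN)) {
                    close(c->fd);          /* peer closed or hard error: drop client */
                    continue;
                }
                if (r < 0)
                    continue;              /* spurious wakeup, nothing to read */

                c->len += (size_t)r;
                handle_data(c);            /* extract any complete requests */
            }
        }
    }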

Also, I haven't decided how the server should behave when it receives a malformed request. Should it close the connection with the client?

Maybe I've written too much, but I'm really interested in this topic and want to learn as much as possible. I think there are many people like me, so any real-world problems in this area and best practices for solving them will be very helpful.

+1  A: 

Do you need a human-readable protocol? If not, then I'd suggest a binary one: start with x bytes of command length and then form the rest of the message as you see fit; of course the rest of the message could be text if you prefer... IMHO this is MUCH easier to deal with on the server as you don't need to scan all of the input bytes to determine when the message ends.

Since you know the (fixed) number of bytes that you need to determine the length of the message you can ignore all messages until they're that long. You (probably) have a reasonable max message size and so can work in terms of buffers that can accommodate all messages. This means that you can accumulate reads into a single buffer until you have a complete message, no copying, no moving. Once you have a complete message you can pass this buffer on for processing and begin reading into a new one. Reference count the buffers and they can go back to the pool when you are finished using them.
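Something along these lines, purely as a sketch; the 4-byte network-order prefix and the MAX_MESSAGE limit are example choices, not a prescription:

    /* Sketch of a fixed binary prefix: a 4-byte network-order length followed
     * by the message body.  One pre-sized buffer per connection, no scanning
     * for separators. */
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    #define MAX_MESSAGE 65536

    struct conn {
        uint8_t buf[4 + MAX_MESSAGE];
        size_t  len;                      /* bytes accumulated so far */
    };

    /* Returns the message length if a complete message sits at the start of
     * buf, -1 if more bytes are needed, -2 if the peer should be disconnected. */
    static long check_message(struct conn *c)
    {
        if (c->len < 4)
            return -1;                    /* length field itself not complete */

        uint32_t msg_len;
        memcpy(&msg_len, c->buf, 4);
        msg_len = ntohl(msg_len);

        if (msg_len > MAX_MESSAGE)
            return -2;                    /* larger than allowed: disconnect */
        if (c->len < 4 + (size_t)msg_len)
            return -1;                    /* body not complete yet */
        return (long)msg_len;             /* message is c->buf + 4 onwards */
    }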

To guard against denial of service attacks you should have a timeout on your read data. No complete message in x and you disconnect.
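For example (the 30-second limit here is just an arbitrary value):

    /* Sketch of the read timeout: record the last time bytes arrived and drop
     * connections that sit idle without completing a message. */
    #include <time.h>

    #define READ_TIMEOUT_SECONDS 30

    struct timed_conn {
        int    fd;
        time_t last_activity;   /* updated on every successful read() */
    };

    static int should_disconnect(const struct timed_conn *c, time_t now)
    {
        return (now - c->last_activity) > READ_TIMEOUT_SECONDS;
    }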

Malformed message, you disconnect. IMHO Postel has a lot to answer for; IMHO protocols are better when you're pedantic that people get things RIGHT and accept nothing less...

Message size larger than you allow, you disconnect.

I talk about TCP message framing issues here and here with regard to length-prefixed and line-based (sequence-terminated) protocols, though the discussion is focused on my free pluggable server platform, WASP, so it may or may not be useful to you.

To be honest, this is the easy bit. The complex part is designing the actual protocol that allows the clients and the server to converse efficiently about the problem space... However, getting this bit wrong can lead to interesting issues once you get onto the more complex bits...

Len Holgate
+2  A: 

First off, I would check RFCs and ITU-T specs for a protocol that does what you want to do. If you don't find one, you'll at least be able to see how other protocols were designed and read some of the rationale behind them.

Look at XDR or BER (ASN.1) for binary encoding of data. There are libs for that which take care of the endianness and alignment mess. For example, wrapping each packet in an XDR opaque lets you design the server more efficiently (one TCP frontend module, which sends unprocessed opaques to the proper handlers without having to know what each packet means).
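For illustration, this is roughly what an XDR opaque looks like on the wire if you encode it by hand; a real implementation would normally use an XDR library, and the function name here is made up:

    /* Hand-rolled sketch of an XDR-style opaque: 4-byte big-endian length,
     * the raw bytes, then zero padding to a 4-byte boundary. */
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    static size_t encode_opaque(uint8_t *out, size_t out_cap,
                                const uint8_t *data, uint32_t len)
    {
        size_t padded = (len + 3u) & ~3u;         /* round up to 4 bytes */
        if (out_cap < 4 + padded)
            return 0;                             /* not enough room */

        uint32_t be_len = htonl(len);
        memcpy(out, &be_len, 4);
        memcpy(out + 4, data, len);
        memset(out + 4 + len, 0, padded - len);   /* pad with zero bytes */
        return 4 + padded;
    }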

As mentioned by Len Holgate, be sure to specify what should happen in special cases (malformed packets, no response). Should there be any keep-alive packets? If so, how often? Should there be some client-server negotiation?

Oh, and don't forget to include the protocol version in the hello packet. Getting an issue ticket with "My client app says I don't support version 2 of the protocol" is better than "Some of the client works fine, but when I try to receive the dataset all I get are random numbers!".
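Something like this, purely as an illustration (the magic value and field layout are made up):

    /* Sketch of a hello packet that carries the protocol version up front,
     * so version mismatches can be rejected with a clear error. */
    #include <stdint.h>

    #define PROTO_MAGIC   0x41534400u   /* arbitrary example constant */
    #define PROTO_VERSION 2u

    struct hello_packet {
        uint32_t magic;      /* sanity check that this really is our protocol */
        uint16_t version;    /* reject the connection if unsupported */
        uint16_t flags;      /* room for negotiated options */
    };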

Makdaam
+1 for protocol version, +2 if it were possible for looking at other people's specs...
Len Holgate