views: 174

answers: 2

I've got a chat program which pushes JSON data from Apache/PHP to Node.js, via a TCP socket:

// Node.js (JavaScript)
var net = require("net");

var phpListener = net.createServer(function(stream)
{
    stream.setEncoding("utf8");
    stream.on("data", function(txt)
    {
        var json = JSON.parse(txt);

        // do stuff with json
    });
});
phpListener.listen(8887, 'localhost');

// Apache (PHP)
$sock = stream_socket_client("tcp://localhost:8887");
$written = fwrite($sock, $json_string);
fclose($sock);

The problem is, if the JSON string is large enough (over around 8k), the output message gets split into multiple chunks, and the JSON parser fails. PHP returns the $written value as the correct length of the string, but the data event handler fires twice or more.

Should I be attaching the function to a different event, or is there a way to cache text across event fires, in a way that won't succumb to race conditions under heavy load? Or some other solution I haven't thought of?

Thanks!

+2  A: 

You should try using a buffer to cache the data, as Node.js tends to split incoming data in order to improve performance.

http://nodejs.org/api.html#buffers-2

You can buffer the whole request and then call your handler with the accumulated data.
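
For example, since the PHP side opens a connection, writes a single JSON string, and closes it, a minimal sketch (assuming exactly one message per connection) can accumulate the chunks in a per-connection variable and parse only when the 'end' event fires:

// Sketch only: assumes each PHP connection carries exactly one JSON message
var net = require("net");

var phpListener = net.createServer(function(stream)
{
    var buffered = "";              // chunks accumulated for this connection
    stream.setEncoding("utf8");
    stream.on("data", function(txt)
    {
        buffered += txt;            // cache each chunk as it arrives
    });
    stream.on("end", function()     // fires when PHP calls fclose()
    {
        var json = JSON.parse(buffered);
        // do stuff with json
    });
});
phpListener.listen(8887, 'localhost');

Because buffered is local to each connection's callback, concurrent connections don't share state, which also addresses the race-condition worry in the question.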

Sebastian Oliva
The solution is good, but it is not Node.js that is doing the splitting: it is the OS on the server or the client, the modem/router on either end, or any router at your ISP or along the way. This is just how the internet works. You can configure the OS on both the client and the server to use jumbo packets to reduce fragmentation, but you cannot guarantee that the network won't fragment the packets (unless, of course, you're both on the same LAN).
slebetman
Hi, you are right that most of the time it is either the OS or the server that splits the data, but Node.js also splits incoming requests for performance, as noted in the documentation. I hope IPv6 jumbo packets can help us avoid it, but my guess is that servers will keep splitting data for performance reasons anyway (at least on the web).
Sebastian Oliva
+1  A: 

TCP sockets don't handle buffering for you. How could they? TCP doesn't know what application layer protocol you are using and therefore has no idea what a "message" is. It is up to you to design and implement another protocol on top of it and handle any necessary buffering.
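
For illustration, one simple protocol (not the only option) is newline-delimited framing: PHP appends a "\n" after each JSON string (json_encode escapes any newlines inside the string, so the delimiter is unambiguous), and the Node side splits its buffer on newlines and parses each complete line. A rough sketch:

// Sketch of newline-delimited framing, assuming PHP writes $json_string . "\n"
var net = require("net");

var phpListener = net.createServer(function(stream)
{
    var pending = "";                    // partial line carried between 'data' events
    stream.setEncoding("utf8");
    stream.on("data", function(txt)
    {
        pending += txt;
        var lines = pending.split("\n");
        pending = lines.pop();           // last piece is an incomplete line (or "")
        lines.forEach(function(line)
        {
            if (line.length === 0) return;
            var json = JSON.parse(line); // each complete line is one JSON message
            // do stuff with json
        });
    });
});
phpListener.listen(8887, 'localhost');

On the PHP side the only change would be fwrite($sock, $json_string . "\n");. A length-prefix header would work just as well; the point is that the receiver, not TCP, decides where one message ends and the next begins.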

But Node.js does have a built-in application layer protocol on top of TCP that handles the buffering for you: the http module. If you use the http module instead of the net module for this, you won't need to worry about packet fragmentation and buffering.
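
As a rough sketch of what that could look like (note that even with the http module the request body can still arrive in several 'data' chunks, so it is concatenated here before parsing):

// Sketch only: receiving the JSON as the body of an HTTP POST instead of a raw socket
var http = require("http");

var phpListener = http.createServer(function(req, res)
{
    var body = "";
    req.setEncoding("utf8");
    req.on("data", function(chunk)
    {
        body += chunk;              // the body may still arrive in several chunks
    });
    req.on("end", function()        // the full request has been received
    {
        var json = JSON.parse(body);
        // do stuff with json
        res.writeHead(200);
        res.end();
    });
});
phpListener.listen(8887, 'localhost');

On the PHP side the JSON would then be sent as the body of an HTTP POST (for example via curl or a stream context) instead of a plain fwrite to a socket.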

slebetman
No, the http module doesn't buffer the full message for you either. Node is built around streaming data because it's more efficient, so you have to do your own buffering at all times or use a higher-level framework that will do it for you.
Marco