I'm looking for some data to help me decide which would be better/faster for communication between two independent processes on Linux:

  • TCP
  • Named Pipes

Which is worse: the system overhead of the pipes or the overhead of the TCP stack?


Updated exact requirements:

  • only local IPC needed
  • will mostly be a lot of short messages
  • no cross-platform needed, only Linux
A: 

I think the pipes will be a little lighter, but I'm just guessing.

Since pipes are purely local, there's probably a lot less complicated code involved.

Other people might tell you to try and measure both to find out. It's hard to go wrong with this answer, but you may not be willing to invest the time. That would leave you hoping my guess is correct ;)

Carl Smotricz
+4  A: 

In the past I've used local domain sockets for that sort of thing. My library determined whether the other process was local to the system or remote and used TCP/IP for remote communication and local domain sockets for local communication. The nice thing about this technique is that local/remote connections are transparent to the rest of the application.

Local domain sockets use the same mechanism as pipes for communication and don't have the TCP/IP stack overhead.

Richard Pennington
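For readers who want to see what a local (Unix) domain socket looks like in practice, here is a minimal sketch. It is not the library mentioned above. The socket file name `ipc.sock` and the function names are made up for illustration, and the "server" runs in a thread only so that the example is self-contained; in real use it would be a separate process.

```python
import os
import socket
import tempfile
import threading


def serve_once(server_sock):
    """Accept one connection and echo a single short message back."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def unix_socket_roundtrip(message: bytes) -> bytes:
    """Send a short message over an AF_UNIX stream socket and return the echo."""
    # Hypothetical socket path inside a private temporary directory.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "ipc.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)      # the address is a filesystem path, not host:port
        server.listen(1)
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.connect(path)
            client.sendall(message)
            reply = client.recv(1024)  # enough for the short messages used here
        worker.join()
        server.close()
        return reply


if __name__ == "__main__":
    print(unix_socket_roundtrip(b"ok to continue?"))
```

Apart from the address family and the path-based address, the calls are the same as for a TCP socket, which is what makes switching between local and remote transports fairly painless.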
Do you have more info on that? Any Links? Which library did you use?
brandstaetter
Sorry it was a library I wrote myself for an employer. I can't provide a link.
Richard Pennington
Ah, I understand. I'll take a look at the domain sockets, those sound interesting. Thanks!
brandstaetter
The correct name of Unix Domain Sockets, according to Wikipedia, is POSIX Local IPC Sockets.
Jaywalker
+2  A: 

There will be more overhead using TCP - that will involve breaking the data up into packets, calculating checksums and handling acknowledgement, none of which is necessary when communicating between two processes on the same machine. Using a pipe will just copy the data into and out of a buffer.

Mike Seymour
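As a rough illustration of the named pipe side, the sketch below creates a FIFO, writes a message into it, and reads it back. The file name `ipc.fifo` and the function names are assumptions for the example, and the writer runs in a thread instead of a second process purely to keep it self-contained.

```python
import os
import tempfile
import threading


def write_message(path, message):
    # Opening a FIFO for writing blocks until a reader has opened it too.
    with open(path, "wb") as fifo:
        fifo.write(message)


def fifo_transfer(message: bytes) -> bytes:
    """Pass a message through a named pipe and return what the reader saw."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "ipc.fifo")  # hypothetical FIFO name
        os.mkfifo(path)
        writer = threading.Thread(target=write_message, args=(path, message))
        writer.start()
        with open(path, "rb") as fifo:
            received = fifo.read()  # returns once the writer closes its end
        writer.join()
        return received


if __name__ == "__main__":
    print(fifo_transfer(b"hello through the pipe"))
```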
Is the overhead really that much?
jkp
The overhead is a lot more significant if the processor only runs at 100 MHz, which is where I often work. It probably isn't worth the trouble if you are in the GHz range and don't care about top performance.
Richard Pennington
It's not that the overhead is that much -- the MTU for 127.0.0.1 is usually pretty large, so it doesn't have to break it into as many packets -- but it's still several more steps than a named pipe. The more important question is: Will you ever want this communication to be between different computers? If so, then use the sockets, since named pipes simply won't work between different machines. There are also UNIX domain sockets, which may perform as well as a pipe while working like a socket. Finally, another advantage of named pipes is access control and namespaces (a path vs. an IP/port).
Mike D.
Yeah, we already use sockets for communication between machines, I'm looking for communication between the local management process and worker programs.
brandstaetter
I agree with the unix domain sockets (see my answer). They are called local domain sockets in POSIX.
Richard Pennington
I don't think Linux either calculates or checks the checksums for TCP connections that are local to the box (use Wireshark: every local IP frame shows the wrong TCP checksum).
MarkR
+3  A: 

I don't really think you should worry about the overhead (which will be ridiculously low). Did you make sure using profiling tools that the bottleneck of your application is likely to be TCP overhead?

Anyway, as Carl Smotricz said, I would go with sockets because it will be really trivial to separate the applications in the future.

Andreas Bonini
+1: in the past we've used shared memory (currently using boost.interprocess to implement this) but in the end it's more trouble than it's worth. The overhead of sockets / TCP is surely so low on the loopback that it's really not worth worrying about. Of course, profile your particular use case to be sure.
jkp
+2  A: 

I discussed this in an answer to a previous post. I had to compare socket, pipe, and shared memory communications. Pipes were definitely faster than sockets (maybe by a factor of 2 if I recall correctly ... I can check those numbers when I return to work). But those measurements were just for the pure communication. If the communication is a very small part of the overall work, then the difference will be negligible between the two types of communication.

Edit: Here are some numbers from the test I did a few years ago. Your mileage may vary (particularly if I made stupid programming errors). In this specific test, a "client" and "server" on the same machine echoed 100 bytes of data back and forth. It made 10,000 requests. In the document I wrote up, I did not indicate the specs of the machine, so it is only the relative speeds that may be of any value. But for the curious, the times given here are the average cost per request:

  • TCP/IP: 0.067 ms
  • Pipe with I/O Completion Ports: 0.042 ms
  • Pipe with Overlapped I/O: 0.033 ms
  • Shared Memory with Named Semaphore: 0.011 ms
Mark Wilkins
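If you would rather measure on your own Linux box than trust old numbers, a test of the same shape (100 bytes echoed back and forth, 10,000 requests, average cost per request) can be sketched roughly as below. This is not the original test program: all function names are invented for the example, it compares TCP loopback against a Unix domain socket pair only, and the echo side runs in a Python thread, so the absolute figures include interpreter overhead. Only the relative difference is of any interest.

```python
import socket
import threading
import time


def echo_loop(sock):
    """Echo everything received until the peer closes the connection."""
    with sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            sock.sendall(data)


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < size:
        part = sock.recv(size - len(buf))
        if not part:
            raise ConnectionError("peer closed early")
        buf.extend(part)
    return bytes(buf)


def mean_round_trip_ms(client, server, requests=10_000, size=100):
    """Average time in milliseconds for one request/echo round trip."""
    payload = b"x" * size
    worker = threading.Thread(target=echo_loop, args=(server,))
    worker.start()
    start = time.perf_counter()
    for _ in range(requests):
        client.sendall(payload)
        recv_exact(client, size)
    elapsed = time.perf_counter() - start
    client.close()  # lets echo_loop see end-of-stream and exit
    worker.join()
    return elapsed * 1000 / requests


def tcp_pair():
    """Connected TCP sockets over 127.0.0.1 with Nagle disabled on both ends."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    for sock in (client, server):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return client, server


def unix_pair():
    """Connected Unix domain stream sockets."""
    return socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)


if __name__ == "__main__":
    for name, make_pair in (("TCP loopback", tcp_pair),
                            ("Unix domain socket", unix_pair)):
        client, server = make_pair()
        print(f"{name}: {mean_round_trip_ms(client, server):.4f} ms per request")
```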
I'm looking for extremely fast communication for short messages, which control the flow of my programs (i.e. program asks after each iteration if it's ok to continue)
brandstaetter
If you use TCP/IP, make sure to turn off the Nagle algorithm.
Richard Pennington
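For reference, turning off the Nagle algorithm means setting the `TCP_NODELAY` option on the socket. A minimal sketch follows; the helper name `connect_without_nagle` is made up for the example.

```python
import socket


def connect_without_nagle(host, port):
    """Open a TCP connection and disable Nagle's algorithm on it."""
    sock = socket.create_connection((host, port))
    # With TCP_NODELAY set, small writes are sent immediately instead of
    # being held back to be coalesced with later data.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

This matters most for exactly the pattern described here: many short request/reply messages where each side waits for the other.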
For the short messages (and especially if there will be a large number of messages) the pipe will probably be measurably faster. And if you want the extra pain, you can probably get even more speed by using simple shared memory. I will try to remember to get the pipe versus tcp/ip numbers when I get in to work and provide them here. As I mentioned in my other post, it was a few years ago, but I suspect the relative speeds would be similar now.
Mark Wilkins
+2  A: 

Two things to consider:

  1. Connection setup cost
  2. Continuous Communication cost

On TCP:

(1) More costly: a three-way handshake is required because TCP assumes a (potentially) unreliable channel.

(2) More costly: there is IP-level overhead (checksum etc.) and TCP overhead (sequence numbers, acknowledgements, checksum etc.). Pretty much none of this is necessary on the same machine, because the channel is supposed to be reliable and does not introduce network-related impairments (e.g. packet reordering).

But I would still go with TCP provided it makes sense (i.e. depends on the situation) because of its ubiquity (read: easy cross-platform support) and the overhead shouldn't be a problem in most cases (read: profile, don't do premature optimization).

Updated: if cross-platform support isn't required and the emphasis is on performance, then go with named pipes or domain sockets, as I am pretty sure the platform developers will have optimized out the functionality that is only needed for handling network-level impairments.

jldupont
Cross-platform and communication is not needed in this case.
brandstaetter
I understand: then named pipes are surely lower cost in terms of performance.
jldupont
+2  A: 

I don't know if this suits you, but a very common way of doing IPC (interprocess communication) under Linux is to use shared memory. It's actually ultra fast (I haven't profiled this, but it is just shared data in RAM with some processing around it).

The main problem with this approach is synchronization: you have to build a little system around a semaphore to make sure that one process is not writing while the other one is trying to read.

A very simple starter tutorial is here

This is not as portable as using sockets, but the concept would be the same, so if you're migrating this to Windows, you will just have to change the shared memory create/attach layer.

Douglas Gemignani
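To give an idea of the shape of this, here is a small sketch using Python's `multiprocessing.shared_memory` module. It is not taken from the tutorial, and the function names are invented. To stay self-contained it attaches to the segment a second time from the same process; in real use the second handle would be opened by name in the other process, and the lock would have to be something both processes can reach (for example a named semaphore, or a lock inherited from a common parent).

```python
import struct
from multiprocessing import Lock
from multiprocessing.shared_memory import SharedMemory


def write_value(buf, lock, value):
    """Store a 64-bit integer at offset 0 while holding the lock."""
    with lock:
        struct.pack_into("<q", buf, 0, value)


def read_value(buf, lock):
    """Load the 64-bit integer at offset 0 while holding the lock."""
    with lock:
        return struct.unpack_from("<q", buf, 0)[0]


def shared_value_demo(value):
    """Write through one handle to the segment and read through another."""
    lock = Lock()  # stands in for the semaphore guarding the segment
    writer = SharedMemory(create=True, size=8)
    try:
        # A second process would attach the same way, using the segment name.
        reader = SharedMemory(name=writer.name)
        try:
            write_value(writer.buf, lock, value)
            return read_value(reader.buf, lock)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # remove the segment once nobody needs it


if __name__ == "__main__":
    print(shared_value_demo(42))
```

The lock is the "little system" mentioned above: without it, a reader could observe a half-written value.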
+1 for the tutorial link, thanks!
brandstaetter
+1: I have been using shared memory for IPC in MS Windows for quite some time. It works really well; the only thing you need to be aware of is that shared memory is essentially a fixed-size array, so you need a mechanism to synchronize access to it from different processes.
Jaywalker
Just another comment, under Windows you could search at MSDN for CreateFileMapping/OpenFileMapping and MapViewOfFile
Douglas Gemignani
+1  A: 

A Unix domain socket is a very good compromise. It doesn't have the overhead of TCP, but it is more flexible than the pipe solution. A point you did not consider is that sockets are bidirectional, while named pipes are unidirectional.

shodanex
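The difference in direction can be seen in a short sketch: one connected socket pair carries both the request and the reply, while pipes need one pipe per direction. The function names are made up, and anonymous pipes from `os.pipe()` are used instead of named ones for brevity; the one-way behaviour is the same.

```python
import os
import socket


def socketpair_exchange():
    """Request and reply travel over the same pair of connected sockets."""
    a, b = socket.socketpair()
    with a, b:
        a.sendall(b"continue?")
        request = b.recv(64)
        b.sendall(b"yes")
        reply = a.recv(64)
    return request, reply


def pipe_exchange():
    """Each pipe is one-way, so a request/reply exchange needs two of them."""
    req_r, req_w = os.pipe()  # carries requests
    rep_r, rep_w = os.pipe()  # carries replies
    try:
        os.write(req_w, b"continue?")
        request = os.read(req_r, 64)
        os.write(rep_w, b"yes")
        reply = os.read(rep_r, 64)
    finally:
        for fd in (req_r, req_w, rep_r, rep_w):
            os.close(fd)
    return request, reply


if __name__ == "__main__":
    print(socketpair_exchange())
    print(pipe_exchange())
```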