I am working on distributing a stand-alone app. Each instance of the app has to be able to send and receive queries.

Requirements:

  1. Language - C++
  2. Scale - small. Maybe 5 instances at a time
  3. Platform Independent
  4. Volume of data transferred is expected to be high (raw images in the worst case)

I don't want to use RPC because it needs a registry service running. I think CORBA and SOAP would be too much of an overhead. I kind of decided to use a custom protocol, but just want to hear if there's anything better.

Thanks.

+7  A: 

Protocol Buffers sound like a good fit, supported in C++, cross-platform, designed for high-performance.
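One thing to keep in mind: a serialized Protocol Buffers message does not carry its own length, so on a stream socket each message needs some framing around it. Here is a minimal sketch of one common approach, a 4-byte big-endian length prefix. The names `frame_message` and `unframe_message` are made up for this example and are not part of any library, and the payload is treated as opaque bytes (it could be a serialized message or raw image data).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: prefix a payload with its length,
// encoded as a 4-byte big-endian unsigned integer.
std::string frame_message(const std::string& payload) {
    // This sketch only supports payloads whose size fits in 32 bits.
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());

    std::string out;
    out.reserve(4 + payload.size());
    out.push_back(static_cast<char>((n >> 24) & 0xFF));
    out.push_back(static_cast<char>((n >> 16) & 0xFF));
    out.push_back(static_cast<char>((n >> 8) & 0xFF));
    out.push_back(static_cast<char>(n & 0xFF));
    out += payload;
    return out;
}

// Hypothetical helper: try to take one complete message off the front
// of `buffer` (the bytes received so far). Returns false and leaves
// `buffer` untouched if the frame is still incomplete; otherwise stores
// the payload in `payload`, removes the frame from `buffer`, and
// returns true.
bool unframe_message(std::string& buffer, std::string& payload) {
    if (buffer.size() < 4) {
        return false;  // length prefix not fully received yet
    }
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i) {
        n = (n << 8) | static_cast<unsigned char>(buffer[i]);
    }
    if (buffer.size() - 4 < n) {
        return false;  // body not fully received yet
    }
    payload = buffer.substr(4, n);
    buffer.erase(0, 4 + static_cast<std::size_t>(n));
    return true;
}
```

The receiver would append whatever the socket hands it to `buffer` and call `unframe_message` in a loop until it returns false. A fixed byte order in the prefix keeps this independent of the platform's endianness.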

Mike McQuaid
+4  A: 

MPI was made for this and is certainly easier to use than Corba etc.
And it scales when you discover that your small scale distributed app becomes a very large scale distributed app!

Martin Beckett
MPI should be considered. If the volume of data is high enough, and the amount of processing is low enough, the bottleneck will probably be the network.
KeithB
The bottleneck is normally trying to understand the Corba docs! At least the MPI function calls are simple enough.
Martin Beckett
Thankfully, I've never had to deal with Corba. The MPI calls are simple, and there aren't that many of them. There are some subtleties, for instance matching sends and recvs, but it is generally manageable.
KeithB
+4  A: 

Why not use HTTP POST?

  • As lightweight as you need (open a socket, send a POST string), or if you want robustness use an HTTP library.
  • Easy to manage permissions on the server side (just use Apache or IIS)
  • Built-in logging (on the webserver side)
  • No scaling problems (webservers have solved these problems)
  • Typically systems don't require permissions for HTTP sockets (XP does for raw sockets).
  • Key/value pairs for identifying fields and data.
  • You can test it using Firefox plugins.
  • If speed is a concern you can easily set timeouts and resend.
  • You don't have to worry about firewalls since HTTP is almost always allowed by default.
  • Simple to debug with a port sniffer.
  • All of the server side code and most of client side code has been written for you.
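
To illustrate the "open a socket, send POST string" option, here is a rough sketch of building the request by hand. The function name `build_post_request`, the host and the path are placeholders invented for this example; a real client would then write the returned string to a connected TCP socket and parse the response, or simply use an HTTP library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build a minimal HTTP/1.1 POST request as one string.
// `body` may hold arbitrary bytes (for example raw image data), which is
// why Content-Length is computed from its size.
std::string build_post_request(const std::string& host,
                               const std::string& path,
                               const std::string& content_type,
                               const std::string& body) {
    // The request target is expected to be an absolute path.
    assert(!path.empty() && path[0] == '/');

    std::string req;
    req += "POST " + path + " HTTP/1.1\r\n";
    req += "Host: " + host + "\r\n";
    req += "Content-Type: " + content_type + "\r\n";
    req += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    req += "Connection: close\r\n";
    req += "\r\n";  // blank line separates headers from the body
    req += body;
    return req;
}
```

For binary payloads something like `application/octet-stream` as the content type is a reasonable default; key/value fields would more likely go out as `application/x-www-form-urlencoded`.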
e5
+3  A: 

I'd suggest using the HTTP protocol with a small webserver actually embedded in your application. This is very easy to get working, and there are lots of good embeddable webservers out there - I personally recommend Mongoose.

anon
+1 for an interesting idea. Can you explain why you would put the web server on the client side? What are the advantages of having an app "phone home" vs. being able to "call the app from home"?
e5
anon
Each instance is a server and a client.
Sundar
I think this illustrates that the server/client dichotomy doesn't fully express what is going on anymore. For instance, is a web browser connected to an Ajax site a client or a server? Both parties initiate communication and provide data to each other. Client/server is often used to mean one (server) to many (clients), rather than the traditional meaning of requester/provider.
e5
+3  A: 

I'd have a look at the Spread Toolkit. Well, it's C, but C++ bindings exist, and it's also easy to roll your own. Your project sounds pretty much like some projects where I've used it with great success (though without any bindings).

From the project's website:

Spread is an open source toolkit that provides a high performance messaging service that is resilient to faults across local and wide area networks. Spread functions as a unified message bus for distributed applications, and provides highly tuned application-level multicast, group communication, and point to point support. Spread services range from reliable messaging to fully ordered messages with delivery guarantees.

Spread can be used in many distributed applications that require high reliability, high performance, and robust communication among various subsets of members. The toolkit is designed to encapsulate the challenging aspects of asynchronous networks and enable the construction of reliable and scalable distributed applications.

Spread consists of a library that user applications are linked with, a binary daemon which runs on each computer that is part of the processor group, and various utility and demonstration programs.

Some of the services and benefits provided by Spread:

  • Reliable and scalable messaging and group communication.
  • A very powerful but simple API simplifies the construction of distributed architectures.
  • Easy to use, deploy and maintain.
  • Highly scalable from one local area network to complex wide area networks.
  • Supports thousands of groups with different sets of members.
  • Enables message reliability in the presence of machine failures, process crashes and recoveries, and network partitions and merges.
  • Provides a range of reliability, ordering and stability guarantees for messages.
  • Emphasis on robustness and high performance.
  • Completely distributed algorithms with no central point of failure.

I am aware that based on all this, it sounds like it must be complicated stuff and probably overkill for any small project -- but actually it is not: the basic usage is really simple. Surely it is complex under the hood, because the problems that the toolkit solves are inherently quite difficult; but at least I never had to look there, just like I never checked how TCP really works, even though I've been using it extensively.

(No, I don't work for the project in any way. Just a happy user.)

Pukku