I wonder if it's a good idea to let local applications (on the same server) communicate with each other entirely through a RESTful API.

I know this is not uncommon, because we already have applications like CouchDB that use HTTP REST for communication, even with local applications.

But I want to take it to a higher level by creating applications that act like modules for a bigger application, which could itself be a module for another application, and so on. In other words, there would be a lot of local applications/modules communicating over a RESTful API.

This way, these applications/modules could be written in any language, and they could also communicate over the wire between servers.

But I have some questions:

  • Is this a good idea?
  • Will the data transfer between them be slow?
  • If I do this, then each application/module has to be an HTTP server, right? So if my application uses 100 applications/modules, then each one of these has to be a local HTTP web server, each running on a different port (http://localhost:81, http://localhost:82, http://localhost:83 and so on), right?
  • Any best practices/gotchas that I should know of?
+2  A: 

On using RESTful solutions for application integration: I believe this is a good idea, and I professed similar views in another question.

pyfunc
A: 

To be frank, I don't think you need 100 servers for 100 applications; maybe just use 100 ports on the same server.

Also, a RESTful interface will give you the flexibility to add servers and enable load balancing if you want the potential to scale up massively.

Michael Mao
@Michael Mao: Actually I meant 100 local web servers (not 100 physical web servers). I'll edit it to be more explicit. So you are saying this is a good idea, even though I'm aiming at the modular level, not a full-stack application?
weng
@weng: Sorry, I still don't quite get the idea of having 100 web servers in order to serve 100 web applications. I believe most web apps, if not all, can be deployed onto the same server and collaborate well, so long as they don't conflict on listening ports.
Michael Mao
@weng: Also, RESTful web services rely on URL patterns to determine which application you are calling, so multiple apps can work on the same web server, sharing the default port 80.
Michael Mao
@Michael Mao: Now I see why my question was confusing, so I'll try to be more accurate. What I mean by 100 web servers is actually 100 applications that ARE web servers, not 100 web servers serving information for 100 applications. I have edited my post to make it clearer.
weng
@Michael Mao: If multiple apps are running on the same port, don't I have to have a router in front of these apps? But if I set it up like that, then each application won't be a web server, and hence they won't be able to use RESTful communication. It also makes an application unable to be used on its own.
weng
+4  A: 
  • Is this a good idea?

Sure, perhaps.

  • Will the data transfer between them be slow?

Yup! But compared to what? Compared to native, internal calls, absolutely -- it'll be glacial. Compared to some other network API, eh, not necessarily slower.

  • If I do this, then each application/module has to be an HTTP server, right? So if my application uses 100 applications/modules, I have to have 100 local HTTP web servers up and running, each with a different port (http://localhost:81, http://localhost:82, http://localhost:83 and so on)?

Nah, no reason to allocate a port per module. All sorts of ways to do this.

  • Any best practices/gotchas that I should know of?

The only way this will succeed is if the services you are talking about are coarse enough. These have to be big, black-boxy kinds of services that make the expense of calling them worthwhile. You will be incurring connection costs, data transfer costs, and data marshaling costs on each transaction. So, you want those transactions to be as rare as possible, and you want the payloads to be as large as possible, to get the best benefit.

Are you talking about actually using the REST architecture, or just sending stuff back and forth via HTTP? (These are different things.) REST incurs its own costs, including embedded linkages, ubiquitous and common data types, etc.

Finally, you simply may not need to do this. It might well be "kinda cool", a "nice to have", "looks good on the whiteboard", but if you really don't need it, then don't do it. Simply follow good practices of isolating your internal services so that, should you decide later to do something like this, you can just insert the glue layer necessary to manage the communication. Adding remote distribution will increase risk and complexity and lower performance (scaling != performance), so there should be a Good Reason to do it at all.

That, arguably, is the "best practice" of them all.

Edit -- Response to comment:

So you mean I run ONE web server that handles all incoming requests? But then the modules won't be stand-alone applications, which defeats the whole purpose. I want each one of the modules to be able to run by itself.

No, it doesn't defeat the purpose.

Here's the deal.

Let's say you have 3 services.

At a glance, it would be fair to say that these are three different services, on 3 different machines, running in 3 different web servers.

But the truth is that these can all be running on the SAME machine, on the SAME web server, even down to (to take this to the extreme) running the exact same logic.

HTTP allows you to map all sorts of things. HTTP itself is a mechanism of abstraction.

As a client, all you care about is the URL to use and the payload to send. What machine it ends up talking to, or what actual code it executes, is not the client's problem.

At an architectural level, you have achieved a measure of abstraction and modularization. The URLs let you organize your system in whatever LOGICAL layout you want. The PHYSICAL implementation is distinct from the logical view.

Those 3 services can be running on a single machine served by a single process. On the other hand, they can represent 1000 machines. How many machines do you think respond to "www.google.com"?

You can easily host all 3 services on a single machine without sharing any code save the web server itself, making it easy to move a service from its original machine to some other machine.

The host name is the simplest way to map a service to a machine. Any modern web server can service any number of different hosts, and each "virtual host" can service any number of individual service endpoints within the namespace of that host. At the "host" level it's trivial to relocate code from one machine to another if and when you have to.

You should explore the capabilities of a modern web server to direct arbitrary requests to actual logic on the server; you'll find them very flexible.
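
To make this concrete, here is a minimal sketch (not part of the original answer) of a single Python WSGI process dispatching to several "modules" by URL path; the module names, paths, and port below are hypothetical:

    # One process, one port, several "modules" mounted under their own URL paths.
    # The module names (users, orders) and port 8000 are hypothetical examples.
    from wsgiref.simple_server import make_server

    def users_app(environ, start_response):
        # A stand-alone WSGI callable; it could just as well run by itself on its own port.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'users module\n']

    def orders_app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'orders module\n']

    MODULES = {'/users': users_app, '/orders': orders_app}

    def dispatcher(environ, start_response):
        # Route on the leading path segment; each module still sees a normal WSGI call.
        path = environ.get('PATH_INFO', '')
        for prefix, app in MODULES.items():
            if path == prefix or path.startswith(prefix + '/'):
                return app(environ, start_response)
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'not found\n']

    if __name__ == '__main__':
        make_server('localhost', 8000, dispatcher).serve_forever()

The same callables could later be split out and run on separate ports (or separate machines) without the clients noticing, as long as the URLs stay stable.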

Will Hartung
"Nah, no reason to allocate a port per module. All sorts of ways to do this." Could you elaborate. If each module/application is a local web server, how can they be used without be running on different ports?
weng
As others have mentioned, don't run them on separate web servers. A single web server can handle any number of applications; it's really all limited by server resources. Consider a bunch of CGI scripts, all organized in a directory tree. Consider a Java web app server, with each WAR individually deployed but each with its own context under the same host URL. Or even consider virtual hosting: a single web server hosting 100 different hosts, all sharing the same IP and port. These are all techniques for deploying multiple applications on a single server.
Will Hartung
@Will: So you mean I run ONE web server that handles all incoming requests? But then the modules won't be stand-alone applications, which defeats the whole purpose. I want each one of the modules to be able to run by itself.
weng
+1  A: 

Is this a good idea?

Yes. It's done all the time. That's how all database servers work, for example. Linux is packed full of client/server applications communicating through TCP/IP.

Will the data transfer between them be slow?

No. TCP/IP uses the loopback interface (localhost) as a shortcut that avoids doing actual network I/O.

The HTTP protocol isn't the best thing for dedicated connections, but it's simple and well supported.
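
For example, here is a minimal Python 3 sketch (not part of the original answer) of one local module calling another over the loopback interface; the port 8001 and the /status path are hypothetical:

    # One local module calling another local module over loopback HTTP.
    # The port (8001) and resource path (/status) are hypothetical.
    import json
    from urllib.request import urlopen

    with urlopen('http://localhost:8001/status') as resp:
        status = json.loads(resp.read().decode('utf-8'))
    print(status)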

If I do this, then each application/module has to be an HTTP server, right?

Not necessarily. Some modules can be clients and not have a server.

So if my application uses 100 applications/modules, then each one of these has to be a local HTTP web server, each running on a different port (http://localhost:81, http://localhost:82, http://localhost:83 and so on), right?

Yes. That's the way it works.

Any best practices/gotchas that I should know of?

Do not "hard-code" port numbers.

Do not use the "privileged" port numbers (under 1024).

Use a WSGI library and you'll be happiest making all your modules into WSGI applications. You can then use a trivial 2-line HTTP server to wrap your module.

Read this: http://docs.python.org/library/wsgiref.html#examples
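
As a minimal sketch of that pattern (the environment variable name MODULE_PORT and the 8080 default are hypothetical choices, not from the answer):

    # Wrap a single WSGI "module" in a tiny HTTP server.
    # Read the port from the environment instead of hard-coding it,
    # and stay away from the privileged ports below 1024.
    import os
    from wsgiref.simple_server import make_server

    def my_module(environ, start_response):
        # The module itself is just a WSGI callable.
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello from this module\n']

    if __name__ == '__main__':
        port = int(os.environ.get('MODULE_PORT', '8080'))
        make_server('localhost', port, my_module).serve_forever()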

S.Lott
A: 

No, it is not a good idea if you don't have a good reason. It is a good idea to layer the code of your application so it can be "rested" at a later stage should you need it (or whatever performance improvement is deemed necessary). The increased deployment complexity of server-based layers is a good reason not to do it. I would suggest:

  • Write a well structured application with good, clean code.
  • Test it with expected production loads
  • If needed, refactor into layers that are servers, but...

A better approach is to load balance the entire application. If you are doing something like Rails with no state in the app server, it should be no problem to run several instances in parallel.

If you are looking for complexity, disregard my answer. :-)