views:

445

answers:

3

I've started tinkering with the Node.js HTTP server and really like writing server-side JavaScript, but something is keeping me from using Node.js for my web application.

I understand the whole async I/O concept, but I'm somewhat concerned about the edge cases where procedural code is very CPU intensive, such as image manipulation or sorting large data sets.

As I understand it, the server will be very fast for simple web page requests such as viewing a list of users or viewing a blog post. However, if I write very CPU-intensive code (in the admin back end, for example) that generates graphics or resizes thousands of images, the request will be very slow (a few seconds). Since this code is not async, every request coming to the server during those few seconds will be blocked until my slow request is done.
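To illustrate, here is a minimal sketch of the situation I'm worried about (expensiveSyncWork is just a hypothetical stand-in for any synchronous CPU-heavy routine):

    var http = require('http');

    // Hypothetical stand-in for any synchronous, CPU-heavy routine
    // (image resizing, sorting a huge data set, ...)
    function expensiveSyncWork() {
      var sum = 0;
      for (var i = 0; i < 1e9; i++) sum += i; // burns CPU on the event-loop thread
      return sum;
    }

    http.createServer(function (req, res) {
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      if (req.url === '/admin/heavy') {
        // Nothing else gets served until this returns
        res.end('done: ' + expensiveSyncWork());
      } else {
        // Normally instant, but queued behind any in-progress /admin/heavy request
        res.end('fast response');
      }
    }).listen(8000);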

One suggestion was to use Web Workers for CPU-intensive tasks. However, I'm afraid Web Workers will make it hard to write clean code, since they work by loading a separate JS file. What if the CPU-intensive code is located in an object's method? It kind of sucks to write a JS file for every method that is CPU intensive.

Another suggestion was to spawn a child process, but that makes the code even less maintainable.

Any suggestions for overcoming this (perceived) obstacle? How do you write clean object-oriented code with Node.js while making sure CPU-heavy tasks are executed asynchronously?

+9  A: 

This is a misunderstanding of what a web server is for -- it should only be used to "talk" with clients. Heavy tasks should be delegated to standalone programs (which can of course also be written in JS).
You'd probably say that this is dirty, but I assure you that a web server process stuck resizing images is even worse (even for, let's say, Apache, where it does not block other requests). Still, you can use a common library to avoid code redundancy.
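A rough sketch of what that delegation can look like with child_process (resize-worker.js here is a hypothetical standalone script):

    var http = require('http');
    var spawn = require('child_process').spawn;

    http.createServer(function (req, res) {
      if (req.url === '/admin/resize') {
        // Hand the heavy lifting to a standalone program; the web server only "talks"
        var worker = spawn('node', ['resize-worker.js', '/path/to/images']);
        worker.on('exit', function (code) {
          res.writeHead(200, { 'Content-Type': 'text/plain' });
          res.end('resizing finished with exit code ' + code);
        });
      } else {
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end('hello'); // still served instantly while the worker runs
      }
    }).listen(8000);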

EDIT: I have come up with an analogy; a web application should be like a restaurant. You have waiters (the web server) and cooks (workers). Waiters are in contact with clients and do simple tasks like handing out the menu or explaining whether a dish is vegetarian. They delegate the harder tasks to the kitchen. Because the waiters do only simple things, they respond quickly, and the cooks can concentrate on their job.

Node.js here would be a single but very talented waiter that can process many requests at a time, while Apache would be a gang of dumb waiters that each process just one request. If this one Node.js waiter began to cook, it would be an immediate catastrophe. Still, the cooking could also exhaust even a large supply of Apache waiters, not to mention the chaos in the kitchen and the progressive decrease in responsiveness.

mbq
Well, in an environment where web servers are multi-threaded or multi-process and can handle more than one concurrent request, it is very common to spend a couple of seconds on a single request. People have come to expect that. I'd say the misunderstanding is treating node.js as a "regular" web server. With node.js you have to adjust your programming model a bit, and that includes pushing long-running work out to some asynchronous worker.
Thilo
@Thilo Right, but it is still bad practice. Node just makes bad practices perform badly, and I think that is an advantage of its approach.
mbq
That was my original plan when I discovered Node.js, but the general consensus seems to be to keep everything in one process while making everything non-blocking. Perhaps I shouldn't blindly follow that consensus and should spawn a child process for every request?
Olivier Lalonde
Don't spawn a child process for every request (that defeats the purpose of node.js). Spawn workers from inside your heavy requests only. Or route your heavy background work to something other than node.js.
Thilo
+2  A: 

What you need is a task queue! Moving your long-running tasks out of the web server is a GOOD thing. Keeping each task in a separate JS file promotes modularity and code reuse. It forces you to think about how to structure your program in a way that will make it easier to debug and maintain in the long run. Another benefit of a task queue is that the workers can be written in a different language. Just pop a task, do the work, and write the response back.

Something like this: http://github.com/defunkt/resque

Here is an article from GitHub about why they built it: http://github.com/blog/542-introducing-resque
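A rough sketch of the idea, assuming a Redis-backed list via the node redis client (the 'jobs' queue name and the resizeImage routine are made up for illustration):

    // --- web process: enqueue the job and respond immediately ---
    var redis = require('redis');
    var client = redis.createClient();

    function enqueueResize(imagePath, callback) {
      var job = JSON.stringify({ type: 'resize', path: imagePath });
      client.rpush('jobs', job, callback); // workers pop from the same list
    }

    // --- worker.js: a separate process (could even be written in another language) ---
    var workerClient = require('redis').createClient();

    (function next() {
      workerClient.blpop('jobs', 0, function (err, reply) { // blocks until a job arrives
        if (!err) {
          var job = JSON.parse(reply[1]);
          resizeImage(job.path); // hypothetical heavy routine
        }
        next(); // then wait for the next job
      });
    })();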

Tim
A: 

There are a couple of approaches you can use.

As @Tim notes, you can create an asynchronous task that sits outside of, or parallel to, your main serving logic. Depending on your exact requirements, even cron can act as a queueing mechanism.

Web Workers could work for your async processes, but they are not currently supported by node.js. There are a couple of extensions that provide support, for example: http://github.com/cramforce/node-worker

You can still reuse modules and code through the standard "require" mechanism. You just need to ensure that the initial dispatch to the worker passes all the information needed to process the results.
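A sketch of what that reuse can look like (the file names, the resizeAll function, and the arguments here are just illustrative):

    // lib/resize.js -- shared module, require()d by both the server and the worker
    exports.resizeAll = function (dir, width, height) {
      // ... the CPU-heavy image work lives here, in one place ...
    };

    // worker.js -- run as a separate process; gets everything it needs from argv
    var resize = require('./lib/resize');
    var args = process.argv.slice(2); // e.g. ['/uploads', '800', '600']
    resize.resizeAll(args[0], Number(args[1]), Number(args[2]));

    // server.js -- dispatches to the worker, passing all required information up front
    var spawn = require('child_process').spawn;
    spawn('node', ['worker.js', '/uploads', '800', '600']);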

Toby Hede