views:

1100

answers:

4

Hi folks,

I currently try to implement a simple HTTP-server for some kind of comet-technique (long polling XHR-requests). As JavaScript is very strict about crossdomain requests I have a few questions:

  1. As I understood any apache worker is blocked while serving a request, so writing the "script" as a usual website would block the apache, when all workers having a request to serve. --> Does not work!
  2. I came up with the idea writing a own simple HTTP server only for serving this long polling requests. This server should not be blocking, so each worker could handle many request at the same time. As my site also contains content / images etc and my server does not need to server content I started him on a different port then 80. The problem now is that I can't interact between my JavaScript delivered by my apache and my comet-server running on a different port, because of some crossdomain restrictions. --> Does not work!
  3. Then I came up with the idea to use mod_proxy to map my server on a new subdomain. I really don't could figure out how mod_proxy works but I could imagine that I know have the same effect as on my first approach?

What would be the best way to create these kind of combination this kind of classic website and these long-polling XHR-requests? Do I need to implement content delivery on my server at my own?

+1  A: 

I'm pretty sure using mod_proxy will block a worker while the request is being processed.

If you can use 2 IPs, there is a fairly easy solution. Let's say IP A is 1.1.1.1 and IP B is 2.2.2.2, and let's say your domain is example.com.

This is how it will work:

-Configure Apache to listen on port 80, but ONLY on IP A.

-Start your other server on port 80, but only on IP B.

-Configure the XHR requests to be on a subdomain of your domain, but with the same port. So the cross-domain restrictions don't prevent them. So your site is example.com, and the XHR requests go to xhr.example.com, for example.

-Configure your DNS so that example.com resolves to IP A, and xhr.example.com resolves to IP B.

-You're done.

This solution will work if you have 2 servers and each one has its IP, and it will work as well if you have one server with 2 IPs.

If you can't use 2 IPs, I may have another solution, I'm checking if it's applicable to your case.

FWH
I am interested in the idea with only using one IP.
Hippo
I don't think the browser security model allows code loaded from example.com to send XHRs to xhr.example.com. You have to play games with document.domain and IFrames, and then it's not portable. -- http://www.fettig.net/weblog/2005/11/28/how-to-make-xmlhttprequest-connections-to-another-server-in-your-domain/
Tim Sylvester
+1  A: 

This is a difficult problem. Even if you get past the security issues you're running into, you'll end up having to hold a TCP connection open for every client currently looking at a web page. You won't be able to create a thread to handle each connection, and you won't be able to "select" on all the connections from a single thread. Having done this before, I can tell you it's not easy. You may want to look into libevent, which memcached uses to a similar end.

Up to a point you can probably get away with setting long timeouts and allowing Apache to have a huge number of workers, most of which will be idle most of the time. Careful choice and configuration of the Apache worker module will stretch this to thousands of concurrent users, I believe. At some point, however, it will not scale up any more.

I don't know what you're infrastructure looks like, but we have load balancing boxes in the network racks called F5s. These present a single external domain, but redirect the traffic to multiple internal servers based on their response times, cookies in the request headers, etc.. They can be configured to send requests for a certain path within the virtual domain to a specific server. Thus you could have example.com/xhr/foo requests mapped to a specific server to handle these comet requests. Unfortunately, this is not a software solution, but a rather expensive hardware solution.

Anyway, you may need some kind of load-balancing system (or maybe you have one already), and perhaps it can be configured to handle this situation better than Apache can.

I had a problem years ago where I wanted customers using a client-server system with a proprietary binary protocol to be able to access our servers on port 80 because they were continuously having problems with firewalls on the custom port that the system used. What I needed was a proxy that would live on port 80 and direct the traffic to either Apache or the app server depending on the first few bytes of what came across from the client. I looked for a solution and found nothing that fit. I considered writing an Apache module, a plugin for DeleGate, etc., but eventually rolled by own custom content-sensing proxy service. That, I think, is the worst-case scenario for what you're trying to do.

Tim Sylvester
A: 

To answer the specific question about mod-proxy: yes, you can setup mod_proxy to serve content that is generated by a server (or service) that is not public facing (i.e. which is only available via an internal address or localhost).

I've done this in a production environment and it works very, very well. Apache forwarding some requests to Tomcat via AJP workers, and others to a GIS application server via mod proxy. As others have pointed out, cross-site security may stop you working on a sub-domain, but there is no reason why you can't proxy requests to mydomain.com/application


To talk about your specific problem - I think really you are getting bogged down in looking at the problem as "long lived requests" - i.e. assuming that when you make one of these requests that's it, the whole process needs to stop. It seems as though your are trying to solve an issue with application architecture via changes to system architecture. In-fact what you need to do is treat these background requests exactly as such; and multi-thread it:

  • Client makes the request to the remote service "perform task X with data A, B and C"
  • Your service receives the request: it passes it onto a scheduler which issues a unique ticket / token for the request. The service then returns this token to the client "thanks, your task is in a queue running under token Z"
  • The client then hangs onto this token, shows a "loading/please wait" box, and sets up a timer that fires say, for arguments, every second
  • When the timer fires, the client makes another request to the remote service "have you got the results for my task, it's token Z"
  • You background service can then check with your scheduler, and will likely return an empty document "no, not done yet" or the results
  • When the client gets the results back, it can simply clear the timer and display them.

So long as you're reasonably comfortable with threading (which you must be if you've indicated you're looking at writing your own HTTP server, this shouldn't be too complex - on top of the http listener part:

  • Scheduler object - singleton object, really that just wraps a "First in, First Out" stack. New tasks go onto the end of the stack, jobs can be pulled off from the beginning: just make sure that the code to issue a job is thread safe (less you get two works pulling the same job from the stack).
  • Worker threads can be quite simple - get access to the scheduler, ask for the next job: if there is one then do the work send the results, otherwise just sleep for a period, start over.

This way, you're never going to be blocking Apache for longer than needs be, as all you are doing is issues requests for "do x" or "give me results for x". You'll probably want to build some safety features in at a few points - such as handling tasks that fail, and making sure there is a time-out on the client side so it doesn't wait indefinitely.

iAn
A: 

For number 2: you can get around crossdomain restrictions by using JSONP.

null_radix