Hi all, and thanks for taking a look at the question.

The background
I have several machines that continuously spawn multiple (up to 300) PHP console scripts in a very short time frame. These scripts run quickly (less than a second) and then exit. All of them need read-only access to a large trie structure, which would be very expensive to load into memory every time a script runs. The servers run Linux.

My solution
Create a C daemon that keeps the trie structure in memory and receives requests from the PHP clients. It would receive a request from each PHP client, perform the lookup on the in-memory structure, and respond with the answer, saving the PHP scripts from doing that work. Both requests and responses are short strings (no longer than 20 characters).

My problem
I am very new to C daemons and inter-process communication. After much research, I have narrowed the choices down to message queues and Unix domain sockets. Message queues seem adequate because I think (I may be wrong) that they queue up all of the requests so the daemon can answer them serially. Unix domain sockets seem to be easier to use, though. However, I have several questions I have not been able to find answers to:

  1. How can a PHP script send and receive messages, or use a Unix socket, to communicate with the daemon? Conversely, how does the C daemon keep track of which PHP process it has to send a reply to?
  2. Most examples of daemons I have seen use an infinite while loop with a sleep call inside. My daemon needs to service many connections that can come at any time, and response latency is critical. How would the daemon react if a PHP script sends a request while it is sleeping? I have read about poll and epoll; would one of these be the correct way to wait for a received message?
  3. Each PHP process will always send one request and then wait to receive a response. I need to make sure that if the daemon is down or unavailable, the PHP process waits for a response for a set maximum time and, if no answer is received, continues regardless instead of hanging. Can this be done?
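
For questions 2 and 3, one possible shape is to block in poll() instead of sleeping: the call returns as soon as data arrives, or when a timeout expires. The sketch below is only an illustration in C; the helper names wait_readable and read_with_timeout are made up, and the -2 return value for "timed out" is an arbitrary convention chosen here.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: block until fd has data (or was hung up) or until
   timeout_ms elapses. Returns 1 if a read will not block, 0 on timeout,
   -1 on error. A negative timeout_ms means "wait forever". */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    assert(fd >= 0);
    do {
        /* Note: retrying after a signal restarts the full timeout. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    return rc == 0 ? 0 : 1;
}

/* Hypothetical helper: read one short reply, giving up after timeout_ms.
   Returns the number of bytes read, 0 on end of file, -1 on error,
   or -2 if nothing arrived in time. */
ssize_t read_with_timeout(int fd, char *buf, size_t len, int timeout_ms)
{
    int ready = wait_readable(fd, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;
    return read(fd, buf, len);
}
```

A daemon could call wait_readable with a negative timeout so that it wakes only when a request is pending, while a client could use a small positive timeout and carry on when -2 comes back.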

The actual lookup in the data structure is very fast, so I don't need any complex multi-threading or similar solution; I believe handling the requests in a FIFO manner will be enough. I also need to keep it simple, stupid, as this is a mission-critical service and I am fairly new to this type of program. (I know, but I really have no way around this, and the learning experience will be great.)

I would really appreciate code snippets that shed some light on the specific questions I have. Links to guides and pointers that will further my understanding of this murky world of low-level IPC are also welcome.

Thanks for your help!

A: 

Although I have never tried it, memcached along with an appropriate PHP extension ought to eliminate most of the grunt work.

Clarification: I was implicitly assuming that if you did this, you would put the individual leaves of the trie into memcached using flattened keys, ditching the trie. The feasibility and desirability of this approach, of course, depend on many factors, first and foremost the data source.
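
To make "flattened keys" concrete, here is a rough C sketch that walks a small trie and emits one full key per stored value. Everything in it (the node layout, the lowercase-only alphabet, the emit callback, the 63-character key limit) is an assumption made for illustration; the real trie and the actual store calls would look different.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ALPHABET 26   /* assumption: keys use only 'a'..'z' */
#define MAX_KEY  64   /* assumption: keys longer than 63 chars are skipped */

struct trie_node {
    struct trie_node *child[ALPHABET];
    int has_value;
    int value;
};

typedef void (*emit_fn)(const char *key, int value, void *ctx);

struct trie_node *trie_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

int trie_put(struct trie_node *root, const char *key, int value)
{
    assert(root && key);
    for (; *key; key++) {
        int i = *key - 'a';
        if (i < 0 || i >= ALPHABET)
            return -1;
        if (!root->child[i] && !(root->child[i] = trie_new()))
            return -1;
        root = root->child[i];
    }
    root->has_value = 1;
    root->value = value;
    return 0;
}

/* Depth-first walk: buf holds the path from the root to the current node. */
static void walk(const struct trie_node *n, char *buf, size_t depth,
                 emit_fn emit, void *ctx)
{
    if (n->has_value) {
        buf[depth] = '\0';
        emit(buf, n->value, ctx);   /* one flat key/value pair per leaf */
    }
    if (depth + 1 >= MAX_KEY)
        return;
    for (int i = 0; i < ALPHABET; i++) {
        if (n->child[i]) {
            buf[depth] = (char)('a' + i);
            walk(n->child[i], buf, depth + 1, emit, ctx);
        }
    }
}

void trie_flatten(const struct trie_node *root, emit_fn emit, void *ctx)
{
    char buf[MAX_KEY];
    walk(root, buf, 0, emit, ctx);
}

/* Example emitter: appends "key=value;" to a buffer. In the memcached
   variant this is where each pair would be stored instead. */
struct out_buf { char text[256]; size_t len; };

void emit_to_buf(const char *key, int value, void *ctx)
{
    struct out_buf *o = ctx;
    size_t room = sizeof o->text - o->len;
    int n = snprintf(o->text + o->len, room, "%s=%d;", key, value);
    if (n > 0 && (size_t)n < room)
        o->len += (size_t)n;
}
```

Each PHP script would then fetch only the single flat key it needs, rather than deserializing the whole structure.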

Sinan Ünür
Just the cost of serializing / unserializing the huge data structure every time I retrieved it from Memcached is the reason I'm trying out this other solution, but thanks! It might be the final solution I will use.
Alex
Depends how big the trie is. By default, Memcached won't store an item larger than 1 MB. And then you've got the overhead of deserializing the object on every request.
Frank Farmer
+2  A: 

I suspect Thrift is what you want. You'd have to write a little glue code to do PHP <-thrift-> C++ <-> C, but that would probably be more robust than rolling your own.

scotchi
Thrift is definitely the industry standard nowadays — http://wiki.apache.org/thrift/PoweredBy — and was initially conceived for exactly this use (hooking up PHP to C daemons). If you're "very new to C daemons and inter process communication" you should definitely take a look; it'll make a nice, fast libevent-based C server for you and handle all the serialization between PHP and C.
cce
I took a look at Thrift, which was new to me, and I must say I'm impressed! It seems like it automatically generates a great daemon with all of the IPC taken care of and I only have to add the actual functionality! Impressive, thanks!
Alex
A: 

The "problem" (maybe not one?) is that a SysV MQ can have many producers and consumers at the same time. That is perfectly workable for what you're doing, but you don't really have an m:n producer:consumer need here; what you have is a request/response model.

You can get some strange hangups with SysV MQ as it is.
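
If you do go with a SysV queue, a common way to route each reply back to the right requester is the message type field: every request goes out with one fixed type, and the daemon sends its reply with the type set to the requester's pid, so each client receives only messages of its own type. A minimal sketch with C on both ends follows; the struct layout, function names, and the stand-in "lookup" (it just returns the key length) are all invented for illustration, and a PHP client would need to match whatever layout you choose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define REQUEST_TYPE 1L   /* every request is sent with this type */
#define MAX_TEXT 20       /* the question caps messages at 20 characters */

struct lookup_msg {
    long mtype;               /* REQUEST_TYPE, or the client's pid for a reply */
    long reply_to;            /* pid of the requesting client */
    char text[MAX_TEXT + 1];
};

/* msgsnd/msgrcv sizes exclude the leading mtype field. */
#define PAYLOAD_SIZE (sizeof(struct lookup_msg) - sizeof(long))

/* Client: queue a request. client_pid must differ from REQUEST_TYPE. */
int send_request(int qid, long client_pid, const char *key)
{
    struct lookup_msg m;

    assert(client_pid > REQUEST_TYPE);
    memset(&m, 0, sizeof m);
    m.mtype = REQUEST_TYPE;
    m.reply_to = client_pid;
    snprintf(m.text, sizeof m.text, "%s", key);
    return msgsnd(qid, &m, PAYLOAD_SIZE, 0);
}

/* Daemon: take the oldest pending request and address the reply
   to the client that sent it. */
int serve_one(int qid)
{
    struct lookup_msg m;
    size_t n;

    if (msgrcv(qid, &m, PAYLOAD_SIZE, REQUEST_TYPE, 0) < 0)
        return -1;
    m.text[MAX_TEXT] = '\0';
    n = strlen(m.text);                            /* stand-in for the trie lookup */
    snprintf(m.text, sizeof m.text, "%zu", n);
    m.mtype = m.reply_to;                          /* route the reply by pid */
    return msgsnd(qid, &m, PAYLOAD_SIZE, 0);
}

/* Client: fetch only the reply addressed to client_pid.
   flags is 0 to block, or IPC_NOWAIT to return at once. */
int recv_reply(int qid, long client_pid, char *out, size_t outsz, int flags)
{
    struct lookup_msg m;

    if (msgrcv(qid, &m, PAYLOAD_SIZE, client_pid, flags) < 0)
        return -1;
    m.text[MAX_TEXT] = '\0';
    snprintf(out, outsz, "%s", m.text);
    return 0;
}
```

Note that msgrcv itself has no timeout argument, so a bounded wait on the client side would need IPC_NOWAIT with retries or an alarm, which is one reason sockets may end up simpler here.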

First, are you sure that INET sockets aren't fast enough for you? A quick PHP example using Unix domain sockets is at http://us.php.net/socket-create-pair (just as a code example, of course; use socket_create() for the PHP endpoint).
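
On the C side, a Unix domain socket daemon can be quite small. The sketch below is an assumption-laden outline, not a finished daemon: the function names are made up, the "lookup" is a placeholder that returns the key length, and it assumes each short request arrives in a single read. The point it tries to show is that accept() hands the daemon one descriptor per client, so replying to the right process just means writing to the descriptor the request came from.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define MAX_MSG 20   /* the question caps requests and replies at 20 chars */

static int fill_addr(struct sockaddr_un *addr, const char *path)
{
    assert(path != NULL);
    if (strlen(path) >= sizeof addr->sun_path)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Daemon: create a listening socket bound to path. */
int make_listener(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (fill_addr(&addr, path) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    unlink(path);   /* drop a stale socket file left by a previous run */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 128) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Daemon: accept one client, answer its single request, hang up.
   accept() blocks until a client connects, so no sleep loop is needed. */
int serve_one(int listen_fd)
{
    char req[MAX_MSG + 1], resp[MAX_MSG + 1];
    ssize_t n;
    int ok = -1;
    int conn = accept(listen_fd, NULL, NULL);

    if (conn < 0)
        return -1;
    n = read(conn, req, MAX_MSG);
    if (n > 0) {
        req[n] = '\0';
        /* Placeholder for the real trie lookup: reply with the key length. */
        snprintf(resp, sizeof resp, "%zu", strlen(req));
        if (write(conn, resp, strlen(resp)) == (ssize_t)strlen(resp))
            ok = 0;
    }
    close(conn);
    return ok;
}

/* Client (C stand-in for the PHP script): connect and send one request.
   Returns the connected descriptor to read the reply from, or -1.
   If the daemon is not listening, connect() fails right away. */
int client_send(const char *path, const char *req)
{
    struct sockaddr_un addr;
    size_t len = strlen(req);
    int fd;

    if (len == 0 || len > MAX_MSG || fill_addr(&addr, path) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        write(fd, req, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A real daemon would call serve_one in a loop, which also gives you the serial, first-come-first-served handling you described.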

Xepoch
A: 

You could also load the data structure into shared memory using PHP's shared memory functions: http://www.php.net/manual/en/book.shmop.php

Oh, it's not obvious from the documentation but the coordinating variable is $key in shmop_open. Every process needing access to the shared memory should have the same $key. So, one process creates the shared memory with $key. The other processes then can access that shared memory if they use the same $key. I believe you can choose whatever you like for $key.
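
As far as I know, shmop sits on top of System V shared memory, so the same "everyone agrees on a key" idea can be sketched in C with shmget. This is only an illustration: the key value and function names are made up, and the segment here holds a plain byte string rather than a real trie.

```c
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical key: any value works as long as every process uses the same one. */
#define TRIE_SHM_KEY ((key_t)0x54524945)

/* Writer: create (or open) the segment for key and copy data into it.
   Returns the segment id, or -1 on failure. */
int shm_publish(key_t key, const void *data, size_t len)
{
    int id;
    void *p;

    assert(data != NULL && len > 0);
    id = shmget(key, len, IPC_CREAT | 0644);
    if (id < 0)
        return -1;
    p = shmat(id, NULL, 0);
    if (p == (void *)-1)
        return -1;
    memcpy(p, data, len);
    shmdt(p);
    return id;
}

/* Reader: attach read-only to the existing segment for key.
   Returns a pointer to the shared bytes, or NULL if no such segment exists. */
const void *shm_attach_ro(key_t key)
{
    void *p;
    int id = shmget(key, 0, 0);   /* size 0, no IPC_CREAT: open existing only */

    if (id < 0)
        return NULL;
    p = shmat(id, NULL, SHM_RDONLY);
    return p == (void *)-1 ? NULL : p;
}
```

One process would call shm_publish once; every other process then calls shm_attach_ro with the same key and reads the data in place.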

Fredrick Pennachi