views: 923

answers: 3

I have created a client-server program in Perl using IO::Socket::INET. I access the server through a CGI-based site. The server runs as a daemon and accepts multiple simultaneous connections. The server process consumes about 100MB of memory (9 large hashes, among other data). I want these hashes to stay in memory and be shared, so that I don't have to recreate them for every connection; building them takes 10-15 seconds.

Whenever a new connection is accepted on the socket, I fork a new process to handle it. Since the parent process is huge, every fork has to allocate and copy the parent's memory for the child, and with limited memory this makes spawning a child very slow, increasing the response time. Often it hangs even on a single connection.

The parent process creates the 9 large hashes. Each child needs read-only access to one or more of them; the children never update the hashes. I want something like copy-on-write, so that the whole 100MB of global data created by the parent is shared with all children, or some other mechanism such as threads. I expect the server to receive at least 100 requests per second and to process them all in parallel; on average, a child exits within 2 seconds.

I am using Cygwin on Windows XP with only 1GB of RAM, and I cannot find a way around this issue. Can you suggest something? How can I share the variables, and also create, manage and synchronize 100 child processes per second?

Thanks.

+3  A: 

Instead of forking, there are two other approaches to handling concurrent connections: threads or polling.

In the threaded approach, a new thread is created for each connection to handle that socket's I/O. A thread runs in the same virtual memory as the creating process and can access all of its data. Make sure to use locks to properly synchronize any write access to shared data.
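To illustrate the shared-memory point, here is a minimal sketch using Perl's ithreads with threads::shared (requires a Perl built with thread support; the hash contents are placeholders standing in for your large lookup tables):

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# The table lives once in shared memory; every worker thread reads it
# instead of getting a private copy.
my %prices :shared = (apple => 3, pear => 5);

my @workers = map {
    my $item = $_;
    threads->create(sub {
        # Reads of a hash that is never modified need no lock;
        # writes would require lock(%prices).
        return $prices{$item};
    });
} qw(apple pear);

my @results = map { $_->join } @workers;
print "@results\n";   # 3 5
```

Note that without the :shared attribute, ithreads copy all data into each new thread, which would reproduce exactly the memory problem you have with fork.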

An even more efficient approach is polling via select(). In this case a single process/thread handles all sockets. It works under the assumption that most of the work is I/O-bound, so the time that would otherwise be spent waiting on one socket is spent servicing the others.
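A minimal sketch of that loop using IO::Select (the port is picked by the OS, and a throwaway client is forked just to exercise it; a real server would loop forever instead of stopping after one reply):

```perl
use strict;
use warnings;
use IO::Select;
use IO::Socket::INET;

# One process, many sockets: select() tells us which handles are ready,
# so the large hashes are never duplicated across children.
my $listener = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,              # let the OS pick a free port
    Listen    => 5,
    Reuse     => 1,
) or die "listen: $!";

# Demo client in a child process, only to drive the loop below.
my $pid = fork();
die "fork: $!" unless defined $pid;
if ($pid == 0) {
    my $c = IO::Socket::INET->new(
        PeerAddr => '127.0.0.1',
        PeerPort => $listener->sockport,
    ) or die "connect: $!";
    print {$c} "ping\n";
    print "client got: ", scalar <$c>;
    exit 0;
}

my $sel = IO::Select->new($listener);
my $last_reply;
until (defined $last_reply) {
    for my $fh ($sel->can_read) {
        if ($fh == $listener) {
            # scalar forces accept() to return just the new socket
            $sel->add(scalar $listener->accept);
        }
        else {
            my $line = <$fh>;
            next unless defined $line;
            $last_reply = "pong: $line";   # lookups into the big hashes go here
            print {$fh} $last_reply;
            $sel->remove($fh);
            close $fh;
        }
    }
}
waitpid($pid, 0);
```

The key property for your case: all 9 hashes exist exactly once, in the one process that owns the loop.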

Go research further on those two options and decide which one suits you best.

See for example: http://www.perlfect.com/articles/select.shtml

+1  A: 

If you have that much data, I wonder why you don't simply use a database?
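Even without a full RDBMS, a tied DBM file gets you most of the benefit: build the table once on disk, and each child ties it read-only in milliseconds instead of rebuilding it. A sketch using the core SDBM_File module (the file name is made up for the demo; DB_File or a real database scales better for 100MB):

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Spec;

# Hypothetical demo path; SDBM_File ships with Perl itself.
my $file = File::Spec->catfile(File::Spec->tmpdir, "lookup_demo_$$");

# Build the table once (this is the slow 10-15 second step)...
tie my %build, 'SDBM_File', $file, O_RDWR|O_CREAT, 0644 or die "tie: $!";
%build = (apple => 3, pear => 5);
untie %build;

# ...then each child only ties it read-only, which is nearly instant.
tie my %lookup, 'SDBM_File', $file, O_RDONLY, 0644 or die "tie: $!";
my $price = $lookup{apple};
print "apple => $price\n";
untie %lookup;
unlink "$file.$_" for qw(pag dir);   # SDBM stores data in .pag/.dir files
```

Note SDBM has a size limit of roughly 1KB per key/value pair, so for large records prefer DB_File or SQLite via DBI.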

innaM
+2  A: 

This architecture is unsuitable for Cygwin. Forking on real unix systems is cheap, but on fake unix systems like Cygwin it's terribly expensive, because all data has to be copied (real unices use copy-on-write). Using threads changes the memory usage pattern (higher base usage, but smaller increase per thread), but odds are it will still be inefficient.

I would advise you to use a single-process polling approach, possibly with non-blocking I/O as well.

Leon Timmermans
Isn't the emulated Windows version of Perl's fork a thread in the same process? This bit me recently and that's what I recall reading.
brian d foy
Yes, they are both implemented as Win32 threads, but a fake process isn't implemented the same way as an ithread from a Perl perspective.
Leon Timmermans