ansaurus

Question

Answer 1

+5 A:

There are many ways to do this, all involving some form of inter-process communication. Which one you choose will depend on many factors, but some options are:

shared memory.
pipes (popen and such).
sockets.

In general, I would probably open a number of communication channels (pipes, for example) in the parent before spawning the children; the parent will know about all five, but each child can be configured to use only one.

Shared memory is also a possibility, although you'd probably have to have a couple of values in it per child to ensure communications went smoothly:

a value to store the variable and return value.
a value to store the state (0 = start, 1 = variable ready for child, 2 = variable ready for parent again).

In all cases, you need a way for the children to only pick up their values, not those destined for other children. That may be as simple as adding a value to the shared memory block to store the PID of the child. All children would scan every element in the block but would only process those where the state is 1 and the PID is their PID.

For example:

Main creates shared memory for five children. Each element has state, PID and value.
Main sets all states to "start".
Main starts five children who all attach to the shared memory.
Main stores all their PIDs.
All children start scanning the shared memory for state = "ready for child" and their PID.
Main puts in first element (state = "ready for child", PID = pid1, value = 7).
Main puts in second element (state = "ready for child", PID = pid5, value = 9).
Child pid1 picks up first element, changes value to 49, sets state to "ready for parent", goes back to monitoring.
Child pid5 picks up second element, changes value to 81, sets state to "ready for parent", goes back to monitoring.
Main picks up pid5's response, sets that state back to "start".
Main picks up pid1's response, sets that state back to "start".

This gives a measure of parallelism: each child continuously monitors the shared memory for work it's meant to do, while Main places the work there and periodically collects the results.

paxdiablo 2009-02-27 06:28:57

i'd be using shared memory

kylex 2009-02-27 06:30:04

You probably don't need to do the scanning; the children get a copy of the parent process data, so (unlike the vfork() example) the children could look at a counter that tells them which entry is theirs. You could preload the memory with the random values; then the children know the data is ready.

Jonathan Leffler 2009-02-27 07:59:58

That'll work if you just fork (with parent and child code in the same executable) but not if you exec to bring in the child: the process space is overwritten (including a counter). Not execing generally leaves a lot of dangerous things about that are shared between both forks. Still, it's an option.

paxdiablo 2009-02-27 11:00:38

Answer 2

+2 A:

The nastiest method uses vfork() and lets the different children trample on different parts of memory before exiting; the parent then just adds up the modified bits of memory.

Highly unrecommended - but about the only case I've come across where vfork() might actually have a use.

Just for amusement (mine) I coded this up:

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>
#include <sys/wait.h>

int main(void)
{
    int i;
    int array[5];
    int square[5];
    long sum = 0;

    srand(time(0));
    for (i = 0; i < 5; i++)
    {
        array[i] = rand();
        if (vfork() == 0)
        {
            square[i] = array[i] * array[i];
            execl("/bin/true", "/bin/true", (char *)0);
        }
        else
            wait(0);
    }

    for (i = 0; i < 5; i++)
    {
        printf("in: %d; square: %d\n", array[i], square[i]);
        sum += square[i];
    }
    printf("Sum: %ld\n", sum);
    return(0);
}

This works. The previous trial version using 'exit(0)' in place of 'execl()' did not work; the square array was all zeroes. Example output (32-bit compilation on Solaris 10, SPARC):

in: 22209; square: 493239681
in: 27082; square: 733434724
in: 2558; square: 6543364
in: 17465; square: 305026225
in: 6610; square: 43692100
Sum: 1581936094

Sometimes, the sum overflows - there is much room for improvement in the handling of that.

The Solaris manual page for 'vfork()' says:

Unlike with the fork() function, the child process borrows the parent's memory and thread of control until a call to execve() or an exit (either abnormally or by a call to _exit() (see exit(2)). Any modification made during this time to any part of memory in the child process is reflected in the parent process on return from vfork(). The parent process is suspended while the child is using its resources.

That probably means the 'wait()' is unnecessary in my code. (However, trying to simplify the code seemed to make it behave indeterminately. It is rather crucial that i does not change prematurely; the wait() does ensure that synchronicity. Using _exit() instead of execl() also seemed to break things. Don't use vfork() if you value your sanity - or if you want any marks for your homework.)

Jonathan Leffler 2009-02-27 06:43:05

Interesting. You also don't need `pid_t pid`, by the way. On Mac OS X, I get incorrect results, too. It printed 2 blocks of output, and some of the squares were obviously wrong. After the 2nd block of output, the process is hung (but taking no cycles). Haven't tried to debug it though.

Craig S 2009-02-27 07:17:25

@Craig S: yup - you're right about pid. It is a hangover from an experiment while debugging. I'll remove it. Fundamentally, vfork() is evil; I'm amused to have found a potential (but not good!) use for it.

Jonathan Leffler 2009-02-27 07:19:08

Yes, and vfork() is not very portable either. It is amusing to see it in action. Maybe tomorrow I'll come up with my own solution that works on Mac OS X. I also wonder what would happen if you called _exit(), not exit(), instead of exec'ing /bin/true.

Craig S 2009-02-27 07:22:40

I tried _exit() as noted; it didn't seem to work properly. That is, I got odd results:

in: 25671; square: 3
in: 27310; square: -2147485992
in: 14472; square: 4
in: 1771; square: -2147485976
in: 26575; square: 5
Sum: -4294971956

(That was using _exit() and wait().)

Jonathan Leffler 2009-02-27 07:35:04

On OS X, "true" is in "/usr/bin", hence the error. _exit(0) works as well though.

codelogic 2009-02-27 08:10:54

Ah, it was the end of my day and I never thought to check the location of `true` on OS X.

Craig S 2009-02-27 14:32:21

Answer 3

A:

Things like the anti thread might make this a little easier for you, see the examples (in particular the ns lookup program).

Tim Post 2009-02-27 07:57:39

multiple fork() question