How does Linux determine the next PID it will use for a process? The purpose of this question is to better understand the Linux kernel. Don't be afraid to post kernel source code. If PIDs are allocated sequentially, how does Linux fill in the gaps? What happens when it hits the end of the range?

For example, if I run a PHP script from Apache that does a <?php print(getmypid());?>, the same PID will be printed for a few minutes while I hit refresh. How long this lasts depends on how many requests Apache is receiving. Even if there is only one client, the PID will eventually change.

When the PID changes, it will be a close number, but how close? The number does not appear to be entirely sequential. If I do a ps aux | grep apache I get a fair number of processes:

[screenshot: output of ps aux | grep apache]

How does Linux choose this next number? The previous few PIDs are still running, as well as the most recent PID that was printed. How does Apache choose to reuse these PIDs?

+7  A: 

Hi "The Rook".

I would rather assume that the behavior you are observing has a different cause:

Good web servers usually run several process instances to balance the load of the requests. These processes are managed in a pool, and one of them is assigned to each request as it comes in. To optimize performance, Apache probably assigns the same process to a series of sequential requests from the same client. After a certain number of requests, that process is terminated and a new one is created.

I don't believe that Linux assigns the same PID to several processes in a row.

As you say that the new PID is going to be close to the last one, I guess Linux simply assigns each process the last PID + 1. But other applications and system programs create and terminate processes in the background all the time, so you cannot predict the exact PID of the next Apache process to be started.

Apart from this, you should not base anything you implement on assumptions about PID assignment. (See also sanmai's comment.)

chiccodoro
I think this is partially correct. Unfortunately, you have no evidence to support this answer.
Rook
Now I do have, see the other answers. :-)
chiccodoro
btw i was the (+1)
Rook
@Rook: If you really need *definitive* proof that PIDs are allocated sequentially, take a look at [alloc_pidmap()](http://lxr.free-electrons.com/source/kernel/pid.c#L125) in the latest Linux kernel tree.
Karmastan
PIDs can be allocated randomly. There are a number of extensions and patches to accomplish that. Don't count on sequential PIDs.
sanmai
+5  A: 

PIDs are sequential. You can see that by starting several processes yourself on an idle machine.

FractalizeR
This does not appear to be true.
Rook
@The Rook: Why?
chiccodoro
@chiccodoro screenshot posted.
Rook
@The Rook it only appears so. If you have process 1234, maybe the next process _you_ create gets 1245. That means some other processes were started in the meantime (and have since died), e.g. a new mysql thread got created, some system/cron/whatever process got run, some php page ran 10 external commands, etc. Your screenshot only says that in between apache starting some processes, the system started other processes, or maybe you're running apache in multithreaded mode, with some of the threads getting the "missing" ids. pid allocation is system wide.
nos
@The Rook: You should review my answer which explains why your numbers are not sequential
chiccodoro
http://stackoverflow.com/questions/822797/about-the-pid-of-the-process
FractalizeR
+1 good link...
Rook
+2  A: 

PIDs can be allocated randomly. There are a number of ways to accomplish that.

sanmai
+3  A: 

The kernel allocates PIDs below pid_max, which defaults to PID_MAX_DEFAULT. It does so sequentially in each namespace (tasks in different namespaces can have the same IDs). When the range is exhausted, PID assignment wraps around and restarts at RESERVED_PIDS rather than at the very bottom of the range.
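To see the upper bound on a running system, the current limit can be read from /proc/sys/kernel/pid_max. The following is a minimal sketch, not kernel code; the helper name read_pid_max is made up for illustration, and it simply reports -1 where the file is not available.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of any library): returns the current
 * value of /proc/sys/kernel/pid_max, or -1 if the file cannot be read
 * or parsed (for example on a non-Linux system).
 */
long read_pid_max(void)
{
    FILE *f = fopen("/proc/sys/kernel/pid_max", "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;   /* unexpected file contents */
    fclose(f);
    return value;
}
```

Calling read_pid_max() from main() and printing the result with printf("%ld\n", ...) shows the value above which the allocator wraps around.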

Some relevant code:

Inside alloc_pid(...):

/* Allocate one number per namespace level, from the task's own namespace up to the root. */
for (i = ns->level; i >= 0; i--) {
    nr = alloc_pidmap(tmp);
    if (nr < 0)
        goto out_free;
    pid->numbers[i].nr = nr;
    pid->numbers[i].ns = tmp;
    tmp = tmp->parent;
}

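alloc_pidmap() (abridged):

static int alloc_pidmap(struct pid_namespace *pid_ns)
{
        int i, offset, max_scan, pid, last = pid_ns->last_pid;
        struct pidmap *map;

        /* Start with the PID right after the last one handed out. */
        pid = last + 1;
        /* Past the end of the range: wrap around, skipping the reserved low PIDs. */
        if (pid >= pid_max)
                pid = RESERVED_PIDS;
        /* and later on... */
        pid_ns->last_pid = pid;
        return pid;
}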
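To make the gap-filling and wrap-around behavior concrete, here is a toy user-space model of the idea. It is a sketch under simplifying assumptions, not the kernel's implementation: the names toy_pidns, toy_alloc_pid and toy_free_pid are invented for illustration, and a plain bool array stands in for the kernel's bitmap.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one PID namespace; illustrative only, not a kernel type. */
struct toy_pidns {
    bool *in_use;   /* in_use[p] is true while PID p is allocated */
    int pid_max;    /* PIDs are always allocated below this value */
    int reserved;   /* after a wrap-around, the search restarts here */
    int last_pid;   /* most recently allocated PID */
};

/* Returns the next free PID after last_pid, or -1 if every PID is taken. */
int toy_alloc_pid(struct toy_pidns *ns)
{
    assert(ns->reserved > 0 && ns->reserved < ns->pid_max);

    int pid = ns->last_pid + 1;
    if (pid >= ns->pid_max)
        pid = ns->reserved;

    /* Bounded scan: pid_max steps are enough to try every candidate. */
    for (int scanned = 0; scanned < ns->pid_max; scanned++) {
        if (!ns->in_use[pid]) {
            ns->in_use[pid] = true;
            ns->last_pid = pid;
            return pid;
        }
        if (++pid >= ns->pid_max)
            pid = ns->reserved;   /* wrap around */
    }
    return -1;
}

/* Marks a PID as free again, e.g. after the process has been reaped. */
void toy_free_pid(struct toy_pidns *ns, int pid)
{
    ns->in_use[pid] = false;
}
```

In this model a freed PID is not reused immediately. The search always continues upward from last_pid, so a gap is only filled once the allocator has wrapped around and reaches it again.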

Do note that PIDs in the context of the kernel are more than just int identifiers; the relevant structure can be found in include/linux/pid.h. Besides the ID, it contains a list of tasks with that ID, a reference counter and a hash list node for fast lookup.

PIDs do not appear sequential from user space because other processes may be scheduled and call fork() in between your process's fork() calls. That is very common, in fact.
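One way to observe this yourself is to fork a few short-lived children in a row and compare their PIDs. The sketch below assumes a POSIX system; spawn_children is a made-up helper name. On an idle machine the values are usually consecutive, and any gaps correspond to processes or threads that something else created in the meantime.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: forks up to `count` children that exit at once,
 * stores their PIDs in pids[], and returns how many were started.
 * The children are reaped only after all forks are done, so none of
 * their PIDs can be handed out again while the loop is running.
 */
int spawn_children(pid_t *pids, int count)
{
    int started = 0;

    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;        /* fork failed; stop early */
        if (pid == 0)
            _exit(0);     /* child: nothing to do */
        pids[started++] = pid;
    }

    /* Reap the children so no zombies are left behind. */
    for (int i = 0; i < started; i++)
        waitpid(pids[i], NULL, 0);

    return started;
}
```

Printing the collected values from main(), for example with printf("%ld\n", (long)pids[i]), while the machine is busy makes the gaps easy to spot.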

Michael Foukarakis
+1 you are a bad ass, so I gave you something else...
Rook