There is a set of N real-time, data-independent tasks. Each task processes a digital data stream: the stream arrives on an input port, and the processed stream is then directed to an output port.

1) Which is computationally less expensive: implementing the tasks as processes or as threads?
2) Does the best choice depend on the number of physical CPU cores available?

Thank you in advance!

+3  A: 

Threads should have slightly less overhead, since they share the same memory space within a process; however, this overhead may be small enough not to affect your program. The only way to know for certain which will be better for you is to try both and measure how long each takes.

tloach
They share the same memory space, but the operating system may schedule all of a process's threads on a single physical CPU, which means that in a multi-CPU configuration some CPUs can stay idle. But I am not sure; maybe modern Linux NUMA scheduling provides better load balancing.
psihodelia
If the streams are independent, then threads carry the overhead of coordinating resources within a single process, coordination that the OS takes care of for you when you use separate processes.
Jonathan Leffler
A: 

Threads.

They reduce many kinds of overhead, such as deployment. They are also more future-proof, because at some point you may want the tasks to interact (write to a common log, for example, or share a management console).

Pavel Radzivilovsky
Threads also allow more subtle bugs, particularly if you want them to interact.
David Thornley
+2  A: 

There's no one answer to this question. Just to give two examples: Linux tends to use separate processes quite a bit. The kernel developers appear to have put considerable effort into optimizing process switching. Threads were added quite a bit later, and don't seem to have received nearly as much attention. As a result, under Linux separate processes have a fairly low cost, and using threads instead doesn't save a huge amount.

Current implementations of Windows, by contrast, go back to Windows NT. Windows NT was based fairly closely on OS/2, which had threads from day one. Process switching does not seem to have ever been optimized to the same degree as under Linux. As you might expect from that background, under Windows the difference between a process switch and a thread switch is much larger. Consequently, you gain considerably more by writing code in multiple threads instead of multiple processes.

Jerry Coffin
Can you please prove your statements? Maybe a web-link?
psihodelia
Which statements do you think need proving? Benchmarks of thread switching speed are generally on different sites than those covering the heritage of the Windows product line.
Jerry Coffin
For a discussion of threads vs processes, read 'The Art of Unix Programming' by E S Raymond.
Jonathan Leffler
@Jonathan Leffler: That certainly has a discussion on the subject. OTOH, as sources of factual information go, I'd rank Eric S. Raymond's writing about even with "something I'm sure I heard somebody say around 1 AM the last time I went bar-hopping..."
Jerry Coffin
A: 

1) Which is computationally less expensive: tasks as processes or as threads?

Usually the cost is the same. Only under very exotic conditions may threads be less expensive.

2) Does the best choice depend on the number of physical CPU cores available?

Two or more threads running on different CPU cores can be much slower than on a single-core system. This is an issue of super-scalar architectures.

Processes are the best choice 99.9% of the time.

Also, a threaded environment is harder to debug.

vitaly.v.ch
"Two or more threads running on different cpu cores" - maybe you mean two different physical CPUs?
psihodelia
No, on the x86 architecture they are effectively identical.
vitaly.v.ch
+1  A: 

If your work units are CPU-bound, it's best to use processes, because typically all threads in a process run on the same core (at least in Linux). Using a process per CPU core will ensure each process gets an actual CPU core.

The advantage of threads is that it's very easy to share state between them, so if all you really need is to download a file without blocking, or to send a query to a database, threads will be the better option.

In fact, I am just in the middle of writing something you may find useful: a manager that spawns a few worker processes and restarts them if they exit, are killed, or die. It's pretty much ready, so here it is:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <sys/wait.h>
#include <string>

void termination_handler(int signum)
{
    printf("pid %d : signal %d caught\n", getpid(), signum);
    kill(0, SIGTERM);   /* forward the signal to the whole process group */
    exit(1);
}

int main(int argc, char** argv)
{
    if (argc > 1)
    {
        printf("pid %d : worker code running\n", getpid());
        sleep(10);
    }
    else
    {
        printf("pid %d : manager started\n", getpid());
        const int MAX_INSTANCES = 3;
        int numInstances = 0;

        struct sigaction new_action;
        /* Set up the structure to specify the new action.
           Note: SIGKILL can be neither caught nor handled, so only
           SIGTERM and SIGINT are registered here. */
        new_action.sa_handler = termination_handler;
        sigemptyset(&new_action.sa_mask);
        new_action.sa_flags = 0;
        sigaction(SIGTERM, &new_action, NULL);
        sigaction(SIGINT, &new_action, NULL);

        int status;
        int w;
        do
        {
            /* Top up the pool of workers. */
            while (numInstances < MAX_INSTANCES)
            {
                int pid = fork();
                if (pid < 0)
                {
                    printf("fork failed\n");
                    exit(1);
                }
                else if (pid == 0)
                {
                    /* Child: re-exec ourselves with a "worker" argument. */
                    char * const argv1[] = { (char*) argv[0], (char*) "worker", (char*) 0 };
                    char * const envp1[] = { (char*) 0 };
                    std::string prog = argv[0];
                    execve(prog.c_str(), argv1, envp1);
                    perror("execve");   /* reached only if execve fails */
                    _exit(1);
                }
                else
                {
                    numInstances++;
                }
            }

            /* Wait for any child in our process group to change state. */
            w = waitpid(0, &status, WUNTRACED | WCONTINUED);
            if (w == -1)
            {
                perror("waitpid");
                exit(EXIT_FAILURE);
            }

            if (WIFEXITED(status))
            {
                printf("pid %d : child %d exited, status=%d\n", getpid(), w, WEXITSTATUS(status));
                numInstances--;
            }
            else if (WIFSIGNALED(status))
            {
                printf("pid %d : child %d killed by signal %d\n", getpid(), w, WTERMSIG(status));
                numInstances--;
            }
            else if (WIFSTOPPED(status))
            {
                printf("pid %d : child %d stopped by signal %d\n", getpid(), w, WSTOPSIG(status));
            }
            else if (WIFCONTINUED(status))
            {
                printf("pid %d : child %d continued\n", getpid(), w);
            }
        } while (true);   /* loop forever; terminated via signal handler */

        printf("pid %d : manager terminated\n", getpid());
    }
    return 0;
}
Omry
The linux comment is antiquated. This was true only prior to 2.5/2.6 kernels (late 2003, early 2004). NPTL provides a 1-to-1 relationship between threads and tasks in the scheduler. Also, it is not simply a matter of whether or not the application is CPU bound; it is a balance between being CPU bound and the coordination that must happen between the independently running pieces of code.
charstar
+2  A: 

If you don't actually need to share large amounts of state between the tasks, then processes are likely to be the better option. The tighter coupling between the tasks implied by threading will make it much harder to give real-time guarantees - locking makes it much harder to prove real-time behaviour.

In short, processes should be your default option; switch to threads only if you have identified a definite reason to do so.

caf
+1  A: 

I agree with the answers that say there is no "right" answer. It very much depends on the application requirements. My personal choice would be threads if there is no obvious choice.

But I wondered about the efficiency aspect and so wrote a really simple stupid test before going home last night. The "application" just counted the number of primes using simple division checks. So it was purely CPU bound with no contention for other shared resources. Thus, it really only focused on the cost of context switches. Running 64 instances (threads/processes) on a quad core 1.86GHz Xeon showed no difference. Each task counted the primes up through 10,000,000. The total time was 333 seconds in both cases.

As I bicycled home, it occurred to me that since there was no resource contention, the context switches would be minimal; every thread/process would run for its full time slice. So I artificially forced context switches every 100 iterations using Sleep(0), which I believe did what I intended (Win32). After that, the same test took 343 seconds for the thread version and 350 seconds for the process version. So the thread version showed about a 1.7% speedup. Not really anything to write home about.

Mark Wilkins