All thread-creation functions, such as pthread_create() or CreateThread() on Windows, expect the caller to pass a pointer to the arg for the thread. Isn't this inherently unsafe?

This can work 'safely' only if the arg is on the heap, but allocating it there adds the overhead of freeing the memory afterwards. If a stack variable is passed as the arg, the result is at best unpredictable.

This looks like a half-cooked solution to me, or am I missing some subtle aspect of the APIs?

+1  A: 

Allocation on the heap does not add a lot of overhead.

Besides the heap and the stack, global variable space is another option. Also, it's possible to use a stack frame that will last as long as the child thread. Consider, for example, local variables of main.

I favor putting the arguments to the thread in the same structure as the pthread_t object itself. So wherever you put the pthread record, put its arguments as well. Problem solved :v) .
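
A minimal sketch of that layout, assuming a hypothetical Worker record (the names Worker, workerMain and runWorkers are mine, not part of any API). The handle and the arguments share one object, so the arguments live exactly as long as the record does:

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical record: the thread handle and its arguments live together,
// so whoever owns the record also owns the arguments.
struct Worker
{
    pthread_t thread;   // handle filled in by pthread_create
    int       input;    // argument read by the thread
    int       output;   // result written by the thread
};

extern "C" void* workerMain(void* arg)
{
    Worker* self = static_cast<Worker*>(arg);
    self->output = self->input * 2;
    return nullptr;
}

// Starts `count` workers from a caller-owned array and joins them all.
// Returns the sum of the outputs.
int runWorkers(Worker* workers, int count)
{
    for (int i = 0; i < count; ++i)
    {
        workers[i].input = i;
        int rc = pthread_create(&workers[i].thread, nullptr, workerMain, &workers[i]);
        assert(rc == 0);
    }

    int sum = 0;
    for (int i = 0; i < count; ++i)
    {
        pthread_join(workers[i].thread, nullptr);   // after this, output is safe to read
        sum += workers[i].output;
    }
    return sum;
}
```

The array passed to runWorkers can be global, on the heap, or a local of a frame that stays alive until the joins return. The thread code does not care which.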

Potatoswatter
That's exactly my concern: pointers to stack variables should not be allowed as the arg for thread-creation APIs, because their lifetime cannot be guaranteed to exceed or even overlap the thread's lifetime. So why accept a pointer at all? Why not just let the programmer use shared global variables?
s_b
@s_b: How do you know the stack variables aren't guaranteed to outlive their use by the child thread? A programmer could have the parent thread wait for the child thread to start before it will return from the stack frame in which it created the child...
Borealid
The library doesn't stop you from using shared global variables; it tries to give you the most flexibility. And how would the API know which variable you mean if you don't pass it in? How would it modify the value if it doesn't have a reference to the variable? Passing a pointer to a structure is also lighter than copying objects around, and structures can be extended should the need arise.
dirkgently
@s_b: look at it from another point of view - the thread's point of view. How does it know which global variable it should use if the parent thread cannot tell it which one to use?
Jonathan Leffler
@s_b, Don't forget that it is possible that a single thread function is used to create many threads. A pool of worker threads would work this way. The `void *` lets you tell each worker some fact that helps it act differently from its siblings.
RBerteig
+5  A: 

Context.

Many C APIs provide an extra void * argument so that you can pass context through third-party APIs. Typically you pack some information into a struct and point this argument at the struct, so that when the thread initializes and begins executing it has more information than just the function it was started with. There's no necessity to keep this information at the location given. For instance, you might have several fields that tell the newly created thread what it will be working on and where it can find the data it will need. Furthermore, there's no requirement that the void * actually be used as a pointer: it's a typeless argument with the most appropriate width on a given architecture (pointer width), so anything that fits can be handed to the new thread. For instance, you might pass an int directly if sizeof(int) <= sizeof(void *): (void *)3.
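
Both uses can be sketched with pthreads as below. The Job struct and the function names are hypothetical, and the integer round trip through intptr_t is implementation-defined, although it works on the usual platforms:

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>

// Hypothetical context: tells the thread what to work on and where it is.
struct Job
{
    const int* data;
    int        length;
    long       sum;     // filled in by the thread
};

// Variant 1: the void* really is a pointer to a context struct.
extern "C" void* sumJob(void* arg)
{
    Job* job = static_cast<Job*>(arg);
    job->sum = 0;
    for (int i = 0; i < job->length; ++i)
        job->sum += job->data[i];
    return nullptr;
}

// Variant 2: the void* is only a pointer-sized value; nothing is dereferenced.
extern "C" void* squareId(void* arg)
{
    std::intptr_t id = reinterpret_cast<std::intptr_t>(arg);
    return reinterpret_cast<void*>(id * id);   // the result also travels by value
}

long runSumJob(const int* data, int length)
{
    Job job = { data, length, 0 };
    pthread_t t;
    int rc = pthread_create(&t, nullptr, sumJob, &job);
    assert(rc == 0);
    pthread_join(t, nullptr);   // job lives on this frame, so join before returning
    return job.sum;
}

std::intptr_t runSquareId(std::intptr_t id)
{
    pthread_t t;
    int rc = pthread_create(&t, nullptr, squareId, reinterpret_cast<void*>(id));
    assert(rc == 0);
    void* result = nullptr;
    pthread_join(t, &result);
    return reinterpret_cast<std::intptr_t>(result);
}
```

The second variant has no lifetime question at all, because the thread receives a copy of the value rather than the address of anything.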

As a related example of this style: a FUSE filesystem I'm currently working on starts by opening a filesystem instance, say struct MyFS. When running FUSE in multithreaded mode, threads arrive at a series of FUSE-defined calls for handling open, read, stat, etc. Naturally these can have no advance knowledge of the specifics of my filesystem, so the instance is passed in the void * argument of fuse_main intended for this purpose: struct MyFS *blah = myfs_init(); fuse_main(..., blah);. When the threads then arrive at the FUSE calls mentioned above, the void * received is converted back into a struct MyFS * so that the call can be handled within the context of the intended MyFS instance.

Matt Joiner
nicely put, thanks
s_b
A: 

This is a common idiom in all C programs that use function pointers, not just for creating threads.

Think about it. Suppose your function void f(void (*fn)()) simply calls into another function. There's very little you can actually do with that. Typically a function pointer has to operate on some data. Passing in that data as a parameter is a clean way to accomplish this, without, say, the use of global variables. Since the function f() doesn't know what the purpose of that data might be, it uses the ever-generic void * parameter, and relies on you the programmer to make sense of it.

If you're more comfortable with thinking in terms of object-oriented programming, you can also think of it like calling a method on a class. In this analogy, the function pointer is the method and the extra void * parameter is the equivalent of what C++ would call the this pointer: it provides you some instance variables to operate on.
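
The same idiom without any threads, as a minimal sketch. The names forEach, Visitor and Accumulator are made up for illustration; the point is that the generic helper never learns what the context means:

```cpp
#include <cassert>
#include <cstddef>

// Callback type: the helper hands `context` back untouched on every call.
typedef void (*Visitor)(int value, void* context);

// Generic helper in the style described above. It knows nothing about
// what `context` points to.
void forEach(const int* values, std::size_t count, Visitor fn, void* context)
{
    assert(fn != nullptr);
    for (std::size_t i = 0; i < count; ++i)
        fn(values[i], context);
}

// Hypothetical "instance" that plays the role of `this`.
struct Accumulator
{
    long total;
};

// The "method": recovers its instance from the void* and works on it.
void addTo(int value, void* context)
{
    Accumulator* self = static_cast<Accumulator*>(context);
    self->total += value;
}

long sumWithCallback(const int* values, std::size_t count)
{
    Accumulator acc = { 0 };
    forEach(values, count, addTo, &acc);
    return acc.total;
}
```

Without the context parameter, addTo would have to reach for a global, and two accumulators could not be used at the same time.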

asveikau
A: 

The pointer is a pointer to the data that you intend to use in the function. Windows style APIs require that you give them a static or global function.

Often this is a pointer to the object you intend to use (a pointer to this, or pThis if you will), and the intention is that you delete pThis once the thread has ended.

It's a very procedural approach, but it has a big advantage that is often overlooked: the C-style CreateThread API is binary compatible, so you can actually wrap it with a C++ class (or almost any other language). If the parameter were typed, you wouldn't be able to access it from another language as easily.
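
A rough sketch of such a wrapper, written against pthread_create rather than CreateThread so that it stays portable (that substitution, the Counter class and countTo are my assumptions). Strictly speaking a static member function has C++ linkage rather than C linkage, but mainstream compilers accept it here:

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical wrapper: the C API only ever sees a plain function and a
// void*, and the void* carries `this`.
class Counter
{
public:
    explicit Counter(int limit) : limit_(limit), count_(0) {}

    void start()
    {
        int rc = pthread_create(&thread_, nullptr, &Counter::trampoline, this);
        assert(rc == 0);
    }

    void join() { pthread_join(thread_, nullptr); }

    int count() const { return count_; }

private:
    // Static member: no hidden `this`, so it matches the callback signature.
    static void* trampoline(void* self)
    {
        static_cast<Counter*>(self)->run();
        return nullptr;
    }

    void run()
    {
        for (int i = 0; i < limit_; ++i)
            ++count_;
    }

    pthread_t thread_;
    int       limit_;
    int       count_;
};

int countTo(int limit)
{
    Counter c(limit);
    c.start();
    c.join();      // the object must outlive the thread that uses it
    return c.count();
}
```

Here the object lives on the caller's stack and is joined before it goes out of scope. With a heap-allocated object, the thread or its owner would delete it after the join instead.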

So yes, this is unsafe but there's a good reason for it.

Rick
and use one of the friendlier alternatives like std::thread, boost::thread, TBB or the Concurrency Runtime if you are on Windows.
Rick
+2  A: 

Isn't this inherently unsafe?

No. It is a pointer. Since you (as the developer) have created both the function that will be executed by the thread and the argument that will be passed to the thread you are in full control. Remember this is a C API (not a C++ one) so it is as safe as you can get.

This can work 'safely' only if the arg is in the heap,

No. It is safe as long as the argument stays alive in the parent thread for at least as long as the child thread can use it. There are many ways to make sure that it lives long enough.

and then again creating a heap variable adds to the overhead of cleaning the allocated memory up.

Seriously, that's an argument? This is basically how it is done for all threads, unless you are passing something much simpler such as an integer (see below).
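
To show how small the clean-up burden can be, here is a minimal sketch in which the child takes ownership of a heap-allocated argument block. Request, handleRequest and runRequest are hypothetical names, and std::unique_ptr assumes C++11:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <pthread.h>

// Hypothetical argument block: allocated by the parent, freed by the child.
struct Request
{
    int input;
};

extern "C" void* handleRequest(void* arg)
{
    // Taking ownership first means the block is freed on every exit path.
    std::unique_ptr<Request> req(static_cast<Request*>(arg));
    assert(req != nullptr);

    std::intptr_t answer = req->input + 1;
    return reinterpret_cast<void*>(answer);   // small result returned by value
}

// Returns input + 1 as computed by the child, or -1 if no thread could be started.
int runRequest(int input)
{
    std::unique_ptr<Request> req(new Request{ input });

    pthread_t t;
    if (pthread_create(&t, nullptr, handleRequest, req.get()) != 0)
        return -1;      // creation failed: req is still ours and is freed here
    req.release();      // creation succeeded: the child owns the block now

    void* result = nullptr;
    pthread_join(t, &result);
    return static_cast<int>(reinterpret_cast<std::intptr_t>(result));
}
```

The parent never frees the block after a successful start, so it does not matter how long the parent's stack frame lives relative to the child.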

If a stack variable is provided as the arg then the result is at best unpredictable.

It's as predictable as you (the developer) make it. You created both the thread and the argument. It is your responsibility to make sure that the lifetime of the argument is appropriate. Nobody said it would be easy.

This looks like a half-cooked solution to me, or am I missing some subtle aspect of the APIs?

You are missing that this is the most basic of threading APIs. It is designed to be as flexible as possible, so that safer systems can be built on top of it with as few strings attached as possible. So we now have boost::thread, which I would guess is built on top of these basic threading facilities but provides a much safer and easier-to-use infrastructure (at some extra cost).

If you want raw, unfettered speed and flexibility, use the C API (with some danger). If you want something slightly safer, use a higher-level API like boost::thread (at slightly higher cost).
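
For comparison, a rough sketch of the higher-level style using std::thread from C++11 (mentioned in a comment above) in place of boost::thread. That substitution and the names square and sumOfSquares are mine:

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Each worker receives typed arguments by value; there is no void* cast.
void square(int id, int* out)
{
    *out = id * id;
}

int sumOfSquares(int count)
{
    assert(count >= 0);
    std::vector<int> results(count, 0);   // sized once, so element addresses stay valid
    std::vector<std::thread> threads;

    for (int i = 0; i < count; ++i)
        threads.emplace_back(square, i, &results[i]);

    for (std::thread& t : threads)
        t.join();                         // results must stay alive until every join returns

    int sum = 0;
    for (int r : results)
        sum += r;
    return sum;
}
```

The lifetime rule has not gone away: the pointer into results is still only valid while the vector exists. What the wrapper removes is the untyped cast and the manual packing of arguments.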

Thread specific storage with no dynamic allocation (Example)

#include <pthread.h>
#include <stddef.h>     // NULL
#include <stdint.h>     // intptr_t

struct ThreadData
{
    // Stuff for my thread.
};

// Global storage: one slot per thread, alive for the whole program.
ThreadData  threadData[5];

extern "C" void* threadStart(void* data);

void* threadStart(void* data)
{
    // The argument is not a real pointer: it carries the thread's index by value.
    intptr_t        id      = reinterpret_cast<intptr_t>(data);
    ThreadData&     tData   = threadData[id];

    // Do Stuff with tData
    (void)tData;
    return NULL;
}


int main()
{
    pthread_t   threadInfo[5];

    for(intptr_t loop = 0;loop < 5; ++loop)
    {
        pthread_create(&threadInfo[loop], NULL, threadStart, reinterpret_cast<void*>(loop));
    }
    // Wait here for the threads to finish before exiting.
    for(int loop = 0;loop < 5; ++loop)
    {
        pthread_join(threadInfo[loop], NULL);
    }
}
Martin York