As an exercise, I would like to port some of the functionality of a threaded C program (using pthreads) to Python.

The C program spawns a thread to (almost) constantly iterate through certain directories, filling a data structure with their contents; these directories change crazy frequently (it's a mail queue).

The main body of the program reacts to the contents of that data structure in various ways.

From what I've read, Python has problems with concurrency.
Am I going to be able to implement this?
If so, do you have any advice?

I've just recently started using Python regularly, and I'm loving it.

FYI, the source code resembles this:

// big struct for everything
struct context_t {
    struct datastruct_t data;
    pthread_t       my_thread;
    pthread_mutex_t my_mutex;
};

int thread_control;

// does the dirty work
void *thread_func(void *arg) {
    // recover the context that start_thread passed to pthread_create
    struct context_t *ctx = arg;
    while ( thread_control == TH_RUNNABLE ) {
        // loop through dirs
        // fill up ctx->data
        sleep(1);
    }
    pthread_mutex_unlock ( &ctx->my_mutex );
    thread_control = TH_STOPPED;
    pthread_exit(NULL);
}

int start_thread(struct context_t* ctx) {
    // get a mutex to control access to our data
    pthread_mutex_trylock(&ctx->my_mutex);
    thread_control = TH_RUNNABLE;
    // start the thread
    pthread_create ( &ctx->my_thread, NULL, thread_func, ctx );
}
int stop_thread() {
    thread_control = TH_STOPRQ;
}
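int main() {
    // allocate the context and initialise its mutex before use
    struct context_t ctx;
    pthread_mutex_init(&ctx.my_mutex, NULL);
    start_thread(&ctx);
    // do stuff
}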

Thanks!!!

+3  A: 

The "problems with concurrency" in CPython come down to one thing: a single process cannot use multiple cores to run Python code (you need multiple processes for that). Apart from that, Python's threading abilities aren't all that different from C's, though it's easier to port to or from Java's early threading abilities, since that's what the threading module was loosely based on. Threads yield control to other threads when they do I/O or sleep (time.sleep(0) is a popular way to say "yield control if needed, else continue with this thread"), or pre-emptively after a certain number of bytecode instructions since the last thread switch (Python 3.2 and later use a time-based switch interval instead).
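To make the point about sleeping threads concrete, here is a minimal sketch. The names worker and run_sleepers are made up for illustration. It starts several threads that each sleep for a fixed time and measures the total wall-clock time. If sleeping did not hand control to other threads, the total would be roughly the sum of the sleeps.

```python
import threading
import time

def worker(delay):
    # time.sleep releases the interpreter lock, so other threads run meanwhile
    time.sleep(delay)

def run_sleepers(count=5, delay=0.2):
    """Start `count` threads that each sleep `delay` seconds; return elapsed wall time."""
    threads = [threading.Thread(target=worker, args=(delay,)) for _ in range(count)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start

elapsed = run_sleepers()
# The sleeps overlap, so this should be much closer to 0.2 s than to 1.0 s
print(f"5 threads x 0.2 s sleep took {elapsed:.2f} s")
```

The same overlap applies to blocking I/O such as reading a directory, which is what the scanning thread in the question spends most of its time doing.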

For "resource acquisition is initialization" (RAII), which in C++ (but not in C) you obtain with an appropriate local variable, you can use try/finally, like in Java, or you can use the with statement. with somelock:, where somelock is an instance of threading.Lock, performs an acquire, executes the statement's body, then ensures a release is performed whether that body terminates normally or by an exception. I see you don't bother with that in your code, but that does mean that abnormal termination of the thread would leave the lock "held".
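A small sketch of both forms, assuming a module-level somelock as in the text. The function names and the simulated failure are hypothetical. In either form the lock ends up released even though the body raised.

```python
import threading

somelock = threading.Lock()

def update_with_statement(fail):
    # `with` acquires the lock, runs the body, and releases it on any exit path
    with somelock:
        if fail:
            raise RuntimeError("simulated failure inside the critical section")
        return "ok"

def update_try_finally(fail):
    # The equivalent spelled out by hand, Java style
    somelock.acquire()
    try:
        if fail:
            raise RuntimeError("simulated failure inside the critical section")
        return "ok"
    finally:
        somelock.release()

for update in (update_with_statement, update_try_finally):
    try:
        update(fail=True)
    except RuntimeError:
        pass
    # The exception did not leave the lock held
    print(update.__name__, "left lock held:", somelock.locked())
```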

Alex Martelli
+2  A: 

Porting the code to Python should be straightforward: Python's threading module has all the usual suspects when talking about threading (threads, locks, condition variables, etc).
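As a rough idea of what such a port could look like, here is a sketch and not a drop-in translation. The class name QueueScanner, its methods, and the choice of os.listdir are all assumptions. A threading.Event stands in for the thread_control flag and a threading.Lock for my_mutex.

```python
import os
import threading

class QueueScanner:
    """Hypothetical stand-in for the C context_t: shared data, a thread, a lock."""

    def __init__(self, directories, interval=1.0):
        self.directories = list(directories)
        self.interval = interval
        self.data = {}                            # directory -> sorted entry names
        self.lock = threading.Lock()              # plays the role of my_mutex
        self.stop_requested = threading.Event()   # plays the role of thread_control
        self.thread = threading.Thread(target=self._run, daemon=True)

    def scan_once(self):
        # Build the new snapshot without holding the lock, then swap it in
        snapshot = {}
        for d in self.directories:
            try:
                snapshot[d] = sorted(os.listdir(d))
            except OSError:
                snapshot[d] = []   # directory vanished or unreadable; treat as empty
        with self.lock:
            self.data = snapshot

    def _run(self):
        while not self.stop_requested.is_set():
            self.scan_once()
            # Unlike a plain sleep, wait() returns early once stop is requested
            self.stop_requested.wait(self.interval)

    def start(self):
        self.thread.start()

    def stop(self):
        self.stop_requested.set()
        self.thread.join()

    def snapshot(self):
        # The main program reads the shared data through this method
        with self.lock:
            return dict(self.data)
```

The main body of the program would call start() once, call snapshot() whenever it wants the current contents, and call stop() on shutdown. Swapping in a freshly built dict keeps the time spent holding the lock short.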

What you've probably heard about Python's threading support is the GIL (global interpreter lock). CPython only allows one thread to execute Python bytecode at a time, so for work done in Python code your app will effectively be running "single threaded" (as in only one thread will be running at a time). If you search for it you'll find lots and lots of articles on the subject.

It does not mean your program doesn't need to worry about thread-safety, though: you still need to properly protect shared data structures with mutexes, for example.
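For instance, an increment of a shared counter is a read-modify-write that a thread switch can interrupt, GIL or not. A minimal sketch of guarding it with a lock follows. The Counter class and hammer function are made up for illustration.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def increment(self):
        # `self.value += 1` spans several bytecodes, so protect it with the lock
        with self.lock:
            self.value += 1

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the lock in place no increment is lost
print(counter.value)  # 40000
```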

vanza
Thanks Vanza; I was concerned about the GIL, but knowing that the only limitation is multicore is very helpful.
threecheeseopera