I think this might be a fairly easy question.

I found many examples that use threads and shared variables, but none in which a shared variable is created inside a thread. I want to make sure I'm not doing something that seems to work now but will break at some point in the future.

The reason I need this is that I have a shared hash that maps keys to array refs. Those refs are created and filled by one thread and read and modified by another (proper synchronization is assumed). To store those array refs in the shared hash I have to make the arrays shared too; otherwise I get the error `Invalid value for shared scalar`.
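To make the error concrete, here is a minimal sketch of the failing and the working assignment. The variable names (`@plain`, `@shared_ar`) are only illustrative, and it assumes a Perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %hash :shared;

# Storing a reference to a non-shared array in a shared hash dies
# with "Invalid value for shared scalar"; eval catches that here.
my @plain = (1, 2, 3);
my $ok = eval { $hash{foo} = \@plain; 1 };
print "error: $@" unless $ok;

# Once the array itself is shared, the same assignment is accepted.
my @shared_ar :shared = (1, 2, 3);
$hash{foo} = \@shared_ar;
print "stored ", scalar @{ $hash{foo} }, " elements\n";
```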

Following is an example:

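use strict;
use warnings;
use threads;
use threads::shared;
use Data::Dumper;

my %hash :shared;

# Thread 1 creates a shared array and stores a reference to it.
my $t1 = threads->create(
    sub { my @ar :shared = (1,2,3); $hash{foo} = \@ar });
$t1->join;

# Thread 2 reads the hash after thread 1 has finished.
my $t2 = threads->create(
    sub { print Dumper(\%hash) });
$t2->join;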

This works as expected: The second thread sees the changes the first made. But does this really hold under all circumstances?


Some clarifications (regarding Ian's answer):

I have one thread, A, that reads from a pipe and waits for input. When input arrives, A writes it to a shared hash (the hash maps scalars to hashes, and those inner hashes need to be declared shared as well) and goes back to listening on the pipe. Another thread, B, is notified (via cond_wait/cond_signal) when there is something to do, works on the entries in the shared hash, and deletes them on completion. Meanwhile A can add new entries to the hash.
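Roughly, the setup described above could look like the following sketch. It is not my real program: the pipe is replaced by a simple loop, and the names `%work`, `$done` and `@processed` are made up for the illustration. The point is that each inner hash is declared shared inside thread A.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %work :shared;        # job id => reference to a shared hash
my $done :shared = 0;    # set by A when there is no more input
my @processed :shared;   # what B has handled (for the demo only)

# Thread B: waits for work, handles one entry, deletes it.
my $consumer = threads->create(sub {
    while (1) {
        my ($id, $job);
        {
            lock(%work);
            cond_wait(%work) until keys(%work) || $done;
            return unless keys %work;    # input ended, nothing left
            ($id) = sort keys %work;
            $job = $work{$id};
            delete $work{$id};
        }
        push @processed, "$id=$job->{payload}";
    }
});

# Thread A: stands in for the pipe reader; creates the shared
# inner hashes inside the thread and signals B.
my $producer = threads->create(sub {
    for my $n (1 .. 3) {
        my %job :shared = (payload => "input $n");
        lock(%work);
        $work{"job$n"} = \%job;
        cond_signal(%work);
    }
    lock(%work);
    $done = 1;
    cond_signal(%work);
});

$producer->join;
$consumer->join;
print "$_\n" for sort @processed;
```

The `until` loop around `cond_wait` re-checks the condition after every wakeup, so B does not miss work that A added before B started waiting.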

So regarding Ian's question

[...] Hence most people create all their shared variables before starting any sub-threads.

Therefore even if shared variables can be created in a thread, how useful would it be?

The shared hash is a dynamically growing and shrinking data structure that represents scheduled work that hasn't yet been worked on. Therefore it makes no sense to create the complete data structure at the start of the program.

Also, the program needs (at least) two threads because reading from the pipe blocks. Furthermore, I don't see any way to make this work without sharing variables.

+3  A: 

The reason for a shared variable is to share. Therefore it is likely that you will wish to have more than one thread access the variable.

If you create your shared variable in a sub-thread, how will you stop other threads accessing it before it has been created? Hence most people create all their shared variables before starting any sub-threads.

Therefore even if shared variables can be created in a thread, how useful would it be? (PS, I don’t know if there is anything in Perl that prevents shared variables being created in a thread.)


PS A good design will lead to very few (if any) shared variables

Ian Ringrose
1. You can synchronize with `lock`, `cond_wait` and `cond_signal`. 2. I can't know in advance how many keys and values my shared hash will have.
musiKk
+2  A: 

This task seems like a good fit for the core module Thread::Queue. You would create the queue before starting your threads, push items on with the reader, and pop them off with the processing thread. You can use the blocking dequeue method to have the processing thread wait for input, avoiding the need for signals.
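A minimal sketch of that arrangement, assuming a reasonably recent Thread::Queue that shares enqueued references automatically. The item layout (`id`, `text`) and the use of `undef` as a stop marker are illustrative choices, not something the question prescribes.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $queue = Thread::Queue->new;    # created before any thread starts

# Processing thread: dequeue() blocks until an item is available.
my $worker = threads->create(sub {
    my @seen;
    while (defined(my $item = $queue->dequeue)) {
        push @seen, "$item->{id}:$item->{text}";
    }
    return join ',', @seen;
});

# Reader thread: pushes plain hash refs, then undef as a stop marker.
my $reader = threads->create(sub {
    $queue->enqueue({ id => $_, text => "line $_" }) for 1 .. 3;
    $queue->enqueue(undef);
});

$reader->join;
my $result = $worker->join;
print "$result\n";
```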

Eric Strom
Thanks, I hadn't thought of that. But it doesn't solve the main problem. The items that are created are not primitive data structures (so I still have to create shared variables in the thread), and the tasks are to be worked on by priority, not FIFO. While this is possible with `extract` and `insert`, I'd have to `lock` the queue, and then this is no better than the hash and I'm right back at the beginning.
musiKk
You could have the processing thread pull off items immediately and then manage its own priority queue. You do not need to explicitly `share` the data structures you push onto the queue; the queue will handle all of that for you. If the data structures you are moving are very large, I find the fastest way to move them between threads is to serialize the data into a string (via pack or Data::Dumper or ...) and then unpack it at the destination.
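As a rough illustration of the serialize-and-unpack idea, here is a sketch that uses the core Storable module. Storable is my substitution for the example, since the comment only mentions pack and Data::Dumper, and the `%task` layout is hypothetical.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Storable qw(freeze thaw);

my $queue = Thread::Queue->new;

# Destination thread: thaw() rebuilds a private, unshared copy.
my $worker = threads->create(sub {
    my $task = thaw($queue->dequeue);
    return "$task->{name} has " . scalar(@{ $task->{args} }) . " args";
});

# Source side: only one plain string crosses the thread boundary.
my %task = (name => 'resize', priority => 2, args => [640, 480]);
$queue->enqueue(freeze(\%task));

my $summary = $worker->join;
print "$summary\n";
```

Because the worker ends up with an ordinary private structure, it can keep its own priority-ordered list without any further locking.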
Eric Strom
That's a good idea; I'll reconsider. I don't want to be nitpicking but I think this still doesn't really answer my question. It's an interesting alternative nonetheless.
musiKk
OK, in Perl 5.10 at least, items put on a queue have to be shared already if they are not plain scalars. Newer versions share them automatically by calling `shared_clone`, which is itself only available in newer versions. This makes me believe that creating shared variables inside a thread is no problem, because that is exactly what `enqueue` does in newer versions.
musiKk
A: 

I don't feel good answering my own question but I think the answers so far don't really answer it. If something better comes along, I'd be happy to accept that. Eric's answer helped though.

I now think there is no problem with creating shared variables inside threads. The reasoning: Thread::Queue's enqueue() method shares anything it enqueues, and it does so with shared_clone. Since enqueuing is supposed to be safe from any thread, sharing should be too.
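For illustration, a small sketch of calling shared_clone directly inside a thread. It assumes a threads::shared recent enough to provide shared_clone, and the entry layout (`prio`, `args`) is made up.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %hash :shared;

# The writer builds a nested structure and shares a deep copy of it
# from inside the thread.
my $writer = threads->create(sub {
    my $entry = shared_clone({ prio => 1, args => [1, 2, 3] });
    lock(%hash);
    $hash{foo} = $entry;
});
$writer->join;

# A second thread still sees the data after the writer has exited.
my $reader = threads->create(sub {
    lock(%hash);
    return join ',', @{ $hash{foo}{args} };
});
my $seen = $reader->join;
print "$seen\n";
```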

musiKk