I have written a simple program using Parallel Python, and it all works well. However, mainly for curiosity's sake, I would like to know which machine each task ran on and how long it took.

Is there any way to programmatically get this information for the job that is returned?

+1  A: 

A uuid1 could help:

>>> import uuid
>>> uuid.uuid1()
UUID('b46fa8cf-1fc1-11df-b891-001641ec3fab')
>>>

See pydoc uuid and RFC 4122 for more details. I think the last 48 bits are unique to the host. I am not sure how you would call or return that in Parallel Python, though.
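As a rough sketch of the idea: in a version 1 UUID the last 48 bits are the node field, and uuid.getnode() returns the hardware address Python uses for it. The helper name node_of is made up for illustration. Note that getnode() can fall back to a random number when no MAC address is found, so treat the value as a hint rather than a guaranteed host identifier.

```python
import uuid

def node_of(u):
    """Return the last 48 bits (the node field) of a UUID as an integer."""
    return u.int & ((1 << 48) - 1)

# Hardware address of this machine as a 48-bit integer
# (may be random if no MAC address can be determined).
host_node = uuid.getnode()

# Build a version 1 UUID that explicitly carries this host's node value.
task_id = uuid.uuid1(node=host_node)

# Print the node part as 12 hex digits, matching the tail of str(task_id).
print("%012x" % node_of(task_id))
```

A task could return such an ID alongside its result, and the caller could then group results by node_of(...).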

In pp.py I found:

self.__stats[hostid] = _Statistics(ncpus, rworker)

Could you then use get_stats() to get at that?

get_stats(self) Returns job execution statistics as a dictionary.
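A minimal sketch of reading that dictionary, assuming each value exposes njobs and time attributes the way _Statistics appears to in pp.py (I have not checked every pp version, so getattr is used defensively). The summarize_stats helper and the FakeStat stand-in are made up for illustration; with a real server you would pass in the result of get_stats() on your pp.Server instance. These are totals per host, not per job.

```python
from collections import namedtuple

def summarize_stats(stats):
    """Turn a {hostid: stats_object} dict into (hostid, jobs, total, average) rows."""
    rows = []
    for hostid, stat in sorted(stats.items()):
        # Assumed attribute names; fall back to zero if they are missing.
        njobs = getattr(stat, "njobs", 0)
        total = getattr(stat, "time", 0.0)
        avg = total / njobs if njobs else 0.0
        rows.append((hostid, njobs, total, avg))
    return rows

# Stand-in data so the sketch runs without a pp server (hypothetical values).
FakeStat = namedtuple("FakeStat", "ncpus njobs time")
demo = {"local": FakeStat(2, 4, 2.0), "node1:60000": FakeStat(4, 0, 0.0)}

for row in summarize_stats(demo):
    print("%-12s jobs=%d total=%.2fs avg=%.2fs" % row)
```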

davey
Unfortunately, I am not sure that this helps. What I want to do is, given a job, find out where it ran. I could do this by modifying the code that is called, which is what I guess you are suggesting, but that is not really what I wanted to do. I was hoping to get the information directly from the returned job in some fashion.
Paul Wagland
Just as a general comment, when you do an edit, it is normally "polite" to indicate that you have added more information, particularly if it is different in content, rather than just expanded content. Thanks for the new tip, I will try it out tonight.
Paul Wagland
Sorry, I'd added it as a comment first then wanted to put in the formatted code so I edited the Q and abandoned the comment.
davey
Davey, updating the answer is definitely the correct thing to do! No need to apologize for that ;-) I was just saying that when you do update the answer, particularly in response to a comment, it is common to put in <b>Edit:</b> before the bit that you are adding.
Paul Wagland