views:

117

answers:

1

Hi,

I have to run a tool on around 300 directories. Each run takes anywhere from 1 minute to 30 minutes, or even longer, so I wrote a Python script with a loop that runs the tool on the directories one after another.

My Python script has code something like:

import os

for directory in directories:
    os.popen('runtool_exec ' + directory)

But when I run the python script I get the following error messages repeatedly:

..
tail: write error: Broken pipe
date: write error: Broken pipe
..

All I do is log in to a remote server using ssh, where the tool, the Python script, and the subject directories are kept. When I run the tool individually from the command prompt with a command like:

runtool_exec directory

it works fine. The "broken pipe" errors appear only when I run it via the Python script. Any idea or workaround?

Please suggest.

Thanks. Fahim

+4  A: 

The errors you see are because you're using os.popen() to run the command, which means a pipe is opened and connected to the command's stdout. Whenever the command (or anything it executes without redirecting stdout) wants to write to stdout, it'll try to write to the pipe. But you don't keep the pipe around, since you don't assign the result of os.popen() anywhere, so it's cleaned up by Python, and closed. So the processes that try to write to it encounter a broken pipe, and produce the error you see. (os.popen() does not redirect stderr to the pipe, so you still see the errors.)
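To illustrate the difference (a minimal sketch; `echo` stands in for `runtool_exec`): if you keep the file object `os.popen()` returns and read it, the child's stdout has somewhere to go and nothing writes to a closed pipe:

```python
import os

# Keep a reference to the pipe instead of discarding it. Reading it
# drains the child's stdout; close() returns None on exit status 0.
pipe = os.popen('echo hello')
output = pipe.read()      # drain the child's stdout
status = pipe.close()     # None means the command exited successfully
print(output.strip())
```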

Another problem in your os.popen() call is that you don't check to make sure directory doesn't contain any characters special to your shell. If the directory contained a quote, for example, or a *, weird things might happen. Using os.popen() like that is pretty bad.
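For example (a sketch; `shlex.quote` is the standard-library way to escape a string for the shell, if you did have to build a shell command):

```python
import os
import shlex

# A hostile directory name: unquoted, the shell would treat ';' and '*'
# as shell syntax rather than as part of the name.
directory = "odd name; ls *"
cmd = 'echo ' + shlex.quote(directory)
output = os.popen(cmd).read().strip()
print(output)  # the name comes through as one literal argument
```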

You should use subprocess.Popen() instead, and explicitly tell it what you want to do with the process's output (both stdout and stderr). You pass subprocess.Popen() a list of arguments (including the thing you want to start) instead of a single string, and it avoids the shell altogether -- no need to sanity-check your strings. If you really want to ignore the output, you would do it with something like:

import os
import subprocess

devnull = open(os.devnull, 'w')
for directory in directories:
    subprocess.Popen(['runtool_exec', directory], stdout=devnull)

although I would strongly recommend at least doing some rudimentary checks on the result of subprocess.Popen() to see whether the process exited prematurely or with an error code.
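A minimal sketch of such a check (`true` and `false` are standard POSIX commands that exit with status 0 and 1 respectively; they stand in for `runtool_exec` here):

```python
import os
import subprocess

# Run each command, wait for it to exit, and record its exit status.
devnull = open(os.devnull, 'w')
results = {}
for cmd in (['true'], ['false']):
    proc = subprocess.Popen(cmd, stdout=devnull)
    proc.wait()                       # block until this process exits
    results[cmd[0]] = proc.returncode
devnull.close()
print(results)
```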

Thomas Wouters
Thanks a lot for your answer. This works. But how can I determine which processes complete successfully and which fail?
Fahim
Also, how do I determine which processes are still running? Is there any way to wait for the subprocesses to complete?
Fahim
I am currently trying this: `log = open(logFile, 'a'); process = subprocess.Popen(args, stdout=log); process.wait()` followed by `if process.returncode >= 0: print 'done!' else: print 'Error! Process terminated badly'`. But I see it does not wait for the processes to complete, and returncode is always 99. Please help.
Fahim
If it didn't wait for the process to complete, then the returncode would be `None`, not 99. What you describe suggests *the command you start* doesn't wait for any processes *it* starts, and the command you start is what returns 99. The problem is with the process you start, not your use of `subprocess.Popen()`. (Also, if you always want to wait for the process to exit, using `subprocess.call()` is a little easier.)
Thomas Wouters
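For reference, a minimal sketch of the subprocess.call() pattern Thomas mentions (`true` is a stand-in command; note that the conventional success check is `== 0`, not `>= 0`, since any nonzero status means failure):

```python
import subprocess

# subprocess.call() starts the command, waits for it to exit, and
# returns its exit status directly; 0 conventionally means success.
rc = subprocess.call(['true'])
print('done!' if rc == 0 else 'Error! Process exited with code %d' % rc)
```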