views:

1619

answers:

4

I have a python script which needs to execute several command line utilities. The stdout output is sometimes used for further processing. In all cases, I want to log the results and raise an exception if an error is detected. I use the following function to achieve this:

import subprocess

def execute(cmd, logsink):
    logsink.log("executing: %s\n" % cmd)
    popen_obj = subprocess.Popen(
        cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = popen_obj.communicate()
    returncode = popen_obj.returncode
    if returncode != 0:
        logsink.log("   RETURN CODE: %s\n" % returncode)
    if stdout.strip():
        logsink.log("   STDOUT:\n%s\n" % stdout)
    if stderr.strip():
        logsink.log("   STDERR:\n%s\n" % stderr)
    if returncode != 0:
        raise Exception("execute failed with error output:\n%s" % stderr)
    return stdout

"logsink" can be any python object with a log method. I typically use this to forward the logging data to a specific file, or echo it to the console, or both, or something else...

This works pretty well, except for three problems where I need more fine-grained control than the communicate() method provides:

  1. stdout and stderr output can be interleaved on the console, but the above function logs them separately. This can complicate the interpretation of the log. How do I log stdout and stderr lines interleaved, in the same order as they were output?
  2. The above function will only log the command output once the command has completed. This complicates diagnosis of issues when commands get stuck in an infinite loop or take a very long time for some other reason. How do I get the log in real-time, while the command is still executing?
  3. If the logs are large, it can get hard to tell which command generated which output. Is there a way to prefix each line with something (e.g. the first word of the cmd string followed by :)?
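For context, the three requirements can be combined in one helper; this is only an illustrative sketch (execute_streamed is a hypothetical name, not from the question). Merging stderr into stdout with stderr=subprocess.STDOUT preserves output order, reading line by line gives real-time logging, and each line can be prefixed before it reaches the sink. The tradeoff is that stdout and stderr can no longer be captured separately:

```python
import subprocess

def execute_streamed(cmd, logsink, prefix=None):
    """Run cmd, logging stdout/stderr interleaved, in real time,
    with each line prefixed (hypothetical helper, shown as a sketch)."""
    if prefix is None:
        prefix = cmd.split()[0] + ": "
    proc = subprocess.Popen(
        cmd, shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into stdout: lines stay in order
        universal_newlines=True)    # text mode, so readline() returns str
    captured = []
    for line in iter(proc.stdout.readline, ''):  # one line at a time, as produced
        logsink.log(prefix + line)
        captured.append(line)
    proc.stdout.close()
    returncode = proc.wait()
    if returncode != 0:
        raise Exception("execute failed with return code %s" % returncode)
    return ''.join(captured)
```

Since stderr is folded into stdout here, the exception can no longer carry the error output separately, only the return code.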
+5  A: 

You can redirect to a file if you just want the output in a file for later evaluation.

You're already defining the stdout/stderr of the processes you're executing via the stdout=/stderr= arguments.

In your example code you're just redirecting them to pipes:

subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

sys.stdout and sys.stderr are just file-like objects. As the documentation on sys.stdout mentions, "Any object is acceptable as long as it has a write() method that takes a string argument."

f = open('cmd_fileoutput.txt', 'w')
subprocess.Popen(cmd, shell=True, stdout=f, stderr=f)

So you only need to give it an object with a write method in order to redirect output.

If you want both console output and file output, you may want to make a class to manage the output.

General redirection:

# Redirecting stdout and stderr to a file
f = open('log.txt', 'w')
sys.stdout = f
sys.stderr = f

Making a redirection class:

# redirecting to both
class OutputManager:
    def __init__(self, filename, console):
        self.f = open(filename, 'w')
        self.con = console

    def write(self, data):
        self.con.write(data)
        self.f.write(data)

new_stdout = OutputManager("log.txt", sys.stdout)

Interleaving is dependent on buffering, so you may or may not get the output you expect. (You can probably turn off or reduce the buffering used, but I don't remember how at the moment.)

monkut
os.write(self.f.fileno(), data) is unbuffered if you need to be sure
Ali A
+1 I had not realized that python duck typing could be leveraged here by providing my own File object implementation. I'll give it a go and will post an updated "execute" function if it works.
Wim Coenen
wconenen - Yes, this is a great feature of python, you can easily replace interfaces by imitation.
monkut
I've just tried it, and got "AttributeError: OutputManager instance has no attribute 'fileno'". The file object documentation says this attribute should not be implemented unless it is a real file. So it seems the subprocess module is not happy with anything other than a real file.
Wim Coenen
I assume you're using the f.fileno() zero-buffer method mentioned by Ali A. If you don't write that method it won't be available; only use that method when you're writing to an actual file.
monkut
@monkut: No, I am using your OutputManager class as the stdout argument in subprocess.Popen. I don't call fileno() myself.
Wim Coenen
shouldn't that be def write(self, data): ...
Jamie
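To summarize the thread above: subprocess hands stdout= to the child process as an OS-level file descriptor, so a pure Python file-like object (one with no real fileno()) cannot be passed there. A workaround, sketched here under that assumption (tee_command is a hypothetical helper, not from the answer), is to keep stdout=PIPE and fan each line out to any number of write()-style sinks in the parent process:

```python
import subprocess

def tee_command(cmd, *sinks):
    """Hypothetical sketch: the parent reads the pipe and copies each
    line to every sink (any object with a write() method works here)."""
    proc = subprocess.Popen(cmd, shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,  # interleave both streams
                            universal_newlines=True)   # text mode
    for line in iter(proc.stdout.readline, ''):
        for sink in sinks:
            sink.write(line)   # only Python-level writes: no fileno() needed
    proc.stdout.close()
    return proc.wait()
```

With this approach an OutputManager-style object never needs a fileno(), because it is only ever written to from Python, not handed to the child process.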
+2  A: 

You can look into pexpect (http://www.noah.org/wiki/Pexpect)

It solves 1) and 2) out of the box, prefixing the output might be a little trickier.

SvenAron
+1 for doing most of the job without reinventing the wheel. I guess the prefixing can be done by replacing newlines in the output of the .before function.
Wim Coenen
I just tried it, but it turns out to be UNIX only. An error is thrown when it tries to load the standard "resource" module which is not available on windows.
Wim Coenen
Bummer, I only ever use python on UNIX systems so I didn't consider portability.
SvenAron
+1  A: 

One other option:

import subprocess
import tempfile

def run_test(test_cmd):
    with tempfile.TemporaryFile() as cmd_out:
        proc = subprocess.Popen(test_cmd, stdout=cmd_out, stderr=cmd_out)
        proc.wait()
        cmd_out.seek(0)
        output = cmd_out.read()
    return (proc.returncode, output)

This will interleave stdout and stderr as desired, in a real file that is conveniently open for you.

rescdsk
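For reference, here is a self-contained variant of the answer above with a usage example. The only change is opening the temporary file in text mode ('w+') so read() returns a string on Python 3 as well; the sh -c command is purely illustrative:

```python
import subprocess
import tempfile

def run_test(test_cmd):
    # text mode ('w+') so read() returns str on Python 3 too
    with tempfile.TemporaryFile(mode='w+') as cmd_out:
        proc = subprocess.Popen(test_cmd, stdout=cmd_out, stderr=cmd_out)
        proc.wait()
        cmd_out.seek(0)
        output = cmd_out.read()
    return (proc.returncode, output)

# illustrative command: writes one line to stdout and one to stderr;
# both land in the same file, interleaved in the order they were written
rc, out = run_test(["sh", "-c", "echo to-stdout; echo to-stderr 1>&2"])
```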
+1  A: 

This is by no means a complete or exhaustive answer, but perhaps you should look into the Fabric module.

http://docs.fabfile.org/0.9.1/

Makes parallel execution of shell commands and error handling rather easy.

gatlin