tags:

views:

6879

answers:

9

Edit: Since it appears that there's either no solution, or I'm doing something so non-standard that nobody knows - I'll revise my question to also ask: What is the best way to accomplish logging when a python app is making a lot of system calls?

My app has two modes. In interactive mode, I want all output to go to the screen as well as to a log file, including output from any system calls. In daemon mode, all output goes to the log. Daemon mode works great using os.dup2(). I can't find a way to "tee" all output to a log in interactive mode, without modifying each and every system call.


In other words, I want the functionality of the command line 'tee' for any output generated by a python app, including system call output.

To clarify:

To redirect all output I do something like this, and it works great:

# open our log file
so = se = open("%s.log" % self.name, 'w', 0)

# re-open stdout without buffering
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

# redirect stdout and stderr to the log file opened above
os.dup2(so.fileno(), sys.stdout.fileno())
os.dup2(se.fileno(), sys.stderr.fileno())

The nice thing about this is that it requires no special print calls from the rest of the code. The code also runs some shell commands, so it's nice not having to deal with each of their output individually as well.

Simply, I want to do the same, except duplicate instead of redirect.

At first blush, I thought that simply reversing the dup2's should work. Why doesn't it? Here's my test:

import os, sys

### my broken solution:
so = se = open("a.log", 'w', 0)
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

os.dup2(sys.stdout.fileno(), so.fileno())
os.dup2(sys.stderr.fileno(), se.fileno())
###

print "kljhf sdf"

os.spawnve("P_WAIT", "/bin/ls", ["/bin/ls"], {})
os.execve("/bin/ls", ["/bin/ls"], os.environ)

The file "a.log" should be identical to what was displayed on the screen.

+14  A: 

The print statement will call the write() method of any object you assign to sys.stdout.

I would spin up a small class to write to two places at once...

import sys

class Logger(object):
    def __init__(self):
        self.terminal = sys.stdout
        self.log = open("log.dat", "a")

    def write(self, message):
        self.terminal.write(message)
        self.log.write(message)  

sys.stdout = Logger()

Now the print statement will both echo to the screen and append to your log file:

# prints "1 2" to <stdout> AND log.dat
print "%d %d" % (1,2)

This is obviously quick-and-dirty. Some notes:

  • You probably ought to parametize the log filename.
  • You should probably revert sys.stdout to <stdout> if you won't be logging for the duration of the program.
  • You may want the ability to write to multiple log files at once, or handle different log levels, etc.

These are all straightforward enough that I'm comfortable leaving them as exercises for the reader. The key insight here is that print just calls a "file-like object" that's assigned to sys.stdout.

Triptych
Exactly what I was going to post, pretty much. +1 when you fix the problem with write not having a self argument. Also, it'd be better design to have the file that you're going to write to passed in. Hell, it might also be better design to have stdout passed in.
Devin Jeanpierre
@Devin, yeah this was quick and dirty, I'll make some notes for possible early improvements.
Triptych
much cleaner than the snippet i use, good answer.
John T
John T's snippet brings up a good point though. When the Logger gets deleted you should set stdout back to the default.
tgray
I selected this answer too soon. It works great for "print", but not so much for external command output.
drue
The Logger class should also define a flush() method such as "def flush(): self.terminal.flush(); self.log.flush()"
blokeley
+20  A: 

I had this same issue before and found this snippet very useful:

class Tee(object):
    def __init__(self, name, mode):
        self.file = open(name, mode)
        self.stdout = sys.stdout
        sys.stdout = self
    def __del__(self):
        sys.stdout = self.stdout
        self.file.close()
    def write(self, data):
        self.file.write(data)
        self.stdout.write(data)

from: http://mail.python.org/pipermail/python-list/2007-May/442737.html

John T
+1 for handling the sys.stdout reassignment internally so that you can end logging by deleting the Tee object
Ben Blank
+16  A: 

What you really want is logging module from stanard library. Create a logger and attach two handlers, one would be writing to a file and another in stodout or stderr.

See Logging to multiple destinations for details

Alex Lebedev
+1 for doing it without re-inventing the wheel
wzzrd
+3  A: 

(Ah, just re-read your question and see that this doesn't quite apply.)

Here is a sample program that makes uses the python logging module. This logging module has been in all versions since 2.3. In this sample the logging is configurable by command line options.

In quite mode it will only log to a file, in normal mode it will log to both a file and the console.

import os
import sys
import logging
from optparse import OptionParser

def initialize_logging(options):
    """ Log information based upon users options"""

    logger = logging.getLogger('project')
    formatter = logging.Formatter('%(asctime)s %(levelname)s\t%(message)s')
    level = logging.__dict__.get(options.loglevel.upper(),logging.DEBUG)
    logger.setLevel(level)

    # Output logging information to screen
    if not options.quiet:
        hdlr = logging.StreamHandler(sys.stderr)
        hdlr.setFormatter(formatter)
        logger.addHandler(hdlr)

    # Output logging information to file
    logfile = os.path.join(options.logdir, "project.log")
    if options.clean and os.path.isfile(logfile):
        os.remove(logfile)
    hdlr2 = logging.FileHandler(logfile)
    hdlr2.setFormatter(formatter)
    logger.addHandler(hdlr2)

    return logger

def main(argv=None):
    if argv is None:
        argv = sys.argv[1:]

    # Setup command line options
    parser = OptionParser("usage: %prog [options]")
    parser.add_option("-l", "--logdir", dest="logdir", default=".", help="log DIRECTORY (default ./)")
    parser.add_option("-v", "--loglevel", dest="loglevel", default="debug", help="logging level (debug, info, error)")
    parser.add_option("-q", "--quiet", action="store_true", dest="quiet", help="do not log to console")
    parser.add_option("-c", "--clean", dest="clean", action="store_true", default=False, help="remove old log file")

    # Process command line options
    (options, args) = parser.parse_args(argv)

    # Setup logger format and output locations
    logger = initialize_logging(options)

    # Examples
    logger.error("This is an error message.")
    logger.info("This is an info message.")
    logger.debug("This is a debug message.")

if __name__ == "__main__":
    sys.exit(main())
Atlas1j
Good answer. I saw some really convoluted ways of replicating logging to the console, but making a StreamHandler with stderr was the answer I've been looking for :)
Krougan
+8  A: 

Since you're comfortable spawning external processes from your code, you could use tee itself. I don't know of any Unix system calls that do exactly what tee does.

import subprocess, os, sys

sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

tee = subprocess.Popen(["tee", "log.txt"], stdin=subprocess.PIPE)
os.dup2(tee.stdin.fileno(), sys.stdout.fileno())
os.dup2(tee.stdin.fileno(), sys.stderr.fileno())

print "\nstdout"
print >>sys.stderr, "stderr"
os.spawnve("P_WAIT", "/bin/ls", ["/bin/ls"], {})
os.execve("/bin/ls", ["/bin/ls"], os.environ)

You could also emulate tee using the multiprocessing package (or use processing if you're using Python 2.5 or earlier).

Jacob Gabrielson
Well, this answer works, so I'll accept it. Still, it makes me *feel* dirty.
drue
This will not work on non-Unix systems, like Windows.
Sorin Sbarnea
This is how much we care about non-Unix systems like Windows:
Matt Joiner
I just posted a pure python implementation of tee (py2/3 compatible) that can run on any platform and also be used in different logging configurations.http://stackoverflow.com/questions/616645/how-do-i-duplicate-sys-stdout-to-a-log-file-in-python/3423392#3423392
Sorin Sbarnea
A: 

As described elsewhere, perhaps the best solution is to use the logging module directly:

import logging

logging.basicConfig(level=logging.DEBUG, filename='mylog.log')
logging.info('this should to write to the log file')

However, there are some (rare) occasions where you really want to redirect stdout. I had this situation when I was extending django's runserver command which uses print: I didn't want to hack the django source but needed the print statements to go to a file.

This is a way of redirecting stdout and stderr away from the shell using the logging module:

import logging, sys

class LogFile(object):
    """File-like object to log text using the `logging` module."""

    def __init__(self, name=None):
        self.logger = logging.getLogger(name)

    def write(self, msg, level=logging.INFO):
        self.logger.log(level, msg)

    def flush(self):
        for handler in self.logger.handlers:
            handler.flush()

logging.basicConfig(level=logging.DEBUG, filename='mylog.log')

# Redirect stdout and stderr
sys.stdout = LogFile('stdout')
sys.stderr = LogFile('stderr')

print 'this should to write to the log file'

You should only use this LogFile implementation if you really cannot use the logging module directly.

blokeley
A: 

another solution using logging module:

import logging
import sys

log = logging.getLogger('stdxxx')

class StreamLogger(object):

    def __init__(self, stream, prefix=''):
        self.stream = stream
        self.prefix = prefix
        self.data = ''

    def write(self, data):
        self.stream.write(data)
        self.stream.flush()

        self.data += data
        tmp = str(self.data)
        if '\x0a' in tmp or '\x0d' in tmp:
            tmp = tmp.rstrip('\x0a\x0d')
            log.info('%s%s' % (self.prefix, tmp))
            self.data = ''


logging.basicConfig(level=logging.INFO,
                    filename='text.log',
                    filemode='a')

sys.stdout = StreamLogger(sys.stdout, '[stdout] ')

print 'test for stdout'
Denis Barmenkov
+4  A: 

I wrote a tee() implementation in Python that should work for most cases, and it works on Windows also.

http://github.com/ssbarnea/tendo/blob/master/tendo/tee.py

Also, you can use it in combination with logging module from Python if you want.

Sorin Sbarnea
A: 

I'm writing a script to run cmd-line scripts. ( Because in some cases, there just is no viable substitute for a Linux command -- such as the case of rsync. )

What I really wanted was to use the default python logging mechanism in every case where it was possible to do so, but to still capture any error when something went wrong that was unanticipated.

This code seems to do the trick. It may not be particularly elegant or efficient ( although it doesn't use string+=string, so at least it doesn't have that particular potential bottle- neck ). I'm posting it in case it gives someone else any useful ideas.

import logging
import os, sys
import datetime

# Get name of module, use as application name
try:
  ME=os.path.split(__file__)[-1].split('.')[0]
except:
  ME='pyExec_'

LOG_IDENTIFIER="uuu___( o O )___uuu "
LOG_IDR_LENGTH=len(LOG_IDENTIFIER)

class PyExec(object):

  # Use this to capture all possible error / output to log
  class SuperTee(object):
      # Original reference: http://mail.python.org/pipermail/python-list/2007-May/442737.html
      def __init__(self, name, mode):
          self.fl = open(name, mode)
          self.fl.write('\n')
          self.stdout = sys.stdout
          self.stdout.write('\n')
          self.stderr = sys.stderr

          sys.stdout = self
          sys.stderr = self

      def __del__(self):
          self.fl.write('\n')
          self.fl.flush()
          sys.stderr = self.stderr
          sys.stdout = self.stdout
          self.fl.close()

      def write(self, data):
          # If the data to write includes the log identifier prefix, then it is already formatted
          if data[0:LOG_IDR_LENGTH]==LOG_IDENTIFIER:
            self.fl.write("%s\n" % data[LOG_IDR_LENGTH:])
            self.stdout.write(data[LOG_IDR_LENGTH:])

          # Otherwise, we can give it a timestamp
          else:

            timestamp=str(datetime.datetime.now())
            if 'Traceback' == data[0:9]:
              data='%s: %s' % (timestamp, data)
              self.fl.write(data)
            else:
              self.fl.write(data)

            self.stdout.write(data)


  def __init__(self, aName, aCmd, logFileName='', outFileName=''):

    # Using name for 'logger' (context?), which is separate from the module or the function
    baseFormatter=logging.Formatter("%(asctime)s \t %(levelname)s \t %(name)s:%(module)s:%(lineno)d \t %(message)s")
    errorFormatter=logging.Formatter(LOG_IDENTIFIER + "%(asctime)s \t %(levelname)s \t %(name)s:%(module)s:%(lineno)d \t %(message)s")

    if logFileName:
      # open passed filename as append
      fl=logging.FileHandler("%s.log" % aName)
    else:
      # otherwise, use log filename as a one-time use file
      fl=logging.FileHandler("%s.log" % aName, 'w')

    fl.setLevel(logging.DEBUG)
    fl.setFormatter(baseFormatter)

    # This will capture stdout and CRITICAL and beyond errors

    if outFileName:
      teeFile=PyExec.SuperTee("%s_out.log" % aName)
    else:
      teeFile=PyExec.SuperTee("%s_out.log" % aName, 'w')

    fl_out=logging.StreamHandler( teeFile )
    fl_out.setLevel(logging.CRITICAL)
    fl_out.setFormatter(errorFormatter)

    # Set up logging
    self.log=logging.getLogger('pyExec_main')
    log=self.log

    log.addHandler(fl)
    log.addHandler(fl_out)

    print "Test print statement."

    log.setLevel(logging.DEBUG)

    log.info("Starting %s", ME)
    log.critical("Critical.")

    # Caught exception
    try:
      raise Exception('Exception test.')
    except Exception,e:
      log.exception(str(e))

    # Uncaught exception
    a=2/0


PyExec('test_pyExec',None)

Obviously, if you're not as subject to whimsy as I am, replace LOG_IDENTIFIER with another string that you're not like to ever see someone write to a log.

cognitiaclaeves