I have a pretty simple problem. I have a large file that goes through three steps: a decoding step using an external program, some processing in Python, and then re-encoding using another external program. I have been using subprocess.Popen() to try to do this in Python rather than building Unix pipes in the shell. However, all of the data ends up buffered in memory. Is there a Pythonic way of doing this task, or am I better off dropping back to a simple Python script that reads from stdin and writes to stdout, with Unix pipes on either side?
Thanks, Sean
Edit: added quick code example below
#!/usr/bin/env python
import os
import sys
import subprocess

def main(infile, reflist):
    print infile, reflist
    samtoolsin = subprocess.Popen(["samtools", "view", infile],
                                  stdout=subprocess.PIPE, bufsize=1)
    samtoolsout = subprocess.Popen(["samtools", "import", reflist, "-",
                                    infile + ".tmp"],
                                   stdin=subprocess.PIPE, bufsize=1)
    for line in samtoolsin.stdout.read():
        if line.startswith("@"):
            samtoolsout.stdin.write(line)
        else:
            linesplit = line.split("\t")
            if linesplit[10] == "*":
                linesplit[9] = "*"
            samtoolsout.stdin.write("\t".join(linesplit))
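
For comparison, here is a minimal sketch of a streaming variant of the same idea. It rests on one observation: calling .read() on the pipe pulls the whole decoded output into memory as a single string (and the loop then walks it character by character), whereas iterating over the pipe object itself yields one line at a time. The helper names fix_line, stream_filter and run_pipeline are hypothetical, the samtools commands are copied unchanged from the snippet above, and the code is written for Python 3 (universal_newlines=True is assumed so the pipes carry text). Stripping the trailing newline before comparing column 11 is also an assumption, added in case that column is the last one on the line.

```python
import subprocess


def fix_line(line):
    """Return the line with column 10 set to "*" when column 11 is "*"."""
    if line.startswith("@"):
        return line  # header lines pass through unchanged
    fields = line.split("\t")
    # Assumption: column 11 may be the last field, so ignore its newline.
    if len(fields) > 10 and fields[10].rstrip("\n") == "*":
        fields[9] = "*"
    return "\t".join(fields)


def stream_filter(src, dst):
    """Copy lines from src to dst one at a time, applying fix_line."""
    for line in src:  # iterating the file object reads lazily, line by line
        dst.write(fix_line(line))


def run_pipeline(infile, reflist):
    """Decode with samtools, filter in Python, re-encode with samtools."""
    decoder = subprocess.Popen(
        ["samtools", "view", infile],
        stdout=subprocess.PIPE, universal_newlines=True)
    encoder = subprocess.Popen(
        ["samtools", "import", reflist, "-", infile + ".tmp"],
        stdin=subprocess.PIPE, universal_newlines=True)
    stream_filter(decoder.stdout, encoder.stdin)
    encoder.stdin.close()   # signal end of input so the encoder can finish
    decoder.stdout.close()
    return decoder.wait(), encoder.wait()
```

Something like run_pipeline(sys.argv[1], sys.argv[2]) would then drive it. Because stream_filter only needs file-like objects, the same function could also be called with sys.stdin and sys.stdout if the plain Unix pipe approach turns out to be simpler.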