I'm controlling long-running simulations (hours, days, even weeks) with a bash script that iterates over all the wanted parameter combinations. If only one simulation runs at a time, its output is piped to "tee"; otherwise the output is simply redirected with ">" to an output file. All the outputs are huge: some log files are ~2 GB and could grow even bigger.
The script works, but it's hell to maintain. When we add a new parameter, it takes some time to adapt the script and all the sed-foo in it. So I've ported it to Python. It's working GREAT.
The only problem preventing me from using it in production is that I can't find the right way to call Popen() to launch the program. If I run it "silent" by piping everything to the file and not showing any output, Python takes gigabytes of RAM before the simulation is done.
Here's the code snippet:
    import shlex
    import subprocess

    fh = open(logfile, "w")
    pid = subprocess.Popen(shlex.split(command), stdout=fh)  # actually a Popen object, not a bare pid
    pids.append(pid)
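For context, the surrounding loop looks roughly like this (paraphrased; param_sets, build_command, and make_logfile_name are placeholders, not the real names):

    import shlex
    import subprocess

    pids = []
    for params in param_sets:                   # one run per parameter combination
        command = build_command(params)         # assemble the simulator command line
        logfile = make_logfile_name(params)     # one log file per parameter set
        fh = open(logfile, "w")
        pids.append(subprocess.Popen(shlex.split(command), stdout=fh))

    for p in pids:
        p.wait()                                # block until every simulation finishes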
I've read a lot of stuff about Popen and output redirection, but I thought that piping it to a file would flush the buffer when needed?
Maybe subprocess's Popen() is not the best tool for this? What's the best way to show and save a program's output to screen and file without eating all the RAM?
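To illustrate what I mean by showing and saving at the same time, here's the kind of tee-style loop I'd naively write for the single-simulation case (untested sketch; the 64 KiB chunk size is arbitrary):

    import shlex
    import subprocess
    import sys

    proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE)
    with open(logfile, "wb") as fh:
        # read fixed-size chunks so only one chunk lives in memory at a time
        for chunk in iter(lambda: proc.stdout.read(65536), b""):
            fh.write(chunk)
            sys.stdout.buffer.write(chunk)   # echo to the terminal, like tee
            sys.stdout.buffer.flush()
    proc.wait()

But I'm not sure whether looping over the pipe in Python like this is the right approach, or whether it adds overhead compared to letting the child write to the file directly.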
Thanx!