I want to capture all the progress messages emitted by an rsync process from a Perl script. In certain circumstances this isn't working.

Here's a typical rsync command line I use:

rsync -aL --verbose --progress --bwlimit=100 \
  --include-from=/tmp/78hJ2eDCs1 \
  --include '*/' --exclude '*' \
  /srcdir/* \
  hostname:/target/ 2>&1

If I run this within a bash shell, I'll see something like this:

Building file list ...
1600 files...
1700 files...
and so on

If I run the same command from Perl, I get the "Building file list" output OK, but not the status updates. Here's how I test the capture:

my $pid = open(OUTPUT, "$cmd |")  or die "Couldn't fork: $!\n";

my $ch;
while(read(OUTPUT, $ch, 1)==1)
{
    print $ch;
}
close(OUTPUT);

My guess is that either rsync senses that the output handle isn't a typical console, or the status updates are being output in some unusual manner that I'm not capturing. What makes it even odder is that if I omit the --include and --exclude filters, I can capture the status messages just fine.

Does anyone have any clues as to what is going on?

+4  A: 

Does Perl buffer the output from pipes? If so, you might be able to get it working if, after you open the OUTPUT handle, you turn off buffering with OUTPUT->autoflush(1);

Paul Tomblin
I tried that to no avail - the odd thing is that I *can* capture all the output if I don't use the --include/exclude options.
Paul Dixon
+4  A: 

You could use Expect.pm, which mimics a PTY; that may give you the output you are looking for.

Failing that, you could try the --stats or --progress options.

dsm
Not sure I can just iteratively eat the output with Expect, but the idea of mimicking a pseudo TTY might prove fruitful...will explore that.
Paul Dixon
Nope, I tried using IO::Pty::Easy and I saw the exact same behaviour.
Paul Dixon
+2  A: 

Turns out the solution was simple - I just had to unbuffer the IO in my script with $| = 1;
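A minimal sketch of that fix, for illustration only. It assumes the read loop was receiving the data all along and that the prints to the script's own STDOUT were being held back, most likely because the status updates end in a carriage return rather than a newline. The helper name `stream_command` is hypothetical, and the `printf` command is only a stand-in for the real rsync invocation.

```perl
use strict;
use warnings;

# $| applies to the currently selected output handle, which is STDOUT
# unless select() has changed it. Setting it makes every print flush.
$| = 1;

# Hypothetical helper (not from the original post): run a shell command,
# echo its output one byte at a time, and return everything that was read.
sub stream_command {
    my ($cmd) = @_;

    open(my $out, '-|', $cmd) or die "Couldn't fork: $!\n";

    my $captured = '';
    my $ch;
    while (read($out, $ch, 1)) {
        print $ch;          # flushed immediately because of $| = 1
        $captured .= $ch;
    }
    close($out);

    return $captured;
}

# Stand-in for the rsync command: status lines ending in "\r", not "\n".
my $seen = stream_command(q{printf 'Building file list ...\n1600 files...\r1700 files...\r'});
```

Calling autoflush on the pipe handle would not have the same effect, since that handle is only read from; the handle that needs flushing is the one the script prints to.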

I'm still puzzled as to why I observed the problem with some rsync options and not others. Thank you, Paul Tomblin and dsm, for giving me ideas.

Paul Dixon
Funny, I thought $| and ->autoflush were equivalent, that's why I suggested it.
Paul Tomblin
I guess that autoflush acted on the OUTPUT handle, while $| was acting on the script's STDOUT? i.e. somehow I just wasn't getting the output flushed?
Paul Dixon
A: 

You might also want to use IPC::Run and its callbacks for data on stdout/stderr.
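A rough sketch of what that could look like, assuming IPC::Run is installed from CPAN. The child command here is a stand-in that prints rsync-like status lines, and `$on_output` is a hypothetical callback name; IPC::Run calls a code reference with each chunk of output as it arrives.

```perl
use strict;
use warnings;
use IPC::Run qw(run);

# Flush this script's STDOUT on every print so partial lines show up.
$| = 1;

# Stand-in child (an assumption, not the real rsync command): it prints
# one normal line followed by status updates that end in "\r".
my @cmd = ($^X, '-e',
    'print "Building file list ...\n"; print "$_ files...\r" for 1600, 1700;');

my $captured = '';

# Callback invoked with each chunk read from the child.
my $on_output = sub {
    my ($chunk) = @_;
    print $chunk;
    $captured .= $chunk;
};

# \undef closes the child's stdin; stdout and stderr share one callback.
run \@cmd, \undef, $on_output, $on_output
    or die "command failed: $?\n";
```

Passing the command as a list avoids the shell, so a redirect such as 2>&1 is not needed; stderr is routed to the same callback instead.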

depesz
A: 

"Suffering from Buffering?" is the best explanation of buffering effects I have found so far. It is well worth the time to read and understand.

Don't forget that buffering is there for a reason, so don't just blindly turn it off for every possible problem.

dpavlin