Suppose you run the following commands at the same time (for example, each in its own terminal):

ssh $host1 'while [ 1 ]; do sleep 1; echo "Hello from $HOSTNAME"; done' > /tmp/output
ssh $host2 'while [ 1 ]; do sleep 1; echo "Hello from $HOSTNAME"; done' >> /tmp/output
ssh $host3 'while [ 1 ]; do sleep 1; echo "Hello from $HOSTNAME"; done' >> /tmp/output

Then the output would look like:

Hello from host1
Hello from host2
Hello from host3
Hello from host1
...

But what if I changed it to

ssh $host1 'while [ 1 ]; do sleep 1; cat /some/large/file1.txt; done' > /tmp/output
ssh $host2 'while [ 1 ]; do sleep 1; cat /some/large/file2.txt; done' >> /tmp/output
ssh $host3 'while [ 1 ]; do sleep 1; cat /some/large/file3.txt; done' >> /tmp/output

so that the stdout from each host no longer fits into a single buffer? Would the integrity of file[1-3].txt (though not their order) still be maintained in this case? Or could a fragment of one file slip into the middle of another, like this?

[file1_fragment1] [file2_fragment1] [file1_fragment2] [file1_fragment3] [file3_fragment1] ...
+2  A: 

I would say the possibility of that happening is pretty much 100% ;-) assuming it takes a long time to cat one file over the network.

The data will be written to /tmp/output on the local system in approximately the same order in which it is received. The shell doesn't know to hold on to data that comes from ssh command #2 or #3 until there's a break in #1, and besides, it has no idea where each iteration of file 1 ends.

David Zaslavsky
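The effect described in this answer can be reproduced without any remote hosts. The following is a minimal sketch under stated assumptions: three local background writers stand in for the three ssh commands, and writer is a hypothetical helper, not something from the question. Each writer emits its fragments one second apart, and all of them append to the same file.

```shell
#!/bin/sh
# Local stand-in for the three ssh commands (an assumption for illustration:
# no remote hosts are involved, and "writer" is a hypothetical helper).
out=$(mktemp)

writer() {  # $1 = label, $2 = number of fragments to emit
    i=1
    while [ "$i" -le "$2" ]; do
        sleep 1
        printf '[%s_fragment%d]\n' "$1" "$i"
        i=$((i + 1))
    done
}

# All three writers append to the same file at the same time.
writer file1 2 >> "$out" &
writer file2 2 >> "$out" &
writer file3 2 >> "$out" &
wait

cat "$out"
```

Every fragment ends up in the file, but fragments of file1 are typically separated by fragments of file2 and file3, because nothing holds back one writer until another has finished. The exact order can differ from run to run.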
So data integrity is only maintained up to the remote host's buffer size (or localhost's, if its buffer is smaller)? Where do you find the buffer size?
OTZ
No idea, though you could easily run some tests to find out. (Make one file that consists of all A's, another of all B's, and another of all C's, and use them in your example.) It depends on more than just the buffer size, though; sometimes data is flushed whenever a newline is written, so data integrity is only "guaranteed" on a line-by-line basis.
David Zaslavsky
Dang... I wrote up my experiment showing that data integrity is not kept in this case, but then I clicked a link on the page and lost the write-up. The details were interesting, though: in particular, the first 4 MB (2 MB from one remote host, 2 MB from the other) were received without any mixing.
OTZ
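The A/B/C test suggested in the comments could be set up roughly as follows. This is a minimal sketch, not what either commenter actually ran: make_marker_file, count_mixed_lines, and show_runs are hypothetical helper names, and the 60-character line length is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helpers for the A/B/C test; names and sizes are assumptions.

# Write $2 lines of 60 copies of the letter $1 to the file $3.
make_marker_file() {
    awk -v c="$1" -v n="$2" 'BEGIN {
        line = sprintf("%60s", "")   # 60 spaces
        gsub(/ /, c, line)           # turn each space into the marker letter
        for (i = 0; i < n; i++) print line
    }' > "$3"
}

# Count lines that contain at least two different marker letters,
# i.e. lines that were torn apart and stitched together from two files.
count_mixed_lines() {
    grep -c -E 'A.*[BC]|B.*[AC]|C.*[AB]' "$1"
}

# Show how many consecutive lines start with the same letter, which
# reveals interleaving even when every individual line is intact.
show_runs() {
    cut -c1 "$1" | uniq -c
}
```

After generating the three files, copying one to each host, and running the three ssh commands from the question with those files, count_mixed_lines /tmp/output would report how many lines were split between files, and show_runs /tmp/output would show how long each uninterrupted run from a single file is.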