ansaurus

Question

Python Pipes - What Happens When Reading Output Incrementally

Answer 1

+1 A:

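Your assumption is faulty. gunzip does not have to see the entire file to decompress it. Read the gzip file format (RFC 1952): it is a short header followed by a DEFLATE stream and a trailer. There is no central directory of offsets (that belongs to the zip archive format), so the data can be decompressed sequentially as it arrives.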

It's possible to unzip a file in pieces.

"uncompressed file is being stored somewhere in full...right?"

Not necessarily. Not sure why you're assuming it or where you read it.

All low-level I/O calls can block. The write in gunzip -- when writing to a pipe -- can block when the pipe buffer is full. That's the way I/O to a pipe is defined. Pipe I/O blocks.

Check the pipe(7) man page for details:

If a process attempts to read from an empty pipe, then read(2) will
block until data is available. If a process attempts to write to a
full pipe (see below), then write(2) blocks until sufficient data has
been read from the pipe to allow the write to complete. Non-blocking
I/O is possible by using the fcntl(2) F_SETFL operation to enable the
O_NONBLOCK open file status flag.
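
To see the "full pipe" condition from Python, here is a minimal sketch that fills a fresh pipe until a write would block. The helper name pipe_capacity and the 4096-byte chunk size are my own choices for illustration. It uses os.set_blocking (Python 3.5+, Unix) to set O_NONBLOCK on the write end, so a full pipe raises BlockingIOError instead of hanging the process. The number it reports depends on the operating system, so treat it as an observation, not a guarantee.

```python
import os


def pipe_capacity(chunk_size=4096):
    """Fill a fresh pipe until a write would block; return the bytes accepted."""
    read_fd, write_fd = os.pipe()
    try:
        # Put the write end in non-blocking mode (sets O_NONBLOCK).
        # A write to a full pipe then raises BlockingIOError instead of blocking.
        os.set_blocking(write_fd, False)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Nobody is reading, so the kernel buffer is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("bytes accepted before the pipe was full:", pipe_capacity())
```

Without the non-blocking flag, the last os.write would simply wait until a reader drained the pipe, which is what happens to gunzip when your Python code has not read yet.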

S.Lott 2009-10-23 10:03:38

I have no trouble believing that gunzip can unzip a file incrementally--that's not what I was asking. What I'm wondering is: what stops gunzip from running to completion (and outputting the entire unzipped file's contents) before the Python code in question actually bothers to read lines from that output, provided that gunzip is just another running thread going at its own pace? Or does it block until it can write output?

MarkLu 2009-10-23 10:14:22

Answer 2

+2 A:

This comes from gunzip's implementation, not from Python. gunzip is written in C, and it probably uses fwrite() from C's stdio.h to write its output.

The libc6 implementation I use automatically creates an output buffer. When that buffer fills, fwrite() flushes it with write(), and that call blocks until the pipe has room for more data.
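
The same behaviour can be observed from the reading side. This is a rough sketch, not gunzip itself: a stand-in child process (a small Python script, my own substitute for gunzip) tries to write 1 MiB to its stdout, while the parent deliberately waits before reading. It assumes the pipe buffer is smaller than 1 MiB, which holds for the default settings on Linux and macOS as far as I know. The names CHILD, SIZE and demo are hypothetical.

```python
import subprocess
import sys
import time

SIZE = 1 << 20  # 1 MiB, assumed to be larger than the pipe buffer

# Stand-in for gunzip: writes SIZE bytes to stdout through a user-space buffer.
CHILD = r"""
import sys
sys.stdout.buffer.write(b"x" * (1 << 20))
sys.stdout.buffer.flush()
"""


def demo(pause=0.5):
    """Start the writer, delay reading, then drain its output."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)

    # Do not read yet. The child cannot finish, because its output
    # does not fit in the pipe and nobody is draining it.
    time.sleep(pause)
    still_running = proc.poll() is None

    # Now read incrementally; each read frees room and lets the child continue.
    received = 0
    while True:
        chunk = proc.stdout.read(65536)
        if not chunk:
            break
        received += len(chunk)

    proc.stdout.close()
    exit_code = proc.wait()
    return still_running, received, exit_code


if __name__ == "__main__":
    running, received, code = demo()
    print("child still running before the first read:", running)
    print("bytes received:", received, "exit code:", code)
```

The child exits only after the parent has read everything, so the writer runs at the pace of the reader and never gets far ahead of it.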

nosklo 2009-10-23 10:44:31
