Is it possible to programmatically pull a single file from a decently sized .tar.gz without extracting the entire tarball to disk? Essentially, I need to get inside large .tar.gz files over the network and extract one small text file. It seems over-the-top to download and extract the tarball to disk, pull the file out, and then delete everything else. I'm also going to be doing this recursively (e.g. package dependencies, where each text file points to more .tar.gz files), so the less network traffic and the fewer CPU cycles I can get away with, the better.

+3  A: 

From the man page, to extract blah.txt from foo.tar.gz:

tar -xzf foo.tar.gz blah.txt
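
Since the question is about not writing the whole archive to disk, here is a minimal sketch of the same idea on a stream: `-f -` makes tar read the archive from standard input and `-O` writes the member to standard output instead of creating a file. The demo archive and the `print_member` helper are made up for illustration, and `cat`-style redirection stands in for the network fetch (something like `curl -s "$url"` could be piped in instead, which is an assumption about your setup). Note that a gzip stream has no index, so tar still has to read and decompress the data sequentially to reach the member.

```shell
#!/bin/sh
# Build a small demo archive so the example is self-contained.
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
printf 'hello from blah\n' > "$workdir/src/blah.txt"
printf 'other data\n' > "$workdir/src/other.txt"
tar -czf "$workdir/foo.tar.gz" -C "$workdir/src" blah.txt other.txt

# print_member NAME (hypothetical helper): read a .tar.gz stream on stdin
# and write only the member NAME to stdout; nothing is extracted to disk.
print_member() {
    tar -xzOf - "$1"
}

# The redirection stands in for a download piped into the helper.
content=$(print_member blah.txt < "$workdir/foo.tar.gz")
echo "$content"

rm -rf "$workdir"
```

Capturing the output in a variable (or piping it onward) is what makes this convenient for the recursive case, since each text file can be parsed for further archive names without any cleanup step.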

(And this goes on superuser, of course, but hey, prompt answers are nice too.)

Jefromi
Will this work if the file is located in a sub-directory inside the tar.gz structure?
tj111
With modern versions of GNU tar, you don't have to specify the 'z' option on extract - it will deduce that itself (and similarly with the 'j' option for bzip2 compressed files). When creating the file, you do need the option to tell tar what to create - so symmetry does no harm whatsoever.
Jonathan Leffler
@tj111: if the file name in the tar file is `xyz/pqr/abc.txt`, that is what you specify on the command line. You can use shell-style metacharacters too: `xyz/pqr/*` will extract all the files in all the sub-directories under `xyz/pqr` (recent versions of GNU tar need the `--wildcards` option for this when extracting). Just be wary of whether the shell expands the '*' for you - use quotes to stop it doing so.
Jonathan Leffler
@tj111: Yes, of course. And it will create the directories to hold the single file.
Jefromi
@Jonathan Leffler: Yeah, true; I thought about that, but figured I'd quote the manpage literally.
Jefromi
+1  A: 

I echo Jefromi's answer, with the addition of including the path to the file if you have directories in the tar file (this may seem obvious to some, but it wasn't initially clear to me how to specify the directory structure).

For example, if you created the archive from the src/ directory and blah.txt was under release1/shared/, you would go back to the src/ directory (if you want the file extracted to the same place) and run:

tar -xzf tar.gz release1/shared/blah.txt

If you don't remember the directory structure of your tar file (I'm a little disorganized and sometimes forget where I ran tar), you can always run

tar -tzf tar.gz

to see the contents, canceling out (Ctrl+C) once you get an idea of your directory structure.
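
As a rough sketch of how the listing and the extraction can be combined in a script: filter the output of `-t` with grep to find the stored member name, then pass that exact name to `-x`. The archive name `demo.tar.gz`, the directory layout, and the `out` directory are invented for the example.

```shell
#!/bin/sh
# Build a demo archive with a nested file.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/release1/shared"
printf 'demo\n' > "$workdir/src/release1/shared/blah.txt"
tar -czf "$workdir/demo.tar.gz" -C "$workdir/src" release1

# List the member names and keep the one whose last component is blah.txt.
member=$(tar -tzf "$workdir/demo.tar.gz" | grep '/blah\.txt$')
echo "$member"

# Extract just that member, using the name exactly as it was listed.
mkdir "$workdir/out"
tar -xzf "$workdir/demo.tar.gz" -C "$workdir/out" "$member"
extracted=$(cat "$workdir/out/$member")
echo "$extracted"

rm -rf "$workdir"
```

Using the name exactly as printed by the listing avoids guessing at the stored path.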

Chance
Thanks for this. Although it appears you need a `./` in front of the directories for whatever reason to scan into them. e.g. `tar -xzf tar.gz ./release1/shared/blah.txt`
tj111
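
A possible explanation for the `./` in the comment above, offered as a guess about how that particular archive was made: tar matches member names as they were stored, and an archive created by archiving `.` stores every name with a leading `./`. The sketch below builds two throwaway archives (names and layout are invented) to show the difference in the listing.

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir -p "$workdir/src/release1/shared"
printf 'demo\n' > "$workdir/src/release1/shared/blah.txt"

# Archiving "." stores every member name with a leading "./".
tar -czf "$workdir/dot.tar.gz" -C "$workdir/src" .
dot_name=$(tar -tzf "$workdir/dot.tar.gz" | grep 'blah\.txt$')

# Archiving the directory by name stores no such prefix.
tar -czf "$workdir/plain.tar.gz" -C "$workdir/src" release1
plain_name=$(tar -tzf "$workdir/plain.tar.gz" | grep 'blah\.txt$')

echo "$dot_name"
echo "$plain_name"

rm -rf "$workdir"
```

Whichever form the listing shows is the form to pass on the extract command line.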