I have a bunch of tar files in a directory and I want to extract all the files from them at once. But this doesn't seem to do anything:

$ tar xf *.tar

What's going on here? How do I untar a bunch of files at once?

+9  A: 

What's going on here?

Originally, the tar command was intended for use with magnetic tape devices. Since it only made sense to execute tar on one device at a time, the syntax was designed to assume one and only one device. The first file or directory passed was assumed to be the device that held the archive in question, and any other files or directories were the contents of the archive to be included in the operation. So for tar extraction (the x option), the first file passed would be the archive and all other files would be the files to be extracted. So if there are two *.tar files (say a.tar and b.tar), your command would expand to:

$ tar xf a.tar b.tar

Unless a.tar contains a file named b.tar, the tar command has nothing to do and exits quietly. Annoyingly, the Solaris version of tar does not report any problems either in the return code or with the verbose option (v). Meanwhile, GNU tar returns 2 and spams STDERR even with the verbose option off:

tar: b.tar: Not found in archive
tar: Exiting with failure status due to previous errors
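
To see this for yourself, here is a small sketch that reproduces the situation in a scratch directory. The member names (one.txt, two.txt) and the use of mktemp -d are assumptions made for illustration, and the exact message and exit status will depend on which tar you have.

```shell
# Build two small archives in a throwaway directory (all names are made up).
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo one > one.txt
echo two > two.txt
tar cf a.tar one.txt
tar cf b.tar two.txt
rm one.txt two.txt

# The shell expands *.tar before tar runs, so this is really
# "tar xf a.tar b.tar": extract the member named b.tar from a.tar.
tar xf *.tar || echo "tar exited with status $?"

# Neither one.txt nor two.txt comes back; only the archives are listed.
ls
```

Whatever your tar prints, the listing at the end should show only a.tar and b.tar.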

How do I untar a bunch of files at once?

It's too late to rewrite tar to accept multiple archive files as input, but it's not too hard to work around the limitation.

For most people, running tar multiple times for multiple archives is the most expedient option. Passing just one filename to tar xf will extract all the archived files as one would expect. One approach is to use a shell for loop:

$ for f in *.tar; do tar xf "$f"; done

Another method is to use xargs:

$ ls *.tar | xargs -i tar xf {}
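
If the archive names might contain spaces, or the directory might contain no archives at all, the loop approach can be hardened a little. The following is only a sketch; the function name untar_all is made up for this example.

```shell
# Hypothetical helper: extract every .tar file in the current directory,
# one tar invocation per archive.
untar_all() {
  status=0
  # The ./ prefix keeps a file name that starts with "-" from looking like an option.
  for f in ./*.tar; do
    # If nothing matched, the pattern is left as-is; skip it.
    [ -e "$f" ] || continue
    # Quoting "$f" keeps names with spaces in one piece.
    tar xf "$f" || { echo "failed: $f" >&2; status=1; }
  done
  return "$status"
}
```

Calling untar_all in the directory with the archives extracts them in glob order and returns non-zero if any extraction failed.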

You can also use one of a number of alternative tar file readers. Finally, the truly dedicated programmer could easily write a tar replacement that works exactly as desired. The format is straightforward and many programming languages have libraries available to read tar files. If you are a Perl programmer, for instance, take a look at the Archive::Tar module.

A warning

Blindly untarring a bunch of files can cause unexpected problems. The most obvious is that a particular file name may be included in more than one tar file. Since tar overwrites files by default, the exact version of the file you end up with will depend on the order the archives are processed. More troubling, you may end up with a corrupted copy of the file if you try this "clever" optimization:

for f in *.tar; do
  tar xf "$f" &
done
wait

If both a.tar and b.tar contain the same file and try to extract it at the same time, the results are unpredictable.

A related issue, especially when taking archives from an untrusted source, is the possibility of a tarbomb.

One partial solution would be to automatically create a new directory to extract into:

for f in *.tar; do
  d=$(basename "$f" .tar)
  mkdir "$d"
  (cd "$d" && tar xf "../$f")
done

This won't help if a file is specified in the archive with an absolute path (which is normally a sign of malicious intent). Adding that sort of check is left as an exercise for the reader.
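
For what it's worth, here is one rough way such a check might look. It is a sketch, not a complete defense: the function names are made up, it assumes each archive is named by a relative path in the current directory and that member names contain no newlines, and it only looks for absolute paths and ".." components (symlink tricks are not covered). GNU tar, at least, already strips a leading / from member names unless you pass -P.

```shell
# Hypothetical helper: reads member names on stdin, one per line, and
# succeeds only if none is absolute or contains a ".." path component.
member_names_look_safe() {
  ! grep -E '^/|(^|/)\.\.(/|$)' >/dev/null
}

# Hypothetical helper: run the check above on the listing of one archive.
archive_paths_look_safe() {
  names=$(tar tf "$1") || return 1   # an unreadable archive is not "safe"
  printf '%s\n' "$names" | member_names_look_safe
}

# Hypothetical helper: extract one archive into a directory named after it,
# but only if its member names pass the check.
extract_checked() {
  archive=$1
  dir=$(basename "$archive" .tar)
  if archive_paths_look_safe "$archive"; then
    mkdir -p "$dir" && (cd "$dir" && tar xf "../$archive")
  else
    echo "skipping $archive: suspicious member names" >&2
    return 1
  fi
}
```

With those defined, the loop becomes:

$ for f in *.tar; do extract_checked "$f"; done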

Jon Ericson
My tar man page says "The -l and -i options appear in the 1997 version of the POSIX standard, but do not appear in the 2004 version of the standard. Therefore you should use -L and -I instead, respectively." For this task, I'd be careful with xargs and use "-n1" to avoid the original problem.
slacy
You'd need to be very careful that the paths of the files contained in the tar files don't overlap or you may have race conditions with regard to which untar wins -- or even some extract errors if more than one untar tries to write the same file at the same time.
tvanfosson
Well there can't be a race condition, since the tar commands are run serially. But of course having multiple archives holding files named the same can cause problems.
Jon Ericson
@slacy: I think you mean the GNU xargs manpage. I suppose you are right, but I really don't like the way that would turn out: "xargs -I {} tar xf {}". On the other hand, "-n1" works fine in this case: "| xargs -n1 tar tf". It's even an improvement. But that's a new trick to this old dog. ;-)
Jon Ericson
I removed the suggestion `cat *.tar | tar tf -`. It turns out to not work as it only untars the first archive. I was fooled by the commonly used `gunzip -c | tar xf -` idiom. Sorry about that!
Jon Ericson