tarfile

How to determine if data is a valid tar file

My upload form expects a tar file and I want to check whether the uploaded data is valid. The tarfile module supports is_tarfile(), but expects a filename - I don't want to waste resources writing the file to disk just to check if it is valid. So, is there a way to check the data is a valid tar file without writing to disk, using stand...
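One possible approach, as a minimal sketch: wrap the uploaded bytes in an in-memory file object and let `tarfile.open` try to parse it. The helper name `looks_like_tar` is hypothetical, and the sketch assumes the upload is already available as a `bytes` object.

```python
import io
import tarfile


def looks_like_tar(data: bytes) -> bool:
    """Return True if `data` parses as a (possibly compressed) tar archive.

    Hypothetical helper: nothing is written to disk, the bytes are
    wrapped in an in-memory file object instead.
    """
    try:
        # "r:*" lets tarfile detect gzip/bz2/xz compression transparently.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
            # Walk every header so truncated archives are also caught.
            tar.getmembers()
        return True
    except (tarfile.TarError, EOFError):
        return False
```

Reading all member headers costs a full pass over the data, but it catches archives that start with a valid header and are corrupt further in.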

Why does tarfile.extractall ignore errors by default?

Python's tarfile module ignores errors during extraction by default, unless errorlevel is set to either 1 or 2 (or debug to 1 if only error messages need to be printed). Try doing a mkdir /tmp/foo && sudo chown root /tmp/foo && chmod a-w /tmp/foo and using tarfile to extract a .tar.gz file over /tmp/foo -- you will see that your Python...

How to create full compressed tar file using Python?

How can I create a .tar.gz file that compresses the data as much as tar can... ...
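A minimal sketch of one way to do this, assuming "as much as possible" means gzip's highest level: `tarfile.open` accepts a `compresslevel` keyword in `"w:gz"` mode. The function name `make_tar_gz` and its parameters are hypothetical.

```python
import tarfile


def make_tar_gz(archive_path, source_dir, arcname=None):
    """Pack `source_dir` into a gzip-compressed tar at `archive_path`.

    Hypothetical helper; `arcname` optionally renames the top-level
    directory inside the archive.
    """
    # compresslevel=9 requests gzip's strongest (and slowest) setting.
    with tarfile.open(archive_path, "w:gz", compresslevel=9) as tar:
        tar.add(source_dir, arcname=arcname)
```

If archive size matters more than compatibility, `"w:xz"` or `"w:bz2"` modes often produce smaller files than gzip, at the cost of speed.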

tarfile: determine compression of an open tarball

I am working on a Python script which is supposed to process a tarball and output a new one, trying to keep the format of the original. Thus, I am looking for a way to look up the compression method used in an open tarball so I can open the new one with the same compression. AFAICS TarFile class doesn't provide any public interface to get the nee...
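One workaround, sketched here as an assumption rather than a tarfile feature: inspect the leading magic bytes of the original file and pick the matching write mode. The names `sniff_compression` and `open_like` are hypothetical, and the sketch assumes the original tarball is reachable by path.

```python
import tarfile

# Leading magic bytes of the compressed formats tarfile can write.
_MAGIC = (
    (b"\x1f\x8b", "gz"),
    (b"BZh", "bz2"),
    (b"\xfd7zXZ\x00", "xz"),
)


def sniff_compression(path):
    """Return "gz", "bz2", "xz", or "" (treated as uncompressed)."""
    with open(path, "rb") as f:
        head = f.read(6)
    for magic, name in _MAGIC:
        if head.startswith(magic):
            return name
    return ""


def open_like(original_path, new_path):
    """Open `new_path` for writing with the compression of `original_path`."""
    # The mode "w:" (empty suffix) means uncompressed writing.
    return tarfile.open(new_path, "w:" + sniff_compression(original_path))
```

This only preserves the compression method; the tar format variant (for example GNU or pax) is a separate setting.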

Get python tarfile to skip files without read permission

I'm trying to write a function that backs up a directory with files of different permissions to an archive on Windows XP. I'm using the tarfile module to tar the directory. Currently, as soon as the program encounters a file that does not have read permissions, it stops, giving the error: IOError: [Errno 13] Permission denied: 'path to fi...

Organizing files in tar bz2 file with python

I have about 200,000 text files that are placed in a bz2 file. The issue I have is that when I scan the bz2 file to extract the data I need, it goes extremely slowly. It has to look through the entire bz2 file to find the single file I am looking for. Is there any way to speed this up? Also, I thought about possibly organizing the files in...

Python tarfile module overwrites existing files during extraction - how to disable it?

Is there a way to prevent tarfile.extractall (API) from overwriting existing files? By "prevent" I mean ideally raising an exception when an overwrite is about to happen. The current behavior is to silently overwrite the files. ...
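A minimal sketch of one way to get that behavior, assuming a pre-check before extraction is acceptable: look at each member's target path and raise before anything is written. The function name `extract_no_overwrite` is hypothetical, and the sketch does not guard against path traversal in member names.

```python
import os
import tarfile


def extract_no_overwrite(archive_path, dest):
    """Extract into `dest`, raising FileExistsError instead of overwriting.

    Hypothetical helper. Existing directories are allowed, since
    extracting into them does not destroy data by itself.
    """
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        # Check everything first so a conflict leaves `dest` untouched.
        for member in members:
            target = os.path.join(dest, member.name)
            if not member.isdir() and os.path.lexists(target):
                raise FileExistsError(target)
        tar.extractall(dest, members=members)
```

Because the check and the extraction are separate steps, a file created by another process in between could still be overwritten.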