I have a directory of zip archives. The archives contain .jpg, .png, and .gif images. I want to unzip each archive, taking only the images and putting them in a folder named after the archive.

So, given files/archive1.zip, files/archive2.zip, files/archive3.zip, files/archive4.zip:

Open archive1.zip, take sunflower.jpg and rose_sun.gif, make a folder files/archive1/, and add the images to that folder, giving files/archive1/sunflower.jpg and files/archive1/rose_sun.gif.

Then do the same for each archive.

I really don't know how this can be done; all suggestions are welcome.

PS: I have over 600 archives, so an automatic solution would be a lifesaver, preferably one that runs on Linux.

+1  A: 

7zip can do this, and has a Linux version.

mkdir files/archive1
7z e -ofiles/archive1/ files/archive1.zip '*.jpg' '*.png' '*.gif'

(Just tested it, it works.)

MiffTheFox
Let me try it right now, thanks.
vache
But then I would have to run this for each zip. Can I add a while loop or something?
vache
+1  A: 

You can write a program using a zip library. If you use Mono, you can use DotNetZip.

The code would look like this:

foreach (var archive in listOfZips)
{
    using (var zip = ZipFile.Read(archive))
    {
        foreach (ZipEntry e in zip)
        {
            if (IsImageFile(e.FileName))
            {
                e.FileName = System.IO.Path.Combine(archive.Replace(".zip",""), 
                                  System.IO.Path.GetFileName(e.FileName));
                e.Extract("files");
            }
        }
    }
}
Cheeso
Well, I want to keep it on Linux, so preferably not .NET. But then I should be able to do the same thing using, say, a Java zip library, no?
vache
Sorry, I don't know if the Java zip libraries have a similar function. I mean, I'm sure you could do it; it's a simple matter of programming. But the question is how much programming. When you say "I want to keep it on Linux, so preferably not .NET", are you aware that Mono runs on Linux? In other words, you can use C# and .NET on Linux.
Cheeso
+1  A: 

Something along the lines of:

#!/bin/bash
cd ~/basedir/files || exit 1
for file in *.zip ; do
    # Strip the .zip suffix to get the target directory name.
    newfile="${file%.zip}"
    echo ":${newfile}:"
    mkdir tmp
    rm -rf "${newfile}"
    mkdir "${newfile}"
    cp "${newfile}.zip" tmp
    cd tmp
    unzip "${newfile}.zip"
    # Copy only the images, flattening any directory structure.
    find . -name '*.jpg' -exec cp {} "../${newfile}" ';'
    find . -name '*.png' -exec cp {} "../${newfile}" ';'
    find . -name '*.gif' -exec cp {} "../${newfile}" ';'
    cd ..
    rm -rf tmp
done

This is tested and will handle spaces in filenames (both the zip files and the extracted files). You may have collisions if the zip file has the same file name in different directories (you can't avoid this if you're going to flatten the directory structure).
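
One possible way to sidestep such collisions, which is not part of the tested script above, is to pick a fresh name whenever the destination already exists. The sketch below assumes a hypothetical helper called unique_dest that appends _1, _2, and so on before the extension; the function name and the numbering scheme are illustrative choices only.

```shell
#!/bin/bash
# Hypothetical helper: print a path inside directory $1 for file name $2
# that does not collide with a file that already exists there.
# The body runs in a subshell, so its variables stay local.
unique_dest() (
    dir=$1
    name=$2
    base=${name%.*}    # name without its last extension
    ext=${name##*.}    # last extension, or the whole name if there is none
    if [ "$base" = "$name" ]; then
        ext=""         # no dot in the name, so no extension to keep
    else
        ext=".${ext}"
    fi
    candidate="${dir}/${name}"
    n=1
    # Keep trying name_1.ext, name_2.ext, ... until one is free.
    while [ -e "$candidate" ]; do
        candidate="${dir}/${base}_${n}${ext}"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
)
```

Because find -exec cannot call a shell function directly, using this would mean looping over the found files in the shell and copying each one with cp "$f" "$(unique_dest "../${newfile}" "$(basename "$f")")".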

paxdiablo
This would be an excellent solution, except that the temp directory ends up wasting I/O and system resources. You should add wildcards to the unzip call (add '*.jpg' '*.png' '*.gif' to the end). Also, you should avoid copying the zip file and instead use "unzip ../${newfile}.zip".
MiffTheFox
I don't believe that level of efficiency is a real concern here, this looks like a one-shot operation to me (or one that wouldn't be done often enough to warrant over-engineering). The end result is what the OP wanted, the graphic files in a specific directory based on the archive name.
paxdiablo
Yeah, this would be a one-time thing :). I am testing all these solutions locally right now, then I will try them on a larger number of zips on the server.
vache
OK, so the big problem with this is that the archives have names with spaces in them, and the code above creates a bunch of folders, one for each space-separated word in the zip name.
vache
This simpler one does something similar: for file in zip/*.zip ; do newfile=$(echo ${file}) ; unzip ${newfile} '*.jpg' '*.png' '*.gif' ; done
vache
But it too breaks when there are spaces in the zip name.
vache
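The breakage with spaces most likely comes from shell word splitting: an unquoted ${file} is split at each space into separate arguments, while "${file}" stays in one piece. A minimal sketch of the effect, where count_args is a hypothetical helper and the archive name is made up:

```shell
#!/bin/bash
# Hypothetical helper: print how many arguments it received.
count_args() {
    echo "$#"
}

file="my holiday pics.zip"   # made-up archive name containing spaces

unquoted=$(count_args ${file})    # split at the spaces into three words
quoted=$(count_args "${file}")    # passed through as a single argument

echo "unquoted: ${unquoted}, quoted: ${quoted}"
# prints: unquoted: 3, quoted: 1
```

Quoting every expansion of the file name, as in unzip "${file}", avoids the split.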
It's been fixed to handle spaces now in both the zip files and the zipped files within them.
paxdiablo
Wow, nice. A few more tests, but I think it's working 100%.
vache
This will work, since the zips all have unique names and they are all in one folder.
vache
Adding '*.jpg' '*.png' '*.gif' to the end of the unzip call speeds this up a lot. And if you add that, then there is no need for the extension in the copy line, no?
vache
That's right but you *will* need a "-type f" on the find to only get files rather than directories (I should have done that already in case you had a directory called dir.jpg). Replace the finds with a single "find . -type f -exec cp {} "../${newfile}" ';'"
paxdiablo
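Putting the suggestions from this comment thread together, the temp directory and the find calls could probably be dropped altogether by letting unzip filter and flatten in one step. This is an untested sketch that assumes Info-ZIP unzip; archive_dir and extract_images are hypothetical names introduced here.

```shell
#!/bin/bash
# Hypothetical helper: print the target directory for a zip path,
# which is the same path with the .zip suffix removed.
archive_dir() {
    printf '%s\n' "${1%.zip}"
}

# Hypothetical helper: extract only the image entries of one archive
# into a directory named after it. Runs in a subshell to keep variables local.
extract_images() (
    zip_path=$1
    dir=$(archive_dir "$zip_path")
    mkdir -p "$dir" || exit 1
    # -j drops the directory structure stored inside the archive,
    # -o overwrites existing files without prompting,
    # -q keeps the output quiet.
    # The patterns are quoted so that unzip, not the shell, expands them.
    # unzip exits with a non-zero status if no entry matches the patterns.
    unzip -j -o -q "$zip_path" '*.jpg' '*.png' '*.gif' -d "$dir"
)
```

It could then be run over the whole folder with for zip in files/*.zip ; do extract_images "$zip" ; done. The patterns are case-sensitive by default, so entries such as PHOTO.JPG would be skipped unless unzip's -C option is added.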
+1  A: 

Perl's Archive-Zip is a good library for zipping/unzipping.

Alan Haggai Alavi
A: 

Here's my take on the first answer...

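#!/bin/bash
cd files || exit 1
for zip_name in *.zip ; do
    # Strip the .zip suffix to get the directory name.
    dir_name="${zip_name%.zip}"
    mkdir "${dir_name}"
    # Quote the names (they may contain spaces) and the patterns
    # (so that 7z, not the shell, expands them).
    7z e -o"${dir_name}/" "${zip_name}" '*.jpg' '*.png' '*.gif'
done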

or, if you'd just like to use the regular unzip command...

unzip -d "${dir_name}/" "${zip_name}" '*.jpg' '*.png' '*.gif'

I haven't tested this, but it should work... or something along these lines. Definitely more efficient than the first solution. :)

Hope this helps!

Rouben