So, there seem to be a few questions asking about removing files/directories matching certain cases, but I'm looking for the exact opposite: Delete EVERYTHING in a folder that DOESN'T match my provided examples.

For example, here is an example directory tree:

.
|-- coke
|   |-- diet
|   |-- regular
|   `-- vanilla
|-- icecream
|   |-- chocolate
|   |-- cookiedough
|   |-- cupcake
|   |   |-- file1.txt
|   |   |-- file2.txt
|   |   |-- file3.txt
|   |   |-- file4.txt
|   |   `-- file5.txt
|   `-- vanilla
|-- lol.txt
|-- mtndew
|   |-- classic
|   |-- codered
|   |-- livewire
|   |   |-- file1.txt
|   |   |-- file2.txt
|   |   |-- file3.txt
|   |   |-- file4.txt
|   |   `-- file5.txt
|   `-- throwback
`-- pepsi
    |-- blue
    |-- classic
    |-- diet
    `-- throwback

I want to delete everything but the files in test/icecream/cupcake/ and test/mtndew/livewire/. Everything else can go, including the directory structure. So, how can I achieve this? Languages I wouldn't mind this being in: bash or python.

A: 

Use find.

Your command will look something like:

find $directory \( -prune 'some pattern' \) -delete
dsm
Tried this, and it says the following:
find /home/phuzion/test/ \( -prune 'icecream/cupcake/' \) -delete
find: paths must precede expression
phuzion
From the man page for find: "Because -delete implies -depth, you cannot usefully use -prune and -delete together."
Dennis Williamson
The correct syntax would be more like "find $directory -path ./pepsi/diet -prune -o -exec some-command '{}' \;" anyway to eliminate that error message.
Dennis Williamson
+2  A: 

You could do something based on Python's os.walk function:

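import os

top = 'test'  # directory to clear out; adjust to your path

# Walk bottom-up so files are removed before the directories that hold them.
for root, dirs, files in os.walk(top, topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))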

...just add something to ignore the paths you're interested in.
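
One way to fill in the "ignore" part is sketched below. The function name delete_except and its keep parameter are made up for this illustration, and the sketch assumes the directories to keep are given as paths (not patterns). Ancestors of a kept directory survive simply because they are never empty.

```python
import os

def delete_except(top, keep):
    """Delete everything under top except the directories in keep and their contents."""
    keep = [os.path.abspath(k) for k in keep]

    def is_kept(path):
        # True if path is one of the kept directories or lies inside one.
        path = os.path.abspath(path)
        return any(path == k or path.startswith(k + os.sep) for k in keep)

    # Walk bottom-up so directories are handled after their contents.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            path = os.path.join(root, name)
            if not is_kept(path):
                os.remove(path)
        for name in dirs:
            path = os.path.join(root, name)
            if is_kept(path):
                continue
            if os.path.islink(path):
                os.remove(path)   # symlink to a directory: remove the link itself
            elif not os.listdir(path):
                os.rmdir(path)    # only empty directories are removed
```

For the tree in the question this would be called as delete_except('test', ['test/icecream/cupcake', 'test/mtndew/livewire']).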

Jon Cage
+3  A: 

Everything "except" is why we have if-statements; and why os.walk's list of directories is a mutable list.

import os

for path, dirs, files in os.walk('root'):
    # Removing an entry from dirs in place stops os.walk from descending into it.
    for skip in ('coke', 'pepsi'):
        if skip in dirs:
            dirs.remove(skip)
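
Applied to the question, the same in-place pruning can be turned around: prune the directories to keep, and whatever the walk still reaches is a candidate for deletion. This is only a sketch; paths_to_delete and keep are hypothetical names, and it returns files only, so the emptied directories would still need a second pass.

```python
import os

def paths_to_delete(top, keep):
    """Return the files under top that lie outside the kept directories."""
    keep = {os.path.normpath(k) for k in keep}
    doomed = []
    for path, dirs, files in os.walk(top):
        # Edit dirs in place: os.walk will not descend into removed entries.
        dirs[:] = [d for d in dirs
                   if os.path.normpath(os.path.join(path, d)) not in keep]
        doomed.extend(os.path.join(path, f) for f in files)
    return sorted(doomed)
```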
S.Lott
+2  A: 

find's -prune comes to mind, but it's a pain to get it to work for specific paths (icecream/cupcake/) rather than specific directories (cupcake/).

Personally, I'd just use cpio and hard-link (to avoid having to copy them) the files in the directories you want to keep to a new tree and then remove the old one:

find test -path 'test/icecream/cupcake/*' -o -path 'test/mtndew/livewire/*' | cpio -padluv test-keep
rm -rf test

That'll also keep your existing directory structure for the directories you intend to keep.
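
Since Python was also on the table, here is a rough equivalent of the hard-link idea. The function name keep_by_hardlink is invented for this sketch, keep is assumed to hold paths relative to src, and both trees must live on the same filesystem for os.link to work. Unlike the cpio command, it does not repeat the "test/" prefix inside the new tree.

```python
import os
import shutil

def keep_by_hardlink(src, dst, keep):
    """Hard-link files under the kept subdirectories of src into dst, then remove src."""
    for rel in keep:
        for root, dirs, files in os.walk(os.path.join(src, rel)):
            # Recreate the same relative directory inside dst.
            target = os.path.join(dst, os.path.relpath(root, src))
            os.makedirs(target, exist_ok=True)
            for name in files:
                # A hard link is a second name for the same data, so nothing is copied.
                os.link(os.path.join(root, name), os.path.join(target, name))
    # The linked files survive because dst still references them.
    shutil.rmtree(src)
```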

lhunath
+3  A: 

Move the stuff you want to keep elsewhere, then delete what's left.
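
In Python that could look roughly like the following. This is a sketch under assumptions: keep_by_moving is a made-up name, keep holds paths relative to top, and the stash directory is created next to top so the moves stay on one filesystem. It also moves the kept directories back into place afterwards, which goes one step beyond the sentence above.

```python
import os
import shutil
import tempfile

def keep_by_moving(top, keep):
    """Move the kept subdirectories aside, delete top, then put them back."""
    parent = os.path.dirname(os.path.abspath(top))
    stash = tempfile.mkdtemp(dir=parent)  # sibling of top, so moves are renames
    for rel in keep:
        dest = os.path.join(stash, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(os.path.join(top, rel), dest)
    shutil.copystat(top, stash)  # carry over the mode and timestamps of top
    shutil.rmtree(top)           # delete what's left
    shutil.move(stash, top)      # the stash becomes the new top
```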

Lars Wirzenius
A: 

A one-liner to solve the problem:

find . |grep -v "test/icecream/cupcake/"| grep -v "test/mtndew/livewire/"|xargs rm -r

Removed since it does not work

This might get you into trouble if you have file names with spaces in them, and it might keep more files than you want if there are other trees that match the patterns.

A somewhat better solution:

find . |sed "s/.*/rm '&'/"|grep -v "rm './test/icecream/cupcake/"| grep -v "rm './test/mtndew/livewire/"|sh

Not actually tested, if it breaks you get to keep both parts.

Edit: As Dennis points out, it's not only two parts that it breaks into :-) Corrected the typos in the second example and removed the first.

Mr Shark
In your first example, it will rm -r test/icecream which will include cupcake (for example). So, even though you've grep -v the cupcakes, they still get "eaten". The second example has a couple of typos and unbalanced single quotes. There should be a space after sed and the zero should be an ampersand.
Dennis Williamson
You still have unbalanced single quotes around the paths. For example, it should be "rm './test/icecream/cupcake/'"
Dennis Williamson
+1  A: 

This command will leave only the desired files in their original directories:

find test \( ! -path "test/mtndew/livewire/*" ! -path "test/icecream/cupcake/*" \) -delete

No need for cpio. It works on Ubuntu, Debian 5, and Mac OS X.

On Linux, it will report that it cannot delete non-empty directories, which is exactly the desired result. On Mac OS X, it will quietly do the right thing.

Patrick Webster
This finds test and deletes it, thus deleting the directories that you're trying to save.
Dennis Williamson
This finds test and does NOT delete it, because it is non-empty. Are there any modern flavors of Linux or Unix where this command fails to work?
Patrick Webster
Sorry, I made a mistake in setting up the test on my system. Your example works correctly for me.
Dennis Williamson
@Dennis Williamson.... and then Patrick left SO and never came back again.
Yar
A: 

It works for me with find using two steps: first delete the files allowed, then their empty directories!

find -x -E ~/Desktop/test -not \( -type d -regex '.*/(cupcake|livewire)/*.*' -prune \) -print0 | xargs -0 ls -1 -dG 

# delete the files first

# Mac OS X 10.4 
find -x -E ~/Desktop/test -not \( -type d -regex '.*/(cupcake|livewire)/*.*' -prune \) -type f -exec /bin/rm -fv '{}' \; 

# Mac OS X 10.5 
find -x -E ~/Desktop/test -not \( -type d -regex '.*/(cupcake|livewire)/*.*' -prune \) -type f -exec /bin/rm -fv '{}' + 

# delete empty directories 
find -x ~/Desktop/test -type d -empty -delete
On Ubuntu, find doesn't have -x -E, so the equivalent would be: find ~/Desktop/test -xdev -regextype posix-extended -not \( -type d -regex '.*/(cupcake|livewire)/*.*' -prune \) -print0 | xargs -0 ls -1 -dG
Dennis Williamson
Also, it might be necessary in some cases to protect empty directories that you want to keep using the same regex in the find -empty command that's in the last step in your answer. You might want to clarify that your first command is a test for proof of concept and otherwise not part of the two steps that you are demonstrating.
Dennis Williamson
A: 

Like others I have used os.walk and os.path.join to build the list of files to delete, with fnmatch.fnmatch to select files that must be included or excluded:

import fnmatch
import os

# args and options are presumably what optparse's parse_args() returned.

#-------------------------------#
# make list of files to display #
#-------------------------------#
displayList = []
for imageDir in args:
    for root, dirs, files in os.walk(imageDir):
        for filename in files:
            pathname = os.path.join(root, filename)
            if fnmatch.fnmatch(pathname, options.includePattern):
                displayList.append(pathname)

#----# now filter out excluded patterns #----#
try:
    if len(options.excludePattern) > 0:
        for pattern in options.excludePattern:
            displayList = [pathname for pathname in displayList
                           if not fnmatch.fnmatch(pathname, pattern)]
except (AttributeError, TypeError):
    pass

If fnmatch isn't enough, you can use the re module to test patterns.

Here I have built the file list before I do anything with it, but you could process the files as they are generated.

The try/except block is there in case my options class instance doesn't have an exclude pattern, or in case the pattern causes an exception in fnmatch because it is the wrong type.

A limitation of this method is that it first includes files matching a pattern, then excludes. If you need more flexibility than this (include matching pattern a, but not pattern b unless pattern c...) well, then the fragment above isn't up to it. In fact, working through this exercise, you start to see why the find command syntax is the way it is. Seems clunky, but in fact it is exactly the way to do this.

But if you generate a list, you can filter it according to whatever inclusion/exclusion rules you need.

One nice thing about generating a list is you can check it before you go ahead with the deletion. This is sort of a '--dryrun' option. You can do this interactively in the python interpreter, print the list to see how it looks, apply the next filter, see if it has removed too much or too little and so on.
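
A minimal sketch of that dry-run workflow, adapted to the question's "keep" case, might look like this. The names list_deletions, delete_listed and keep_patterns are made up for the example, and note that fnmatch's * also matches across "/" characters.

```python
import fnmatch
import os

def list_deletions(top, keep_patterns):
    """Return the files under top whose path matches none of keep_patterns."""
    doomed = []
    for root, dirs, files in os.walk(top):
        for name in files:
            pathname = os.path.join(root, name)
            if not any(fnmatch.fnmatch(pathname, p) for p in keep_patterns):
                doomed.append(pathname)
    return sorted(doomed)

def delete_listed(paths, dry_run=True):
    """Print each path; actually remove it only when dry_run is False."""
    for pathname in paths:
        print(("would remove " if dry_run else "removing ") + pathname)
        if not dry_run:
            os.remove(pathname)
```

You would first inspect the output of delete_listed(list_deletions('test', ['*/icecream/cupcake/*', '*/mtndew/livewire/*'])), and only then repeat the call with dry_run=False.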

mataap
A: 
find /path/to/test/ -depth -mindepth 1 \
! -path "/path/to/test/icecream/cupcake/*" \
! -path "/path/to/test/icecream/cupcake" \
! -path "/path/to/test/icecream" \
! -path "/path/to/test/mtndew/livewire/*" \
! -path "/path/to/test/mtndew/livewire" \
! -path "/path/to/test/mtndew" \
-delete -print

It's a bit tedious to write all the paths to preserve, but this is the only way to use find alone.

Marco Cosentino