views:

890

answers:

8

Hi,

I have a folder with thousands of images. I want to delete every other image. What is the most effective way to do this? Going through each one with i%2==0 is still O(n). Is there a fast way to do this (preferably in Python)?

Thx

+19  A: 

To delete half the N images you cannot be faster than O(N)! You do know that the O() notation means (among other things) that constant multiplicative factors are irrelevant, yes?

Alex Martelli
careful with that exclamation point there... I'm dyslexic and initially read it as "cannot be faster than O(N!)" which is obviously wrong... One should never get excited when doing math. you run the risk of exponential growth B-)
Brian Postow
^^ "One should never get excited when doing math" lol
wilhelmtell
+2  A: 

I fail to see any conceivable way in which deleting n/2 files could be faster than O(n), unless the filesystem has some special feature for deleting large numbers of files (but I don't think that actually exists in practice, if it's even possible)

David Zaslavsky
+1  A: 

If you wanted to delete Log(n) files, there would be... You can store images in a database, though ( MySQL has a "blob" type, among several others, that will store your images). Then you could do it in O(1) if you named them smartly.

/edit i hate how i have to use shorthand and bad grammar to get my answers in quickly!!!

if you're looking for a python equivalent of rm -rf *2.img *4.img *6.img *8.img *0.img, know that the computer still has to go through the entire list of files

ryansstack
+3  A: 

Going through each one with i%2==0 is still O(n). Is there a fast way to do this (preferably in Python)?

The only way to be faster than O(n) is if your files are already sorted, and you only want to delete 1 file.

You said i%2==0, this means you are deleting every "even" file. O(n/2) is still O(n)

Unknown
A: 

"Going through each one with i%2==0 is still O(n)"

Increment by 2 instead of incrementing by 1?

for(i = 0; i < numFiles; i += 2) {
  deleteFile(files[i]);
}

Seriously though: iterating through a list of files probably isn't the slowest part of your file deletion algo. The actual deletion likely takes several orders of magnitude more time.

Frank Farmer
+1  A: 

You could use islice from the itertools module. Here goes your example:

import os, itertools
dirContent = os.listdir('/some/dir/with/files')
toBeDeleted = itertools.islice(dirContent, 0, len(dirContent), 2)
# Now remove the files
[os.unlink(file) for file in toBeDeleted]

This is another form of doing what you want, although I'm not sure if it'll be faster. Hope this helps.

Sergio
+11  A: 
import os
l = os.listdir('/some/dir/with/files')

for n in l[::2]:
    os.unlink(n)
mtasic
A: 

I would try to use something operating-system specific like:

linux:

@files = grep { -f "$dir/$_" && /*.H$/ }
unlink @files

Win:

$file_delete =~ /H$/;
rm $file_delete

to see if your os can do it faster than iterating in python.

use os.system(...) or subprocess.call(...) to run these from python.

bpowah
Those are still O(n).
akappa