I'm trying to find out whether an element exists in a Django model. I think that should be very easy to do, but I couldn't find an elegant way in the Making queries section of the Django documentation.

The problem is that I have thousands of screenshots in a directory and need to check whether they are in the database that is supposed to store them. So I'm iterating over the filenames and want to see, for each of them, whether a corresponding element exists. With a model called Screenshot, the only way I could come up with is:

filenames = os.listdir(settings.SCREENSHOTS_ON_DISC)
for filename in filenames:
    exists = Screenshot.objects.filter(filename=filename)
    if exists:
        ...

Is there a nicer/faster way to do this? Note that a screenshot can be in the database more than once (which is why I didn't use .get).

+1  A: 

You could try:

Screenshot.objects.filter(filename__in = filenames)

That will give you a QuerySet of all the screenshots you do have, fetched with a single query. You could then compare it against your list of filenames to see which ones are missing from the database. That should get you started, but you might want to tweak the query for performance/use.
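One possible tweak, sketched in plain Python: with thousands of filenames the IN clause gets large, and some database backends limit the number of parameters in a single query, so the lookup can be done in batches. The helper names chunked and find_existing, the batch size, and fake_lookup are assumptions for illustration; fake_lookup only stands in for the database so the snippet runs on its own.

```python
def chunked(items, size):
    """Yield successive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def find_existing(filenames, lookup, batch_size=500):
    """Collect the filenames that `lookup` reports as present, one batch at a time."""
    found = set()  # a set collapses screenshots stored more than once
    for batch in chunked(filenames, batch_size):
        found.update(lookup(batch))
    return found


# Stand-in for the database so the sketch runs without Django.
# With Django, `lookup` could instead be something like:
#   lambda batch: Screenshot.objects.filter(filename__in=batch)
#                                   .values_list('filename', flat=True)
rows_in_db = ["a.png", "a.png", "c.png"]


def fake_lookup(batch):
    wanted = set(batch)
    return [name for name in rows_in_db if name in wanted]


filenames = ["a.png", "b.png", "c.png"]
existing = find_existing(filenames, fake_lookup, batch_size=2)
missing = [name for name in filenames if name not in existing]
print(missing)  # ['b.png']
```

Whether batching is needed at all depends on your database and on how many files you have; for a few thousand names a single query may be fine.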

Alex Jillard
+2  A: 

If your Screenshot model has a lot of attributes, then the code you showed is doing unnecessary work for your specific need. For example, you can do something like this:

files_in_db = Screenshot.objects.values_list('filename', flat=True).distinct()

which will give you all the distinct filenames in the database, and generate SQL that fetches only the filename column. It won't create and populate Screenshot objects. If you have

files_on_disc = os.listdir(settings.SCREENSHOTS_ON_DISC)

then you can iterate over one list looking for membership in the other, or make one or both lists into sets to find common members etc.
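The set approach could look roughly like this. It is a minimal sketch: compare_filenames is a hypothetical helper name, and the two lists are hard-coded here in place of files_on_disc and files_in_db so the snippet runs without a database.

```python
def compare_filenames(files_on_disc, files_in_db):
    """Split filenames into those on both sides, on disc only, and in the DB only."""
    on_disc = set(files_on_disc)
    in_db = set(files_in_db)  # duplicates in the database collapse here
    return {
        "both": on_disc & in_db,       # present on disc and in the database
        "disc_only": on_disc - in_db,  # on disc but never stored
        "db_only": in_db - on_disc,    # stored but no longer on disc
    }


result = compare_filenames(
    ["a.png", "b.png"],           # e.g. os.listdir(...)
    ["a.png", "a.png", "z.png"],  # e.g. the values_list(...) result
)
print(sorted(result["disc_only"]))  # ['b.png']
```

Set membership tests are constant time on average, so this avoids the quadratic cost of checking every name in one list against the other list.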

Vinay Sajip
+1  A: 

This query gets you all the files that are in your database and filesystem:

discfiles = os.listdir(settings.SCREENSHOTS_ON_DISC)

filenames = (Screenshot.objects.filter(filename__in=discfiles)
                               .values_list('filename', flat=True)
                               .order_by('filename')
                               .distinct())

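Note the order_by. If you have an ordering specified in your model definition, the ordering fields are added to the selected columns, so distinct may not return what you expect. This is documented in the note on distinct() in the Django QuerySet API reference.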

So make the ordering explicit, then execute the query.

ars