tags:

views:

118

answers:

4

I have a tree like structure created out of models using ForeignKey linkages. As an example:

Model Person:  
   name = CharField  

Model Book:  
   name = CharField  
   author = FK(Person)  

Model Movie:  
   name = CharField  
   director = FK(Person)  

Model Album:  
   name = CharField  
   director = FK(Person)  

Model Chapter:  
   name = CharField  
   book = FK(Book)  

Model Scene:  
   name = CharField  
   movie = FK(Movie)  

Model Song:  
   name = CharField  
   album = FK(Album)

The caveat here is the real structure is both deeper and broader, and a node might have multiple non-FK fields (ie not just 'name').

What I'd like to do is have a search function such that there is a string supplied which will return any Person that either matches a Person object itself, or a field in any of the child nodes. IOW, if "beat it" is the string, and the name of a song associated with an album associated w/ the person matches, the person will be returned.

What I've done so far is the following:

For any leaf node, have a Manager object w/ a search method which does something like:

return Song.objects.filter(name__icontains=search_string)

Then for the root node (Person) and any interior nodes, there is also a Manager object w/ a search() method which looks something like:

class AlbumManager(models.Model):  
    def search(self, search_string):  
        from_songs = Album.objects.filter(song__in=Song.objects.search(search_string))  
        return Album.objects.filter(name__icontains=search_string)|from_songs

As you might imagine, once you get to the root node, this unleashes a massive number of queries and is really inefficient. I'd be pretty surprised if there wasn't a better way to do this ... one idea is to just have one search() method at the top of the tree that manually searches through everything, but a) that seems very messy (albeit probably more efficient) and b) it would be nice to be able to search individual nodes arbitrarily.

So with all of this said, what would be a more efficient method of getting where I want to be instead of my bonehead method here?

+1  A: 

Perhaps it would be better to outsource the search functionality to a dedicated tool like Woosh or Sphinx. There are projects for both search frameworks that integrate them with Django (django-haystack and django-sphinx, respectively). You will get much better performance than with complicated SQL queries with many joins and subqueries.

piquadrat
geoffjentry
+1  A: 

Django Solr is another option.

tosh
+1  A: 

I don't think anything you can do with the basic ORM is going to be truly efficient, you are basically going to be running a series of filter() queries on different models with a hell of a lot of JOINs.

If you were aiming at high performance, long term solutions, your best bet would be to create an object repository in memory. I don't see any way in basic SQL you could make those queries efficient, especially if you are working with a ton of rows.

Creating a denormalized table for your child objects (chapter, song, etc) that contains all your Person and Book/Album data might be a potential mid-term solution, and that will require some fancy SQL footwork that is probably not part of Django Core.

Jasconius
I thought about the denormalization, but unfortunately I don't really have the ability to go monkeying around with the table structure in this part of the DB. Figured there weren't going to be great options with basic Django ORM though.
geoffjentry
Yeah, Denormalization is foreign to Django, in fact for as easy as the ORM makes things, there are some ultra-large scale performance issues with it, especially if you want to do complex search. Look up Cal Henderson's keynote at DjangoCon, he talks about some of these things. It's a good listen for anyone using Django.
Jasconius
Thanks for the pointer. For me, the ORM is a net positive for me - the DB itself is sub-optimal for the ORM but overall traffic is so low that it balances out (and in a few spots I don't use the ORM) ... given how much easier it makes my life I'll take it though. :) This just happens to be one of those gotcha spots.
geoffjentry
+1  A: 

Well, if you don't want to set-up a search engine like whoosh then you could use Q

http://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q-objects

This would involve hardcoding the fields a bit and not simply going through all of them.

>>> q = Q()
>>> q = q | Q( fk__fk__field__icontains = searchTerm )
>>> q = q | ...
...
>>> qs = Model.objects.filter(q)
skyl
That's basically what I am doing. Once it started getting really deep, that's when I went to encapsulating a mini-search at every level though, go as you move towards the root of the tree the Q calls are involving other search() methods. Just doing Q at the root level and going into the fk's manually is faster, but still pretty rough.
geoffjentry