views:

40

answers:

3

I searched all over place for an answer to this but couldn't find anything. Perhaps this is just a stupid question or a really tricky one. Here it is:

Let's say my model is this (pseudo django code):

Event
  type = ForeignKey(EventType)
  name = CharField
  date_start = DateField
  date_end = DateField

EventType
  name = CharField

What I want to know is the average duration time for each event type. What I do now is calculate the average duration whenever a new event is created (save method) and have that stored in an average_duration column in EventType. The problem with this approach is that I cannot answer questions like "what was the average duration time for events of type X, during the year Y". So instead of adding more columns to answer questions like these I would prefer to have it done in "real-time".

Can this be done by annotating the queryset? First I would have to get the date differences for each event type, then come up with their average, and then annotate the Event queryset with that average, I assume.

+1  A: 

I think your best bet is to create an SQL view with the date_end - date_start column, create a django model on this view and then you will be able to query the view and annotate it as you want. I've done this with models similars to yours and maybe I could extract some interesting code for you if you need.

Ghislain Leveque
Yes if you could provide some code examples it would be great, thanks
snz3
+1  A: 

You'll need to create a queryset with the extra method to add the date difference to each row

Then use the aggregate method to compute the average for your just added column:

Be careful though, this method is slow and won't scale. Storing the computed value on event_type is imho your best option.

Guillaume Esquevin
+2  A: 

I would suggest that you store the event duration as a column :

event_duration = models.IntegerField()
...

def __init__(self, *args, **kwargs):
    super(Event, self).__init__(*args, **kwargs)
    self.update_event_duration()


def save(self, **kwargs):
    self.update_event_duration()
    super(Event, self).save(*args, **kwargs)

Then, you can use this library : http://code.google.com/p/django-cube/ to calculate the average on several different dimensions (year, type, or other) :

>>> def my_avg(queryset):
...    return queryset.aggregate(Avg("event_duration"))["event_duration__avg"]

>>> c = Cube(["date_start__year", "type"], Event.objects.all(), my_avg)

Use the cube like this :

>>> c.measure(date_start__year=1999, type=event_type2)
123.456

Or you can get all the averages on all years :

>>> c.measure_dict("date_start__year")
{1984: {'measure': 111.789}, 1985: {'measure': 234.666}, ...}

Or by year/event type :

>>> c.measure_dict("date_start__year", "type")
{
    1984: {eventtype1: {'measure': 111.789}, eventtype2: {'measure': 234.666}, ...},
    1985: {eventtype1: {'measure': 122.79}, eventtype2: {'measure': 233.444}, ...},
    ...
}
sebpiq