I am using the following model with Django:

class Hit(Model):
    image = ForeignKey(Image)
    user = ForeignKey(User)
    timestamp = DateTimeField(auto_now_add = True)

What I need is a list containing, for every user, the count of "first hits" (i.e. hits with no earlier timestamp for the same image), to build a kind of rank list.

Or, simpler still, just a list that contains a user name once for every "first hit" that user has made.

In SQL using the PostgreSQL "DISTINCT ON" extension, this would be a simple query like:

SELECT DISTINCT ON (image_id) user_id FROM proj_hit ORDER BY image_id ASC, timestamp ASC;

Is there a way to get this result with Django's ORM or (at least) portable SQL, i.e. without PostgreSQL extensions?

A: 

I'm pretty sure that the portable SQL version is very similar to the one you posted. Simply drop the ON:

SELECT DISTINCT image_id, user_id FROM proj_hit ORDER BY image_id ASC, timestamp ASC;
Ben Hodgson
Well, this works in SQLite, but not in PostgreSQL: "ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list". Django does the same thing: if you call distinct() on a query set that has already been order_by()ed, the ordering columns are automatically added to the distinct set. For me, this means that I would get "all hits" rather than "first hits", because the timestamps differ between hits and so distinct treats them as different rows.
ChrisM
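For the portable-SQL half of the question, one option that avoids DISTINCT altogether is a correlated NOT EXISTS subquery: keep a hit only if no earlier hit exists for the same image, then count per user. The sketch below is not from the thread. It runs the query against an in-memory SQLite database using Python's standard sqlite3 module; the table and column names follow the question, while the sample rows and the function names first_hit_ranking and demo are made up for illustration. Note one assumption: if two hits on the same image share exactly the same timestamp, both count as first hits.

```python
import sqlite3

# Keep a hit only if no earlier hit exists for the same image, then count
# the surviving rows per user. No DISTINCT ON or other vendor extension.
FIRST_HITS_SQL = """
SELECT h.user_id, COUNT(*) AS first_hits
FROM proj_hit h
WHERE NOT EXISTS (
    SELECT 1
    FROM proj_hit earlier
    WHERE earlier.image_id = h.image_id
      AND earlier.timestamp < h.timestamp
)
GROUP BY h.user_id
ORDER BY first_hits DESC, h.user_id ASC
"""


def first_hit_ranking(conn):
    """Return [(user_id, first_hit_count), ...], highest count first."""
    return conn.execute(FIRST_HITS_SQL).fetchall()


def demo():
    """Build a tiny illustrative data set and return the ranking."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE proj_hit ("
        " id INTEGER PRIMARY KEY,"
        " image_id INTEGER NOT NULL,"
        " user_id INTEGER NOT NULL,"
        " timestamp TEXT NOT NULL)"
    )
    # Hypothetical sample rows: (image_id, user_id, timestamp).
    rows = [
        (1, 10, "2010-01-01 10:00:00"),  # first hit on image 1
        (1, 20, "2010-01-01 11:00:00"),
        (2, 20, "2010-01-02 09:00:00"),  # first hit on image 2
        (2, 10, "2010-01-02 09:30:00"),
        (3, 10, "2010-01-03 08:00:00"),  # first hit on image 3
    ]
    conn.executemany(
        "INSERT INTO proj_hit (image_id, user_id, timestamp) VALUES (?, ?, ?)",
        rows,
    )
    return first_hit_ranking(conn)


if __name__ == "__main__":
    print(demo())
```

In a Django project the same statement could presumably be run through a raw query, at the cost of bypassing the ORM. On large tables an index on (image_id, timestamp) would likely help the subquery.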
+3  A: 

Are you at liberty to make a change to your model? It would help to de-normalize and store the first hit information in the same model or as part of a different model.

For example:

class Hit(Model):
    image = ForeignKey(Image)
    user = ForeignKey(User)
    timestamp = DateTimeField(auto_now_add = True)
    is_first_hit = BooleanField(default = False)

You can then override the save() method (or hook into a signal) to set is_first_hit explicitly on save. This makes inserts and updates a little more expensive but makes querying very easy.
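To illustrate the write-time check, here is a minimal sketch that uses Python's standard sqlite3 module rather than Django, so that it runs on its own. The helper names create_schema, record_hit and ranking are hypothetical. It assumes hits are inserted in chronological order (which auto_now_add implies), so "no hit exists yet for this image" is equivalent to "this is the first hit". In a save() override the corresponding check would presumably be something like not Hit.objects.filter(image=self.image).exists(). Concurrent writers are not handled here and would need extra care.

```python
import sqlite3


def create_schema(conn):
    # Mirrors the Hit model above, including the denormalized flag.
    conn.execute(
        "CREATE TABLE proj_hit ("
        " id INTEGER PRIMARY KEY,"
        " image_id INTEGER NOT NULL,"
        " user_id INTEGER NOT NULL,"
        " timestamp TEXT NOT NULL,"
        " is_first_hit INTEGER NOT NULL DEFAULT 0)"
    )


def record_hit(conn, image_id, user_id, timestamp):
    """Insert a hit and flag it if the image has no hits yet."""
    already_hit = conn.execute(
        "SELECT 1 FROM proj_hit WHERE image_id = ? LIMIT 1", (image_id,)
    ).fetchone()
    is_first = already_hit is None
    with conn:  # commits the insert, or rolls it back on error
        conn.execute(
            "INSERT INTO proj_hit (image_id, user_id, timestamp, is_first_hit)"
            " VALUES (?, ?, ?, ?)",
            (image_id, user_id, timestamp, int(is_first)),
        )
    return is_first


def ranking(conn):
    """Return [(user_id, first_hit_count), ...], highest count first."""
    return conn.execute(
        "SELECT user_id, COUNT(*) AS first_hits FROM proj_hit"
        " WHERE is_first_hit = 1"
        " GROUP BY user_id"
        " ORDER BY first_hits DESC, user_id ASC"
    ).fetchall()
```

With the flag stored, the rank list reduces to a plain filter and count, which is the kind of query the ORM can express without DISTINCT.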

Manoj Govindan
Surprisingly, I hadn't thought of that, but it's a really good idea. The additional BooleanField won't hurt, and neither will denormalizing, because my app is the only one editing the database and I can still write a consistency-checking script if I want. Thank you!
ChrisM