ansaurus

Question

joining latest of various usermetadata tags to user rows

Answer 1

+1 A:

I suppose you're not willing to modify your schema, so I'm afraid my answer might not be of much help, but here goes...

One possible solution would be to leave the time field empty until the entry is replaced by a newer value, at which point you fill in the 'deprecation date' instead. Another way is to add an 'active' column to the table, but that would introduce some redundancy.

The classic solution would be to have both 'Valid-From' and 'Valid-To' fields, where 'Valid-To' stays blank until some other entry becomes valid. This can be handled easily by using triggers or similar. Using constraints to make sure there is only one valid item of each type will ensure data integrity.

Common to these is that there is a single way of determining the set of current fields. You'd simply select all entries for the user in question that have a NULL 'Valid-To' or 'deprecation date', or a true 'active'.
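A minimal sketch of the 'Valid-From'/'Valid-To' variant, using Python's sqlite3 module so it can be run as-is. The table and column names (metatable, userid, code, content) are borrowed from elsewhere in this thread; valid_from, valid_to, the index name and the set_tag helper are hypothetical. It assumes a database with partial unique indexes (SQLite and PostgreSQL both have them) and closes the old row in application code rather than in a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metatable (
    userid     INTEGER NOT NULL,
    code       TEXT    NOT NULL,
    content    TEXT    NOT NULL,
    valid_from TEXT    NOT NULL,
    valid_to   TEXT              -- NULL while the row is current
);
-- At most one current (valid_to IS NULL) row per user and code.
CREATE UNIQUE INDEX one_current_per_code
    ON metatable (userid, code) WHERE valid_to IS NULL;
""")

def set_tag(userid, code, content, now):
    """Close the current row for (userid, code), then insert the new one."""
    with conn:  # both statements run in one transaction
        conn.execute(
            "UPDATE metatable SET valid_to = ? "
            "WHERE userid = ? AND code = ? AND valid_to IS NULL",
            (now, userid, code))
        conn.execute(
            "INSERT INTO metatable (userid, code, content, valid_from) "
            "VALUES (?, ?, ?, ?)",
            (userid, code, content, now))

set_tag(15, "color", "red", "2008-08-01")
set_tag(15, "color", "blue", "2008-08-20")
set_tag(15, "city", "Oslo", "2008-08-10")

# The current set is simply every row whose valid_to is still NULL.
current = conn.execute(
    "SELECT code, content FROM metatable "
    "WHERE userid = ? AND valid_to IS NULL ORDER BY code",
    (15,)).fetchall()
print(current)
```

With the partial unique index in place, inserting a second current row for the same user and code fails with an integrity error instead of silently producing two "latest" values.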

You might be interested in taking a look at the Wikipedia entry on temporal databases and the article A consensus glossary of temporal database concepts.

Henrik Gustafsson 2008-08-24 18:11:02

Answer 2

A:

@henrik-gustafsson I didn't want to do an active flag because I'd have to make sure (programmatically) that there was only one active row for each code. But I could just put a trigger in, so that I only insert a new row and the DB handles marking it active and marking all other rows with the same code and userid inactive. Then I can select just the active tags and know I'll get one per code (unless someone/something evil has manually mucked with the active flags).
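A rough sketch of that trigger idea, written against SQLite through Python's sqlite3 module so it is runnable. The id column, the active column default and the trigger name are assumptions made for the example, and trigger syntax differs between databases (PostgreSQL, for instance, needs a trigger function).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metatable (
    id      INTEGER PRIMARY KEY,         -- hypothetical surrogate key
    userid  INTEGER NOT NULL,
    code    TEXT    NOT NULL,
    content TEXT    NOT NULL,
    active  INTEGER NOT NULL DEFAULT 1   -- new rows start out active
);
-- After each insert, deactivate older rows with the same userid and code.
CREATE TRIGGER deactivate_older_tags
AFTER INSERT ON metatable
BEGIN
    UPDATE metatable
       SET active = 0
     WHERE userid = NEW.userid
       AND code   = NEW.code
       AND id    <> NEW.id
       AND active = 1;
END;
""")

rows = [
    (15, "color", "red"),
    (15, "city", "Oslo"),
    (15, "color", "blue"),   # replaces "red" for user 15
    (16, "color", "green"),
]
conn.executemany(
    "INSERT INTO metatable (userid, code, content) VALUES (?, ?, ?)", rows)

# Selecting active rows now yields one row per (userid, code).
active_tags = conn.execute(
    "SELECT userid, code, content FROM metatable "
    "WHERE active = 1 ORDER BY userid, code").fetchall()
print(active_tags)
```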

adambox 2008-08-25 01:35:22

Answer 3

+4 A:

This is actually not that hard to do in PostgreSQL because it has the "DISTINCT ON" clause in its SELECT syntax (DISTINCT ON isn't standard SQL).

SELECT DISTINCT ON (code) code, content, createtime
FROM metatable
WHERE userid = 15
ORDER BY code, createtime DESC;

That will limit the returned results to the first result per unique code, and if you sort the results by the create time descending, you'll get the newest of each.

Neall 2008-08-26 00:29:36

Answer 4

A:

@neall, will that work when I want the latest of each code for all 1000 users? And how do I know how it decides which row of a given code to display? With GROUP BY, you have to resolve that ambiguity by using only aggregate functions. Is the ORDER BY guaranteed to happen before the DISTINCT ON?

adambox 2008-08-27 14:08:26

Yes, the ORDER BY happens first, so you always get the record for each code with the newest createtime.
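For the all-users case, the PostgreSQL query above can be widened to DISTINCT ON (userid, code) with ORDER BY userid, code, createtime DESC. Since DISTINCT ON is PostgreSQL-only, the sketch below swaps in a different technique, a ROW_NUMBER() window function, to show the same "newest row per user and code" result in a form that runs under Python's sqlite3 module. It assumes SQLite 3.25 or newer (window function support); the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metatable "
    "(userid INTEGER, code TEXT, content TEXT, createtime TEXT)")
conn.executemany("INSERT INTO metatable VALUES (?, ?, ?, ?)", [
    (15, "color", "red",   "2008-08-01"),
    (15, "color", "blue",  "2008-08-20"),
    (15, "city",  "Oslo",  "2008-08-10"),
    (16, "color", "green", "2008-08-05"),
    (16, "color", "black", "2008-08-02"),
])

# Number the rows of each (userid, code) group from newest to oldest,
# then keep only the newest one (rn = 1).
latest = conn.execute("""
    SELECT userid, code, content, createtime
    FROM (
        SELECT m.*,
               ROW_NUMBER() OVER (
                   PARTITION BY userid, code
                   ORDER BY createtime DESC
               ) AS rn
        FROM metatable AS m
    )
    WHERE rn = 1
    ORDER BY userid, code
""").fetchall()
print(latest)
```

As with DISTINCT ON, the ordering inside each group is what decides which row survives, so ties on createtime would need an extra tie-breaker column in the ORDER BY.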

Neall 2008-09-26 12:52:40

Answer 5

A:

A subselect is the standard way of doing this sort of thing. You just need a Unique Constraint on UserId, Code, and Date - and then you can run the following:

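SELECT *
FROM Table  -- Table is a placeholder for the real table name
JOIN (
   -- newest Date for each user and code
   SELECT UserId, Code, MAX(Date) as LastDate
   FROM Table
   GROUP BY UserId, Code
) as Latest ON
   Table.UserId = Latest.UserId
   AND Table.Code = Latest.Code
   AND Table.Date = Latest.LastDate  -- the subselect exposes LastDate, not Date
WHERE
   Table.UserId = @userId  -- qualified, since both sides have a UserId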

Mark Brackett 2008-08-27 14:42:34

Answer 6

A:

@mark-brackett, that's pretty much how I'm doing it now, and the query takes 10 seconds or more. The problem is that it doesn't scale with our growing userbase. I need something that doesn't take time proportional to the size of the userbase.

adambox 2008-08-27 19:16:02
