The problem I've encountered is selecting rows from a database where two columns in each row match specific pairs of values, i.e. selecting rows where id = 1 AND type = 'news'. With a single pair this would be easy, but we are selecting rows based on hundreds of pairs. I feel there must be some way to do this without looping through the pairs and querying each one individually. I'm hoping some SQL stackers can provide guidance.

Here's a full code breakdown:

Let's imagine that I have the following dataset, where history_id is the primary key. I simplified the dates a bit for ease of reading.

table: history
history_id  id  type  user_id  date
1           1   news  1        5/1
2           1   news  1        5/1
3           1   photo 1        5/2
4           3   news  1        5/3
5           4   news  1        5/3
6           1   news  1        5/4
7           2   photo 1        5/4
8           2   photo 1        5/5

If the user wants to select rows based on a date range, we take a subset of that data:

SELECT history_id, id, type, user_id, date FROM history WHERE date BETWEEN '5/3' AND '5/5'

That returns the following dataset:

history_id  id  type  user_id  date
4           3   news  1        5/3
5           4   news  1        5/3
6           1   news  1        5/4
7           2   photo 1        5/4
8           2   photo 1        5/5

Now, using that subset, I need to determine how many of those entries are the first entry in the database for each (id, type) pairing. For example, is row 4 the first time that id: 3, type: news appears in the database? So I use a WITH ... MIN() query.

In real code the two lists are generated programmatically from the result set of the previous query; here I spelled them out for ease of reading.

WITH previous AS (
  SELECT history_id, id, type FROM history WHERE id IN (1,2,3,4) AND type IN ('news','photo')
) SELECT min(history_id) as history_id, id, type FROM previous GROUP BY id, type

The previous CTE matches the following rows (user_id and date included for reference):

history_id  id  type  user_id  date
1           1   news  1        5/1
2           1   news  1        5/1
3           1   photo 1        5/2
4           3   news  1        5/3
5           4   news  1        5/3
6           1   news  1        5/4
7           2   photo 1        5/4
8           2   photo 1        5/5

You'll notice it's the entire original dataset, because id and type are matched independently against two lists rather than together as pairs. A combination such as (1, photo), which is not in the date-range subset, gets matched as well.
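To make the difference concrete, here is a minimal sketch using Python's sqlite3 module with an in-memory database. Both are assumptions for illustration only, since the question does not say which database or host language is in use. It loads the sample rows, runs the two-list query, and then a variant that joins against an explicit list of pairs built from placeholders. The helper name first_rows_for_pairs is hypothetical.

```python
import sqlite3

# Sample rows from the question: (history_id, id, type, user_id, date).
ROWS = [
    (1, 1, "news", 1, "5/1"), (2, 1, "news", 1, "5/1"),
    (3, 1, "photo", 1, "5/2"), (4, 3, "news", 1, "5/3"),
    (5, 4, "news", 1, "5/3"), (6, 1, "news", 1, "5/4"),
    (7, 2, "photo", 1, "5/4"), (8, 2, "photo", 1, "5/5"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (history_id INTEGER PRIMARY KEY, id INTEGER,"
    " type TEXT, user_id INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?, ?)", ROWS)

# Two independent lists: any listed id may combine with any listed type.
independent = conn.execute(
    "SELECT MIN(history_id) AS first_id, id, type FROM history"
    " WHERE id IN (1, 2, 3, 4) AND type IN ('news', 'photo')"
    " GROUP BY id, type ORDER BY first_id"
).fetchall()


def first_rows_for_pairs(conn, pairs):
    """Return (first history_id, id, type) for each requested (id, type) pair.

    Assumes `pairs` is non-empty; an empty VALUES list is not valid SQL.
    """
    values = ", ".join("(?, ?)" for _ in pairs)       # one "(?, ?)" per pair
    params = [value for pair in pairs for value in pair]
    sql = (
        f"WITH pairs(id, type) AS (VALUES {values})"
        " SELECT MIN(h.history_id) AS first_id, h.id, h.type"
        " FROM history AS h"
        " JOIN pairs AS p ON p.id = h.id AND p.type = h.type"
        " GROUP BY h.id, h.type ORDER BY first_id"
    )
    return conn.execute(sql, params).fetchall()


pairs = [(1, "news"), (3, "news"), (4, "news"), (2, "photo")]
paired = first_rows_for_pairs(conn, pairs)

print(independent)  # includes (3, 1, 'photo'), a pair that was never requested
print(paired)       # [(1, 1, 'news'), (4, 3, 'news'), (5, 4, 'news'), (7, 2, 'photo')]
```

Databases limit the number of bound parameters per statement, so for very long pair lists loading the pairs into a temporary table and joining against it may be a better fit.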

The result I want is below, but I can't figure out the SQL to produce it.

history_id  id  type  user_id  date
1           1   news  1        5/1
4           3   news  1        5/3
5           4   news  1        5/3
7           2   photo 1        5/4
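One possible way to get this result in a single round trip is to let the database derive the pairs from the date range itself and then join back to find each pair's earliest row. The sketch below assumes Python's sqlite3 module and an in-memory SQLite database purely for illustration; the parameter names start_date and end_date are hypothetical, and the simplified date strings from the sample data only compare correctly because they are single-digit days in the same month.

```python
import sqlite3

# Sample rows from the question: (history_id, id, type, user_id, date).
ROWS = [
    (1, 1, "news", 1, "5/1"), (2, 1, "news", 1, "5/1"),
    (3, 1, "photo", 1, "5/2"), (4, 3, "news", 1, "5/3"),
    (5, 4, "news", 1, "5/3"), (6, 1, "news", 1, "5/4"),
    (7, 2, "photo", 1, "5/4"), (8, 2, "photo", 1, "5/5"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (history_id INTEGER PRIMARY KEY, id INTEGER,"
    " type TEXT, user_id INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?, ?)", ROWS)

FIRST_ROWS_SQL = """
SELECT h.history_id, h.id, h.type, h.user_id, h.date
FROM history AS h
JOIN (
    -- p: the distinct (id, type) pairs seen inside the date range.
    -- a: every history row for those pairs, regardless of date.
    SELECT MIN(a.history_id) AS first_id
    FROM (SELECT DISTINCT id, type
          FROM history
          WHERE date BETWEEN :start_date AND :end_date) AS p
    JOIN history AS a ON a.id = p.id AND a.type = p.type
    GROUP BY p.id, p.type
) AS f ON f.first_id = h.history_id
ORDER BY h.history_id
"""

first_rows = conn.execute(
    FIRST_ROWS_SQL, {"start_date": "5/3", "end_date": "5/5"}
).fetchall()

for row in first_rows:
    print(row)
# (1, 1, 'news', 1, '5/1')
# (4, 3, 'news', 1, '5/3')
# (5, 4, 'news', 1, '5/3')
# (7, 2, 'photo', 1, '5/4')
```

Because the pairs never leave the database, there is no list to generate in application code and no per-pair query.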

Obviously, I could loop through each pair and query the database to determine its first result, but that seems inefficient. I figured one of the SQL gurus on this site might be able to share some wisdom.

In case I'm approaching this incorrectly, the gist of the whole routine is that the database stores all creations and edits in the same table. I need to track each user's behavior and determine how many entries in the history table are edits or creations over a specific date range. Therefore I select all (id, type) pairs from the date range for a given user_id, and then for each pair I determine whether the user is responsible for the first entry in the database. If it is the first, it's a creation; otherwise it's an edit.
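The routine described above could also be sketched as one query that labels each of a user's rows in the date range as a creation or an edit, by comparing its history_id with the earliest history_id for the same (id, type) pair. This is only a sketch under assumptions: Python's sqlite3 module and an in-memory SQLite database stand in for the real setup, and the parameter names and the 'creation'/'edit' labels are made up for illustration.

```python
import sqlite3
from collections import Counter

# Sample rows from the question: (history_id, id, type, user_id, date).
ROWS = [
    (1, 1, "news", 1, "5/1"), (2, 1, "news", 1, "5/1"),
    (3, 1, "photo", 1, "5/2"), (4, 3, "news", 1, "5/3"),
    (5, 4, "news", 1, "5/3"), (6, 1, "news", 1, "5/4"),
    (7, 2, "photo", 1, "5/4"), (8, 2, "photo", 1, "5/5"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (history_id INTEGER PRIMARY KEY, id INTEGER,"
    " type TEXT, user_id INTEGER, date TEXT)"
)
conn.executemany("INSERT INTO history VALUES (?, ?, ?, ?, ?)", ROWS)

CLASSIFY_SQL = """
SELECT h.history_id,
       CASE WHEN h.history_id = f.first_id THEN 'creation' ELSE 'edit' END AS action
FROM history AS h
JOIN (
    -- Earliest entry for every (id, type) pair across the whole table.
    SELECT id, type, MIN(history_id) AS first_id
    FROM history
    GROUP BY id, type
) AS f ON f.id = h.id AND f.type = h.type
WHERE h.user_id = :user_id
  AND h.date BETWEEN :start_date AND :end_date
ORDER BY h.history_id
"""

actions = conn.execute(
    CLASSIFY_SQL, {"user_id": 1, "start_date": "5/3", "end_date": "5/5"}
).fetchall()
counts = Counter(action for _, action in actions)

print(actions)
# [(4, 'creation'), (5, 'creation'), (6, 'edit'), (7, 'creation'), (8, 'edit')]
print(dict(counts))
# {'creation': 3, 'edit': 2}
```

Row 6 is labeled an edit because (1, news) first appeared as row 1, outside the date range; the subquery deliberately ignores the range so that earlier entries still count as the first.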

Any assistance would be awesome.

A: 
SELECT * 
FROM HISTORY,
    (SELECT    MIN(date) AS min_date, id, type
     FROM      history
     WHERE     id IN (1,2,3,4) AND type IN ('news','photo')
     -- AND DATE BETWEEN xxx and YYY
     GROUP BY  id, type) min_dates
WHERE HISTORY.id     = min_dates.id
 AND  HISTORY.type   = min_dates.type
 AND  HISTORY.date   = min_dates.min_date

This is not tested as I don't have DB access at the moment, sorry

DVK
A: 
SELECT 
      h1.history_id,
      h1.id,
      h1.type,
      h1.user_id,
      h1.date
   FROM 
      ( select 
              h2.id, 
              h2.type,
              MIN( h2.history_id ) minHistory 
           from 
              history h2
           group by
              h2.id,
              h2.type ) ByType,
       history h1
   where 
      ByType.minHistory = h1.history_id

This queries the entire table, regardless of dates. However, you can add your WHERE criteria to the inner query ("from history h2") to limit the date range, ids, or types.

Since the inner query will be done first, and obviously returns fewer records, it will be used as the primary side of the join to the FULL history table. But since the join is based only on the single history_id, only that critical "first" record will be returned, as you are hoping to get.

DRapp
+1  A: 

I don't see the need for two queries... DVK got the idea right, though.

select id, type, MIN(date) as min_date
from history
where date between YOUR_START_DATE and YOUR_END_DATE
group by id, type
intnick