I have a table with the following structure:

id           -int(11)
event_id     -int(11)
photo_id     -int(11)
created_at   -datetime

How do I write a query that returns the 100 most recent rows while ensuring that there are no more than 4 consecutive rows with the same value in photo_id?
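
To make the requirement concrete, here is a minimal Python sketch of one reading of it: walk the rows from newest to oldest and skip any row that would extend a run of identical photo_id values beyond 4. The function name `cap_consecutive` and the (id, photo_id) tuple layout are illustrative assumptions, not part of the original table or question.

```python
def cap_consecutive(rows, max_run=4, limit=100):
    """Keep at most `max_run` consecutive rows with the same photo_id.

    `rows` is assumed to be an iterable of (id, photo_id) tuples that is
    already sorted from most recent to oldest (hypothetical layout).
    """
    kept = []
    run_photo = None   # photo_id of the run at the end of `kept`
    run_length = 0     # number of rows in that run
    for row in rows:
        photo_id = row[1]
        if kept and photo_id == run_photo:
            if run_length >= max_run:
                continue  # would exceed the allowed run length, skip it
            run_length += 1
        else:
            run_photo, run_length = photo_id, 1
        kept.append(row)
        if len(kept) == limit:
            break
    return kept


# Six rows for photo 7 in a row, then photo 8, then photo 7 again.
sample = [(10, 7), (9, 7), (8, 7), (7, 7), (6, 7), (5, 7), (4, 8), (3, 7)]
result = cap_consecutive(sample)
```

Here the fifth and sixth rows for photo 7 are skipped, while the later row for photo 7 is kept because photo 8 interrupts the run.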

A: 

I'd say something like this will put you on the right track:

$sql = "SELECT DISTINCT * FROM myTable ORDER BY id ASC LIMIT 100";

In this case "DISTINCT" will retrieve only different rows and ignore the repeated ones.

Hope it helps.

jnkrois
+2  A: 

You could add a where clause that filters out rows for which 4 or more rows with a higher photo_id exist for the same event:

select *
from YourTable t1
where 4 > (
    select count(*)
    from YourTable t2
    where t1.event_id = t2.event_id
    and t1.photo_id < t2.photo_id
)
limit 100

This can get slow for huge tables. A faster, but very MySQL-specific, option is to use variables. For example:

select *
from (
    select
        @nr := case 
            when event_id = @event then @nr + 1 
            else 1 
        end as photonr
    ,   @event := event_id
    ,   t1.*
    from YourTable as t1
    cross join (select @event := -1, @nr := 1) as initvars
    order by event_id
) as subquery
where subquery.photonr < 5
limit 100;
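
The counter logic in the subquery can be traced outside the database. The following Python sketch mirrors what `@event` and `@nr` do. It assumes the rows arrive sorted by event_id and then by id, which the query above does not guarantee because it orders by event_id only. The function name and the sample tuples are illustrative.

```python
def number_rows_per_event(rows, max_per_event=4, limit=100):
    """Emulate the @event / @nr user variables from the query.

    `rows` holds (id, event_id, photo_id) tuples. Sorting by
    (event_id, id) is an assumption made here for a deterministic result.
    """
    event = -1  # plays the role of @event
    nr = 1      # plays the role of @nr
    kept = []
    for row in sorted(rows, key=lambda r: (r[1], r[0])):
        event_id = row[1]
        nr = nr + 1 if event_id == event else 1  # restart at each new event
        event = event_id
        if nr <= max_per_event:  # same filter as photonr < 5
            kept.append(row + (nr,))  # append photonr as a fourth column
    return kept[:limit]


rows = [(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4),
        (5, 1, 5), (6, 2, 1), (7, 1, 6)]
numbered = number_rows_per_event(rows)
for row in numbered:
    print(row)
```

With these seven rows, event 1 keeps its first four rows (ids 1 to 4) and event 2 keeps its single row (id 6).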

Test data used:

drop table if exists YourTable;

create table YourTable (
  id int auto_increment primary key
, event_id int
, photo_id int
);

insert into YourTable (event_id, photo_id)
values (1,1), (1,2), (1,3), (1,4), (1,5), (2,1), (1,6);
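
As a quick check, the first query can be run against this test data. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made for portability; the correlated subquery itself is plain SQL) and adds an `order by` so the output order is stable. The function name `top_photos_per_event` is illustrative.

```python
import sqlite3


def top_photos_per_event(rows, per_event=4):
    """Run the correlated-subquery filter on (event_id, photo_id) rows."""
    conn = sqlite3.connect(":memory:")
    # SQLite spelling of the test table (MySQL would use auto_increment).
    conn.execute(
        "create table YourTable ("
        " id integer primary key autoincrement,"
        " event_id int,"
        " photo_id int)"
    )
    conn.executemany(
        "insert into YourTable (event_id, photo_id) values (?, ?)", rows
    )
    query = """
        select t1.id, t1.event_id, t1.photo_id
        from YourTable t1
        where ? > (
            select count(*)
            from YourTable t2
            where t1.event_id = t2.event_id
            and t1.photo_id < t2.photo_id
        )
        order by t1.id  -- added here only to make the output deterministic
        limit 100
    """
    result = conn.execute(query, (per_event,)).fetchall()
    conn.close()
    return result


test_rows = [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 1), (1, 6)]
kept_rows = top_photos_per_event(test_rows)
```

On this data the query keeps the four rows with the highest photo_id in event 1 (ids 3, 4, 5 and 7) plus the single row of event 2 (id 6).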
Andomar
A: 

In Oracle, you would use the LAG function:

LAG  (value_expression [,offset] [,default]) OVER ([query_partition_clause] order_by_clause)

Not sure that is possible in MySQL.

Randy
MySQL does not support `lag`. Even if it did, I wonder how you would answer this question using it
Andomar
A: 

If you're using T-SQL, check out http://msdn.microsoft.com/en-us/library/ms189798.aspx for Ranking Functions.

From your question it looks like NTILE is what you want. Here's my quick attempt at the query. I'm not at a terminal, so it's not checked, but it should get you started:

SELECT
  id,
  event_id,
  photo_id,
  created_at,
  NTILE(4) OVER (ORDER BY photo_id) AS 'Quartile'
FROM tbl
WHERE NTILE(4) OVER (ORDER BY photo_id)<2
ORDER BY created_at DESC

The linked page has a good example of all the ranking functions.

Good luck

Stin
The question is about MySQL (see tags); SQL Server doesn't allow `ntile` in a `where` clause; and `ntile` is useful for selecting the top 25% of photos, not for a fixed number :)
Andomar
A: 

Try this:

SELECT p.id, p.event_id, p.photo_id, p.created_at
FROM photo_table p,
    (

        SELECT photo_id, MAX(created_at) max_date
        FROM photo_table
        GROUP BY photo_id 
    ) t
WHERE p.created_at = t.max_date
        AND p.photo_id = t.photo_id
ORDER BY p.created_at DESC
LIMIT 100

What it does:

1. Find the latest change date for each photo.
2. Keep only the last event of each photo.
3. Select the 100 most recent of those.
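
The behaviour of the query above can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL and writes the join with an explicit `JOIN ... ON`, which is equivalent to the comma join here. The sample rows and the function name `latest_row_per_photo` are made up for illustration.

```python
import sqlite3


def latest_row_per_photo(rows, limit=100):
    """Return the most recent row of each photo_id, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table photo_table ("
        " id integer primary key,"
        " event_id int,"
        " photo_id int,"
        " created_at text)"  # ISO timestamps sort correctly as text
    )
    conn.executemany("insert into photo_table values (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT p.id, p.event_id, p.photo_id, p.created_at
        FROM photo_table p
        JOIN (
            SELECT photo_id, MAX(created_at) AS max_date
            FROM photo_table
            GROUP BY photo_id
        ) t ON p.created_at = t.max_date AND p.photo_id = t.photo_id
        ORDER BY p.created_at DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (id, event_id, photo_id, created_at).
sample_rows = [
    (1, 10, 1, "2010-01-01 10:00:00"),
    (2, 10, 1, "2010-01-02 10:00:00"),
    (3, 11, 2, "2010-01-01 12:00:00"),
    (4, 11, 2, "2010-01-03 09:00:00"),
    (5, 12, 3, "2010-01-02 08:00:00"),
]
latest = latest_row_per_photo(sample_rows)
```

Note that this approach returns at most one row per photo_id, which is stricter than allowing up to 4 consecutive rows with the same photo_id.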

In PostgreSQL or Oracle it would be simpler using analytic/windowing functions, such as:

FIRST_VALUE(created_at) OVER (PARTITION BY photo_id ORDER BY created_at DESC)
Stiivi