(Sorry the title isn't more help. I have a database of media-file URLs that came from two sources: RSS feeds and manual entries. I want to find the ten most-recently added URLs, but a maximum of one from any feed. To simplify, table 'urls' has columns 'url, feed_id, timestamp'. feed_id='' for any URL that was entered manually. How would I write the query? Remember, I want the ten most-recent urls, but only one from any single feed_id.
You probably want a union. Something like this should work:
(SELECT
url, feed_id, timestamp
FROM rss_items
GROUP BY feed_id
ORDER BY timestamp DESC
LIMIT 10)
UNION
(SELECT
url, feed_id, timestamp
FROM manual_items
GROUP BY feed_id
ORDER BY timestamp DESC
LIMIT 10)
ORDER BY timestamp DESC
LIMIT 10
MySQL doesn't have the greatest support for this type of query.
You can do it using a combination of "GROUP-BY" and "HAVING" clauses, but you'll scan the whole table, which can get costly.
There is a more efficient solution published here, assuming you have an index on group ids: http://www.artfulsoftware.com/infotree/queries.php?&bw=1390#104
(Basically, create a temp table, insert into it top K for every group, select from the table, drop the table. This way you get the benefit of the early termination from the LIMIT clause).
Would it work to group by the field that you want to be distinct?
SELECT url, feedid FROM urls GROUP BY feedid ORDER BY timestamp DESC LIMIT 10;
Assuming feed_id = 0 is the manually entered stuff this does the trick:
select p.* from programs p
left join
(
select max(id) id1 from programs
where feed_id <> 0
group by feed_id
order by max(id) desc
limit 10
) t on id1 = id
where id1 is not null or feed_id = 0
order by id desc
limit 10;
It works cause the id column is constantly increasing, its also pretty speedy. t is a table alias.
This was my original answer:
(
select
feed_id, url, dt
from feeds
where feed_id = ''
order by dt desc
limit 10
)
union
(
select feed_id, min(url), max(dt)
from feeds
where feed_id <> ''
group by feed_id
order by dt desc
limit 10
)
order by dt desc
limit 10
Assuming this table
CREATE TABLE feed (
feed varchar(20) NOT NULL,
add_date datetime NOT NULL,
info varchar(45) NOT NULL,
PRIMARY KEY (feed,add_date);
this query should do what you want. The inner query selects the last entry by feed and picks the 10 most recent, and then the outer query returns the original records for those entries.
select f2.*
from (select feed, max(add_date) max_date
from feed f1
group by feed
order by add_date desc
limit 10) f1
left join feed f2 on f1.feed=f2.feed and f1.max_date=f2.add_date;
Here's the (abbreviated) table:
CREATE TABLE programs (
id int(11) NOT NULL auto_increment,
feed_id int(11) NOT NULL,
`timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (id)
) ENGINE=InnoDB;
And here's my query based on sambo99's concept:
(SELECT feed_id,id,timestamp
FROM programs WHERE feed_id=''
ORDER BY timestamp DESC LIMIT 10)
UNION
(SELECT feed_id,min(id),max(timestamp)
FROM programs WHERE feed_id<>'' GROUP BY feed_id
ORDER BY timestamp DESC LIMIT 10)
ORDER BY timestamp DESC LIMIT 10;
Seems to work. More testing needed, but at least I understand it. (A good thing!). What's the enhancement using the 'id' column?