views:

50

answers:

2

Hey everyone. I believe this is a 'greatest-n-per-group' question but even after looking at several questions on StackOverflow, I'm unsure how to apply this to my situation...

I'm using a MySQL database and have a basic blog-type system set up about Computer Applications... The tables look like this:

POSTS
post_id
post_created
post_type      -- could be article, review, feature, whatever
post_status    -- 'a' approved or 'd' for draft

APPS
app_id 
app_name
app_platform   -- Windows, linux, unix, etc..

APP_TO_POST    -- links my posts to its relevant application
atp_id
atp_app_id
atp_post_id

I'm using the following basic query to pull all articles for the application with the name 'Photoshop' where the post type is an 'Article' and the status of the article is 'a' for approved:

SELECT apps.app_name, apps.app_platform, posts.post_created, posts.post_id
FROM apps
JOIN app_to_post ON app_to_post.atp_app_id = apps.app_id
JOIN posts ON app_to_post.atp_post_id = posts.post_id
WHERE apps.app_name = 'Photoshop'
AND 
posts.post_type = 'Article'
AND
posts.post_status = 'a'

Which gets me these expected results:

app_name    app_platform   post_created      post_id
Photoshop   Windows        Oct. 20th, 2009   1
Photoshop   Windows        Dec. 1, 2009      3
Photoshop   Macintosh      Nov. 10th, 2009   2

Would anyone be able to lend a hand on how I could alter that query to only pull the most recent article per application platform? So for example, I'd like my results to look like this:

app_name    app_platform   post_created      post_id
Photoshop   Windows        Dec. 1, 2009      3
Photoshop   Macintosh      Nov. 10th, 2009   2

And omit one of the 'Photoshop Windows' articles because it isn't the most recent one.

If I simply tack on a MAX(post_created) and a GROUP BY app_platform my results don't always group correctly. From how I understand it I need to perform some kind of inner join of a sub query?

A: 

You're on the right track.

Try adding

group by app_name,app_platform
having post_created=max(post_created)

Or if your post_id is sequential where a higher value will always reflect a later post, use this having clause: having post_id=max(post_id)

sasfrog
A: 

Since you have plenty of JOINs, I suggest creating a VIEW first:

CREATE VIEW articles AS
    SELECT    a.app_name, a.app_platform, p.post_created, p.post_id
    FROM      apps a
    JOIN      app_to_post ap ON ap.atp_app_id = a.app_id
    JOIN      posts p ON ap.atp_post_id = p.post_id
    WHERE     p.post_type = 'Article' AND p.post_status = 'a';

Then you can use a NULL-self-join:

SELECT     a1.app_name, a1.app_platform, a1.post_created, a1.post_id
FROM       articles a1
LEFT JOIN  articles a2 ON 
           a2.app_platform = a1.app_platform AND a2.post_created > a1.post_created
WHERE      a2.post_id IS NULL;

Test case:

CREATE TABLE posts (
   post_id          int,
   post_created     datetime,
   post_type        varchar(30),
   post_status      char(1)
);

CREATE TABLE apps (
   app_id           int,
   app_name         varchar(40),
   app_platform     varchar(40)
);

CREATE TABLE app_to_post (
   atp_id           int,
   atp_app_id       int,
   atp_post_id      int
);

INSERT INTO posts VALUES (1, '2010-10-06 05:00:00', 'Article', 'a');
INSERT INTO posts VALUES (2, '2010-10-06 06:00:00', 'Article', 'a');
INSERT INTO posts VALUES (3, '2010-10-06 07:00:00', 'Article', 'a');
INSERT INTO posts VALUES (4, '2010-10-06 08:00:00', 'Article', 'a');
INSERT INTO posts VALUES (5, '2010-10-06 09:00:00', 'Article', 'a');

INSERT INTO apps VALUES (1, 'Photoshop', 'Windows');
INSERT INTO apps VALUES (2, 'Photoshop', 'Macintosh');

INSERT INTO app_to_post VALUES (1, 1, 1);
INSERT INTO app_to_post VALUES (1, 1, 2);
INSERT INTO app_to_post VALUES (1, 2, 3);
INSERT INTO app_to_post VALUES (1, 2, 4);
INSERT INTO app_to_post VALUES (1, 1, 5);

Result:

+-----------+--------------+---------------------+---------+
| app_name  | app_platform | post_created        | post_id |
+-----------+--------------+---------------------+---------+
| Photoshop | Macintosh    | 2010-10-06 08:00:00 |       4 |
| Photoshop | Windows      | 2010-10-06 09:00:00 |       5 |
+-----------+--------------+---------------------+---------+
2 rows in set (0.00 sec)

As a side-note, in general you don't need a surrogate key for your junction table. You might as well set up a composite primary key (and ideally foreign keys to the referenced tables):

CREATE TABLE app_to_post (
   atp_app_id       int,
   atp_post_id      int,
   PRIMARY KEY (atp_app_id, atp_post_id),
   FOREIGN KEY (atp_app_id) REFERENCES apps (app_id),
   FOREIGN KEY (atp_post_id) REFERENCES posts (post_id)
) ENGINE=INNODB;
Daniel Vassallo