ansaurus

Question

Select newest entry from a joined MySQL table

Answer 1

A:

There are two common ways to accomplish this: a subquery or a self-join.

See this example of selecting the group-wise maximum at the MySQL site.

Edit: an example using a subquery:

SELECT stock.stockid, stock.sku, stock.quantity,
       stockfile.filename, stockfile.date
FROM stock
INNER JOIN stockfile ON stock.stockfileid = stockfile.stockfileid
WHERE stockfile.date = (SELECT MAX(date) FROM stockfile);

charstar 2009-12-17 01:04:08

Answer 2

A:

select *
from   stock
where  stockfileid in (
           select top 1 stockfileid
           from   stockfile
           order by date desc
       )

Rich 2009-12-17 01:04:41

That won't work as is - you need to correlate the subquery. Currently the subquery will return only one row, and you're getting the stockfileid, not the date which is what you're really after.

OMG Ponies 2009-12-17 01:08:28

This is not valid SQL for MySQL: http://dev.mysql.com/doc/refman/5.4/en/select.html

charstar 2009-12-17 01:08:53

That too - need to use `LIMIT`, not `TOP`

OMG Ponies 2009-12-17 01:09:19
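For reference, a MySQL-compatible form of the query in Answer 2 might look like the sketch below. It assumes the `stock` and `stockfile` tables and column names used elsewhere in this thread. MySQL rejects `LIMIT` inside an `IN (...)` subquery, so the sketch compares against a scalar subquery with `=` instead.

```sql
-- TOP 1 replaced with ORDER BY ... LIMIT 1 for MySQL.
-- "=" is used instead of IN because MySQL does not allow
-- LIMIT inside an IN/ALL/ANY/SOME subquery.
SELECT *
FROM   stock
WHERE  stockfileid = (
           SELECT stockfileid
           FROM   stockfile
           ORDER BY date DESC
           LIMIT 1
       );
```

As the first comment points out, this only returns the stock rows that belong to the single newest stockfile; it does not pick the newest row per product.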

I suppose that's what I get for being a SQL Server user...

Rich 2009-12-17 01:11:39

So is the problem that each stock file doesn't contain a complete stock check so the latest stock number for a particular product might be in any of the files? The question doesn't seem entirely clear.

Rich 2009-12-17 01:16:47

Indeed, not everything is in the newest stockfile. But even if it were I'd still need to get the latest stock quantity like this.

skerit 2009-12-17 01:22:15

Answer 3

+2 A:

Use:

SELECT DISTINCT s.stockid, 
       s.sku, 
       s.quantity, 
       sf.filename, 
       sf.date
  FROM STOCK s
  JOIN STOCKFILE sf ON sf.stockfileid = s.stockfileid
  JOIN (SELECT t.stockfileid,
               MAX(t.date) 'max_date'
          FROM STOCKFILE t
      GROUP BY t.stockfileid) x ON x.stockfileid = sf.stockfileid
                               AND x.max_date = sf.date

OMG Ponies 2009-12-17 01:06:52

I'm afraid this still gives me all 315,000 records.

skerit 2009-12-17 01:17:09

I updated the query to add `DISTINCT`, because the rows are most likely being duplicated by the JOINs.

OMG Ponies 2009-12-17 01:19:53

Hmm, even with the distinct it still gives me all the duplicate records.

skerit 2009-12-17 01:23:51

@Jelle: That's impossible - DISTINCT behaves like a GROUP BY over all the selected columns, so every row returned differs from the others in at least one column value. The most likely reason is that you've got ties based on the maximum date.

OMG Ponies 2009-12-17 01:31:33

I haven't defined any relations or ties in the database.

skerit 2009-12-17 01:46:35

@skerit: "ties" here means two or more records that match on the same value, not a tie as in a link between tables.

Hao 2009-12-17 02:09:53

Answer 4

+1 A:

This is an example of the frequently-asked "greatest-n-per-group" question that we see every week on StackOverflow. Follow that tag to see other similar solutions.

SELECT s.*, f1.*
FROM stock s
INNER JOIN stockfile f1
  ON (s.stockfileid = f1.stockfileid)
LEFT OUTER JOIN stockfile f2
  ON (s.stockfileid = f2.stockfileid AND f1.date < f2.date)
WHERE f2.stockfileid IS NULL;

If there are multiple rows in stockfile that have the max date, you'll get all of them in the result set. To resolve this, you'd have to add some tie-breaker conditions to the join on f2.

Thanks for adding the CREATE TABLE info. That's very helpful when you're asking SQL questions.

I see from the AUTO_INCREMENT table options that you have 315k rows in stock and only 265 rows in stockfile. Your stockfile table is the parent in the relationship, and the stock table is the child, with a column stockfileid that references the primary key of stockfile.

So your original question was misleading. You want the latest row from stock, not the latest row from stockfile.

SELECT f.*, s1.*
FROM stockfile f
INNER JOIN stock s1
  ON (f.stockfileid = s1.stockfileid)
LEFT OUTER JOIN stock s2
  ON (f.stockfileid = s2.stockfileid AND (s1.touchdate < s2.touchdate
      OR s1.touchdate = s2.touchdate AND s1.stockid < s2.stockid))
WHERE s2.stockid IS NULL;

I'm assuming you want "latest" to be relative to touchdate. If you want to use creationdate instead, substitute it in the query.

I've added a term to the join so that it resolves ties. I know you said the dates are "practically unique" but as the saying goes, "one in a million is next Tuesday."

Okay, I think I understand what you're trying to do now. You want the most recent row per sku, but the date by which to compare them is in the referenced table stockfile.

SELECT s1.*, f1.*
FROM stock s1
JOIN stockfile f1 ON (s1.stockfileid = f1.stockfileid)
LEFT OUTER JOIN (stock s2 JOIN stockfile f2 ON (s2.stockfileid = f2.stockfileid))
  ON (s1.sku = s2.sku AND (f1.date < f2.date OR f1.date = f2.date AND f1.stockfileid < f2.stockfileid))
WHERE s2.sku IS NULL;

This joins stock to itself, looking for a row with the same sku and a more recent date. When none is found, s1 contains the most recent row for its sku. Each instance of stock has to join to its stockfile to get the date.
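For readers on MySQL 8.0 or later (which postdates this thread), the same "most recent row per sku" result can also be sketched with a window function instead of the self-join. This is an alternative technique, not the query above, and it assumes the same tables and column names; the aliases `latest` and `rn` are made up for illustration.

```sql
-- Alternative sketch: requires MySQL 8.0+ (window functions).
-- Number the rows of each sku from newest to oldest stockfile date,
-- breaking ties on stockfileid, then keep only the first row per sku.
SELECT latest.stockid,
       latest.sku,
       latest.quantity,
       latest.filename,
       latest.date
FROM (
    SELECT s.stockid,
           s.sku,
           s.quantity,
           f.filename,
           f.date,
           ROW_NUMBER() OVER (
               PARTITION BY s.sku
               ORDER BY f.date DESC, f.stockfileid DESC
           ) AS rn
    FROM stock s
    JOIN stockfile f ON f.stockfileid = s.stockfileid
) AS latest
WHERE latest.rn = 1;
```

Because ROW_NUMBER() assigns exactly one row the number 1 in each partition, this returns a single row per sku even when the same sku appears more than once in the newest stockfile; which of those rows wins is then arbitrary unless another column is added to the ORDER BY.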

Re comment about optimization: It's hard for me to test because I don't have tables populated with data matching yours, but I'd guess you should have the following indexes:

CREATE INDEX stock_sku ON stock(sku);
CREATE INDEX stock_stockfileid ON stock(stockfileid);
CREATE INDEX stockfile_date ON stockfile(date);

I'd suggest using EXPLAIN to analyze the query without the indexes, and then create one index at a time and re-analyze with EXPLAIN to see which one gives the most direct benefit.
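As a rough sketch of that workflow, prefixing the query with EXPLAIN shows the execution plan instead of running it. The table and index names are the ones used above.

```sql
-- Show the plan for the per-sku query rather than executing it.
EXPLAIN
SELECT s1.*, f1.*
FROM stock s1
JOIN stockfile f1 ON (s1.stockfileid = f1.stockfileid)
LEFT OUTER JOIN (stock s2 JOIN stockfile f2 ON (s2.stockfileid = f2.stockfileid))
  ON (s1.sku = s2.sku AND (f1.date < f2.date OR f1.date = f2.date AND f1.stockfileid < f2.stockfileid))
WHERE s2.sku IS NULL;

-- Add one index, then run the EXPLAIN above again and compare.
CREATE INDEX stock_sku ON stock(sku);
```

In the output, the `type`, `key`, and `rows` columns are the ones to compare between runs: a `type` of ALL means a full table scan, and `key` names the index MySQL chose, if any.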

Bill Karwin 2009-12-17 01:32:14

I know, I've actually solved a similar problem thanks to stackoverflow. But your query still gives me 315,000 results :/

skerit 2009-12-17 01:37:55

Then I would guess there are a lot of ties for the max date per stock.

Bill Karwin 2009-12-17 01:43:17

A lot of ties? The "date" field is practically unique in the stockfile table. Per date there can be about 100 records in the stock table.

skerit 2009-12-17 01:48:52

Alright, SHOW CREATE TABLE output has been added.

skerit 2009-12-17 02:13:03

Hmm, but I can't use the touchdate from the "stock" table; the only date that matters is the "date" field in stockfile. (touchdate and creationdate are fields that let me know when a record was originally created and last edited. A stockfile from last year can be re-imported today, which would result in old records having the highest touchdate.)

skerit 2009-12-17 10:56:39

See my third try above.

Bill Karwin 2009-12-17 16:50:16

skerit 2009-12-17 20:08:32

Ah yes, unfortunately I forgot I had already created those indexes. Without them it takes ages; with them it takes "only" 250 seconds. Maybe I'd better create a new question so this doesn't go too far off-topic?

skerit 2009-12-17 22:08:05
