Using MySQL, I would like to refactor the following SELECT statement to return the entire record containing the newest invoice_date:

> SELECT id, invoice_id, invoice_date
  FROM invoice_items
  WHERE invoice_id = 1047

id    invoice_id   invoice_date
-----------------------------------
3235    1047      2009-12-15 11:40:00
3295    1047      2009-12-15 16:00:00
3311    1047      2009-12-15 09:30:00
3340    1047      2009-12-15 13:50:00

Using the MAX() aggregate function and the GROUP BY clause gets me part of the way there:

> SELECT id, invoice_id, max(invoice_date)
  FROM invoice_items
  WHERE invoice_id = 1047
  GROUP BY invoice_id


id    invoice_id   invoice_date
-----------------------------------
3235    1047      2009-12-15 16:00:00

Notice that the query returns the correct MAX(invoice_date), but the id returned (3235) is not the id of the record containing that date (3295); it is the id of the first record in the initial query.

How do I refactor this query to give me the entire record that contains the MAX(invoice_date)?

The solution must use the GROUP BY clause, because I need to get the newest invoice_date for each invoice.

+1  A: 

Since the table is named invoice_items, I'm assuming there can be multiple rows for a given invoice, so you should probably use something like this:

SELECT * FROM invoice_items 
WHERE invoice_date IN (SELECT MAX(invoice_date) FROM invoice_items)

If you aren't concerned about two records having the same invoice date, you could just do this:

SELECT * FROM invoice_items
ORDER BY invoice_date DESC
LIMIT 1
Eric Petroelje
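Both queries above pick the newest row across the whole table. Since the question asks for the newest row per invoice, a correlated subquery is one possible variant. The following is only a sketch: it uses Python's sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the rows for invoice 2000 are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items ("
    "id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
        # Hypothetical second invoice, not part of the question's data.
        (4001, 2000, "2009-12-16 08:00:00"),
        (4002, 2000, "2009-12-16 10:15:00"),
    ],
)

# For each row, compare its date with the newest date of its own invoice.
per_invoice = conn.execute(
    """
    SELECT i.*
    FROM invoice_items i
    WHERE i.invoice_date = (SELECT MAX(m.invoice_date)
                            FROM invoice_items m
                            WHERE m.invoice_id = i.invoice_id)
    ORDER BY i.invoice_id
    """
).fetchall()

for row in per_invoice:
    print(row)
```

If two rows of the same invoice share the newest date, this form returns both of them.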
A: 
SELECT * 
FROM invoice_items
WHERE invoice_id = 1047
ORDER BY invoice_date DESC LIMIT 1

Or, better, if id is your primary key and always grows in step with invoice_date:

SELECT * 
FROM invoice_items
WHERE invoice_id = 1047
ORDER BY id DESC LIMIT 1
Patrick
+1  A: 

Pretty much exactly the way you'd say it in English:

"Get me the invoice with the latest Invoice Date"

Select * From invoice_items
Where invoice_date =
   (Select Max(invoice_date)
    From invoice_items)

But I think something is wrong in your schema. Since there are multiple rows with the same invoice_id, this looks like an invoice details or invoice line items table, not an invoice table. If so, how can each line item within the same invoice have a different invoice date? If the dates differ, then they are not invoice dates; they are invoice detail dates (whatever that means) and should be labeled as such.

Charles Bretana
+2  A: 

This is the often-repeated "greatest-n-per-group" problem.

Here's how I would solve it in MySQL:

SELECT i1.*
FROM invoice_items i1
LEFT OUTER JOIN invoice_items i2
  ON (i1.invoice_id = i2.invoice_id AND i1.invoice_date < i2.invoice_date)
WHERE i2.invoice_id IS NULL;

Explanation: for each row i1, try to find a row i2 with the same invoice_id and a greater date. If none are found (i.e. i2 is all nulls because of the outer join), then i1 must be the row with the greatest date for its invoice_id.
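To see the join at work, here is a small runnable sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, loads the four rows from the question, and adds a made-up invoice 2000 to show that one row comes back per invoice_id.

```python
import sqlite3

# SQLite stands in for MySQL here; the join itself is standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items ("
    "id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
        # Hypothetical second invoice, not part of the question's data.
        (4001, 2000, "2009-12-16 08:00:00"),
        (4002, 2000, "2009-12-16 10:15:00"),
    ],
)

# Keep only the rows i1 for which no later row i2 exists in the same invoice.
newest_per_invoice = conn.execute(
    """
    SELECT i1.*
    FROM invoice_items i1
    LEFT OUTER JOIN invoice_items i2
      ON (i1.invoice_id = i2.invoice_id AND i1.invoice_date < i2.invoice_date)
    WHERE i2.invoice_id IS NULL
    ORDER BY i1.invoice_id
    """
).fetchall()

for row in newest_per_invoice:
    print(row)
```

For invoice 1047 the surviving row is id 3295, the one the question was looking for.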

This join-based solution tends to work better in MySQL, which is weak at optimizing both GROUP BY and subqueries.

Bill Karwin
This seems to work as desired; however, it is incredibly slow. Can this really be a better solution than using GROUP BY in MySQL?
rswolff
Try both and see. Don't forget to analyze both queries with `EXPLAIN`.
Bill Karwin
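One plausible cause of the slowness (an assumption, since the thread never shows the table's indexes) is a missing index on the join columns. The sketch below adds a composite index and prints the query plan. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL, and the index name idx_invoice_date is made up.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items ("
    "id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
    ],
)

# Hypothetical composite index covering both columns used in the join condition.
conn.execute(
    "CREATE INDEX idx_invoice_date ON invoice_items (invoice_id, invoice_date)"
)

query = """
    SELECT i1.*
    FROM invoice_items i1
    LEFT OUTER JOIN invoice_items i2
      ON (i1.invoice_id = i2.invoice_id AND i1.invoice_date < i2.invoice_date)
    WHERE i2.invoice_id IS NULL
"""

# SQLite spells this EXPLAIN QUERY PLAN; in MySQL, prefix the query with EXPLAIN.
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)

result = conn.execute(query).fetchall()
print(result)
```

The plan output differs between SQLite and MySQL, so treat it only as a prompt to run EXPLAIN on the real table and compare the plans with and without the index.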
+1  A: 

Here's my attempt:

SELECT t1.*
FROM INVOICE_ITEMS t1,
   (SELECT INVOICE_ID, MAX(INVOICE_DATE) as invoice_date2
      FROM INVOICE_ITEMS
     GROUP BY INVOICE_ID) t2
WHERE t1.invoice_id = t2.invoice_id
AND t1.invoice_date = t2.invoice_date2
Chris Persichetti