Hey there,
I have a CATEGORIES table that links to an ITEMS table (one to many). Each item table could have multiple barcodes in the BARCODES table (one to many).
The idea is that a new barcode may be added against an item at some point, but the old data is stored in the BARCODES table so that if a search is done with an order with last year's data, the item and category of the old barcode can still be found quickly.
Hope that makes sense - it's a bit of an oversimplification.
Example of my tables:
CATEGORIES TABLE
ID NAME CODE ACCOUNTING_REFERENCE
1 Beverages BEV Stock_Beverages
2 Pies PIE Stock_Pies
3 Chips CHP Stock_Chips
ITEMS TABLE
ID CATEGORY_ID DESCRIPTION BASE_COST
1 1 Red Bull (single) $4.50
2 2 Ponsonby Pie - Mince Cheese $2.99
3 1 Coke Can (single) $3.50
4 2 Big Ben - Steak Pepper $1.99
BARCODES TABLE
ID ITEM_ID BARCODE ACTIVE_FROM
1 1 XSD123 2009/10/11
2 2 AXF123 2009/10/12
3 3 XYZ234 2009/10/11
4 1 NEW001 2010/01/05
5 1 NEW002 2010/01/05*
- I know it doesn't make sense in this scenario to have two barcodes entered on the same day. Ignore that this looks like bad data. This example is just an analogy for what I'm actually doing - in the real world application that mirrors this structure, it makes sense to have two 'BARCODES' entered on the same day. I'd like to discuss the real data openly, but I'm not free to do so - I'm sorry if it's confusing.
I am attempting to display item information to the user. I want to return from SQL one record per item, for all items. The returned information should include category name, code, and accounting references for the lines - duplicates of these columns are fine. The query needs to return item description and base cost.
This much I can do.
I also want the query to return the most recent barcode for the item. In the cases where there are two barcodes entered for an item on the same day, it isn't hugely important which one I display - although displaying the higher of the two (or more) BARCODE.ID fields would be a nice touch.
Also, it is possible that items may exist without any barcodes against them, in which case I want to return a null entry.
My desired result set would look something like this:
ITEM_ID DESCRIPTION BASE_COST CATEGORY_NAME CATEGORY_CODE ACCOUNTING_REFERENCE BARCODE
1 Red Bull (single) $4.50 Beverages BEV Stock_Beverages NEW002
2 Ponsonby Pie - Mince Cheese $2.99 Pies PIE Stock_Pies AXF123
3 Coke Can (single) $3.50 Beverages BEV Stock_Beverages XYZ234
4 Big Ben - Steak Pepper $1.99 Pies PIE Stock_Pies <null>
I can't work out how to add the BARCODE column to this result set, given the tables above.
In case I haven't been clear: I need to know how to structure a SQL query that will give me exactly the result above, given the data I've presented.
It's a requirement that the ITEM_ID column in my result set be distinct. I can't just restrict the stale entries in memory, and if I bubble up multiple ITEM_ID's it breaks a PK/FK relationship used elsewhere in the application I'm touching up.
For the record, I've had a look around the site on anything related to distinct sql columns to answer this question. I found a lot of entries, but couldn't seem to get any of the suggested solutions to work for me. Perhaps I'm just dense - it's getting late, and I may not be thinking straight. Apologies if I've missed something obvious.
EDIT
Gabe gave a good answer below, and I realized I should have been clearer.
Gabe's answer doesn't work for me, because occasionally in my data I have two barcodes against the same item with the same timestamp. When I tried his code, I wound up with multiple items being returned whenever an item had two barcodes placed against it on the same day.
This is really counter-intuitive, so it's my fault for not communicating properly. Essentially, the scenario described above isn't actually the data I'm working from. I'm not at liberty to discuss the database I'm working on publicly, so I have to rename everything in the examples I use.
I've adjusted the problem above. I know it seems ridiculous to have two barcodes entered with the same timestamp - rest assured that it makes sense that this could happen in the actual database.
EDIT (again)
The answers of both Gabe and simon work. I chose Gabe's as the answer, simply because I found his statement the most legible.
That said, I also like simon's because it shows a syntax for using SQL that I'm not familiar with - it's good to see an unfamiliar syntax in action.
At some point I need to benchmark the two methods to see which is faster. With my sample data, they're currently equal - although I expect I could change that with a bit of work populating my database with a few thousand more records and revising my indexes.
Thanks for the help.