My application has a table that contains snapshot inventory data for each year. For example, there's a vehicle inventory table with the typical columns vehicle_id, vehicle_plate_num, vehicle_year, vehicle_make, etc., plus a column for the year in which the vehicle was owned.

Querying the entire table might result in something like this:

Id  Plate Num   Year Make     Model    Color   Year Owned
---------------------------------------------------------
1   AAA555      2008 Toyota   Camry    blue    2009
2   BBB666      2007 Honda    Accord   black   2009
3   CCC777      1995 Nissan   Altima   white   2009
4   AAA555      2008 Toyota   Camry    blue    2010
5   BBB666      2007 Honda    Accord   black   2010
6   DDD888      2010 Ford     Explorer white   2010

(Good or bad, this table already exists and redesigning it is not an option; that's a topic for another question.) What you see here is that, year after year, the majority of the vehicles are still in the inventory, but there are always old vehicles being gotten rid of and new ones being acquired. In the example above, the 1995 Nissan Altima was in the 2009 inventory but is no longer in the 2010 inventory. The 2010 inventory has a new 2010 Ford Explorer.

How can I build an efficient query that takes any two years and shows only the difference? For example, if I pass in 2009, 2010, the query should return

3   CCC777  1995  Nissan  Altima      white   2009

If I pass in 2010, 2009, the query should return

6   DDD888   2010  Ford   Explorer   white   2010

Edit: I should have added this as a comment following the answer from Kyle B., but the comment text area is not very user-friendly:

I didn't think it would be this tough, but it seems to be.

Anyway, wouldn't you need a sub-select around the above, like this:

select q.* from (
    select f.*
    from inventory f 
      left join inventory s
      on (f.plate_num = s.plate_num 
         and f.year_owned = :first-year
         and s.year_owned = :second-year)
    where s.plate_num is null
) q
where q.year_owned = :first-year
A: 
select a.id, a.platenum, a.year, a.make, a.model, a.color, b.yearowned
from inventory a
join inventory b on a.platenum=b.platenum
where a.yearowned=___ and b.yearowned=___;

Edit: oops, I misunderstood. How do I delete my answer?

Ken
A: 

This query will select all cars from 2010 that did not exist in the table in previous years.

select * 
from cars
where Year_Owned = 2010
  and plate not in (
        select plate 
        from cars 
        where year_owned < 2010);

Using this structure, it should be obvious how to rearrange it to produce the cars that no longer exist in 2010.
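
As a rough sketch of that rearrangement, the two years can be passed in as parameters so the same query answers both directions. This is only an illustration under some assumptions: it uses the inventory table and plate_num column from the question rather than the cars/plate names above, it compares against one specific second year instead of all earlier years, and the parameter names :first_year and :second_year and the helper functions are made up for the example. It runs against an in-memory SQLite database loaded with the question's sample rows.

```python
import sqlite3

# Sample rows from the question:
# (id, plate_num, year, make, model, color, year_owned)
ROWS = [
    (1, "AAA555", 2008, "Toyota", "Camry", "blue", 2009),
    (2, "BBB666", 2007, "Honda", "Accord", "black", 2009),
    (3, "CCC777", 1995, "Nissan", "Altima", "white", 2009),
    (4, "AAA555", 2008, "Toyota", "Camry", "blue", 2010),
    (5, "BBB666", 2007, "Honda", "Accord", "black", 2010),
    (6, "DDD888", 2010, "Ford", "Explorer", "white", 2010),
]


def make_db():
    """Build an in-memory database holding the sample inventory (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table inventory (id integer primary key, plate_num text, "
        "year integer, make text, model text, color text, year_owned integer)"
    )
    conn.executemany("insert into inventory values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def only_in_first(conn, first_year, second_year):
    """Rows owned in first_year whose plate does not appear in second_year."""
    sql = """
        select *
        from inventory
        where year_owned = :first_year
          and plate_num not in (
                select plate_num
                from inventory
                where year_owned = :second_year)
        order by id
    """
    params = {"first_year": first_year, "second_year": second_year}
    return conn.execute(sql, params).fetchall()
```

On the sample data, only_in_first(conn, 2009, 2010) returns just the 1995 Nissan Altima row, and swapping the arguments returns just the 2010 Ford Explorer row.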

ar
+4  A: 

You want an outer self-join.

It looks like you want the asymmetric difference. If you wanted the symmetric difference, you'd use a full outer join instead of a left (or right) outer join.

With variables :first-year and :second-year

select f.*
   from inventory f 
     left join inventory s
        on (f.plate_num = s.plate_num
           and s.year_owned = :second-year)
where s.plate_num is null 
  and f.year_owned = :first-year

Note that the condition on s.year_owned has to be inside the join condition, so that the database returns a null row when there's no match, instead of finding a match that later gets removed by filtering.
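
To make the point about condition placement concrete, here is a small sketch that runs both variants against the question's sample rows in an in-memory SQLite database. The parameter names :first_year and :second_year (underscores instead of hyphens), the constant names, and the helper functions are assumptions made for this example; the answer itself was tested with PostgreSQL, not SQLite.

```python
import sqlite3

# Sample rows from the question:
# (id, plate_num, year, make, model, color, year_owned)
ROWS = [
    (1, "AAA555", 2008, "Toyota", "Camry", "blue", 2009),
    (2, "BBB666", 2007, "Honda", "Accord", "black", 2009),
    (3, "CCC777", 1995, "Nissan", "Altima", "white", 2009),
    (4, "AAA555", 2008, "Toyota", "Camry", "blue", 2010),
    (5, "BBB666", 2007, "Honda", "Accord", "black", 2010),
    (6, "DDD888", 2010, "Ford", "Explorer", "white", 2010),
]


def make_db():
    """Build an in-memory database holding the sample inventory (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table inventory (id integer primary key, plate_num text, "
        "year integer, make text, model text, color text, year_owned integer)"
    )
    conn.executemany("insert into inventory values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


# Condition on s.year_owned inside ON: an f row with no partner in the
# second year is kept, with every s column NULL.
ANTI_JOIN = """
    select f.id
    from inventory f
      left join inventory s
        on (f.plate_num = s.plate_num
            and s.year_owned = :second_year)
    where s.plate_num is null
      and f.year_owned = :first_year
    order by f.id
"""

# Same condition moved to WHERE: a NULL s row can never satisfy
# s.year_owned = :second_year, so the filter discards every unmatched row.
MISPLACED = """
    select f.id
    from inventory f
      left join inventory s
        on (f.plate_num = s.plate_num)
    where s.year_owned = :second_year
      and s.plate_num is null
      and f.year_owned = :first_year
    order by f.id
"""


def ids(conn, sql, first_year, second_year):
    """Run one of the queries above and return the matching ids as a list."""
    params = {"first_year": first_year, "second_year": second_year}
    return [row[0] for row in conn.execute(sql, params)]
```

With the sample data, ids(conn, ANTI_JOIN, 2009, 2010) gives [3], while ids(conn, MISPLACED, 2009, 2010) gives an empty list, because the two WHERE conditions on s contradict each other.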

Edit: Adjusted query slightly. This doesn't require a sub-select. Tested with postgresql.

Kyle Butt
Absolutely the right idea, though one should move the 'f.year_owned' condition out of the 'join' condition and into a 'where' clause.
van
Hm, I'm not so sure if that would work.
cathat
Van's right. I need to move the qualification on f out of the join clause.
Kyle Butt
A: 

I am not sure how 'efficient' this idea is going to be; however, you can probably use the SQL 'EXCEPT' operator. This is just a sample and won't return the complete row you want, but you will get the idea:

select plate, name from inventory where year_owned=2009
except
select plate, name from inventory where year_owned=2010
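
One possible way to get the complete rows back, sketched here as an assumption rather than a tested recommendation, is to apply EXCEPT to the plate column only and join the result back to the table. The example uses the inventory table and plate_num column from the question, made-up parameter names :first_year and :second_year, and hypothetical helper functions, and runs on an in-memory SQLite database with the question's sample rows.

```python
import sqlite3

# Sample rows from the question:
# (id, plate_num, year, make, model, color, year_owned)
ROWS = [
    (1, "AAA555", 2008, "Toyota", "Camry", "blue", 2009),
    (2, "BBB666", 2007, "Honda", "Accord", "black", 2009),
    (3, "CCC777", 1995, "Nissan", "Altima", "white", 2009),
    (4, "AAA555", 2008, "Toyota", "Camry", "blue", 2010),
    (5, "BBB666", 2007, "Honda", "Accord", "black", 2010),
    (6, "DDD888", 2010, "Ford", "Explorer", "white", 2010),
]


def make_db():
    """Build an in-memory database holding the sample inventory (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table inventory (id integer primary key, plate_num text, "
        "year integer, make text, model text, color text, year_owned integer)"
    )
    conn.executemany("insert into inventory values (?, ?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def except_difference(conn, first_year, second_year):
    """Full rows for plates present in first_year but absent from second_year."""
    sql = """
        select i.*
        from inventory i
          join (
            select plate_num from inventory where year_owned = :first_year
            except
            select plate_num from inventory where year_owned = :second_year
          ) d on d.plate_num = i.plate_num
        where i.year_owned = :first_year
        order by i.id
    """
    params = {"first_year": first_year, "second_year": second_year}
    return conn.execute(sql, params).fetchall()
```

The inner EXCEPT yields only the plates unique to the first year; the outer filter on year_owned keeps the join from pulling in rows for the same plate from other years.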
ondra
A: 

I think Kyle Butt gave an almost perfect answer. He got me 90% of the way.

Here's the answer:

Query all vehicles that are in 2010's but NOT in 2009's inventory:

select q.* from (
    select f.* from inventory f 
        left join inventory s
        on (f.plate_num = s.plate_num 
            and f.year_owned = 2010
            and s.year_owned = 2009)
        where s.plate_num is null
    ) q
where q.year_owned = 2010

Query all vehicles that are in 2009's but NOT in 2010's inventory:

select q.* from (
    select f.* from inventory f 
        left join inventory s
        on (f.plate_num = s.plate_num 
            and f.year_owned = 2009
            and s.year_owned = 2010)
        where s.plate_num is null
    ) q
where q.year_owned = 2009

  1. Note the sub-query.
  2. It runs fairly fast for 100,000+ records.
cathat
I adjusted my answer, you don't need a sub-select.
Kyle Butt
Thanks Kyle, I prefer your newly adjusted query (without the sub query). I tested and it works against SQL Server as well.
cathat