I have a database where I need to query records from one table and then look up another table to see whether a matching value exists there. The second table might return multiple records, and I want the one with the most recent date.

So table 1 is basically:

ID (Primary Key)
Name
Test_Score

And Table 2 is

Test_Id (Primary)
Student_ID (Foreign Key)
Test_Score
Test_Date

As I go through the records, if no tests exist in table 2 for the Student_ID, I want to use the score from table 1; otherwise I want to use the score from table 2 with the most recent date. I have this all working in C# code, but the clients want it in a stored procedure for reporting purposes, and I'm seeing some performance issues since the tables are quite large. Also, this basic example actually happens multiple times against multiple tables.

I'm sure there is an elegant way of doing this that is fast and efficient, but I can't seem to come up with anything but using a cursor.

Does someone know the straightforward solution?

+3  A: 

Not 100% sure about the syntactical details, but something like this:

select table1.Name, ISNULL(table2.Test_Score, table1.Test_Score)
from 
  table1
  left outer join table2
    on table1.id = table2.Student_ID
    AND table2.Test_Date = (
      select max(x.Test_Date)
      from table2 x
      where table1.id = x.Student_ID
      group by x.Student_ID)

If the subquery is not allowed where it is, move it to the where clause, and also let rows through where table2.Student_ID is null so that students without tests are kept. (Sorry, I can't try it where I am now.)

The query only works if Test_Date is unique per student. If it is not, you get repeated rows. In that case, use a Group By:

select table1.Name, min(ISNULL(table2.Test_Score, table1.Test_Score))
from 
  table1
  left outer join table2
    on table1.id = table2.Student_ID
    AND table2.Test_Date = (
      select max(x.Test_Date)
      from table2 x
      where table1.id = x.Student_ID
      group by x.Student_ID)
group by table1.id, table1.Name
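
A possible alternative for the duplicate-date case, sketched here under assumptions rather than tested: on SQL Server 2005 or later, row_number() can pick exactly one row per student. The table and column names are the ones from the question; the tie-breaker on Test_Id assumes that a higher Test_Id means a later test, which may not hold for your data.

```sql
-- Sketch only: number each student's tests from newest to oldest,
-- then keep the row numbered 1.
-- Assumption: Test_Id breaks ties between tests on the same Test_Date.
select s.Name, isnull(r.Test_Score, s.Test_Score) as Score
from table1 s
  left outer join (
    select Student_ID, Test_Score,
      row_number() over (
        partition by Student_ID
        order by Test_Date desc, Test_Id desc) as rn
    from table2
  ) r
    on r.Student_ID = s.ID
    and r.rn = 1
```

Unlike the min() version, this returns the score of one specific test row instead of the lowest score among the tied rows.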
Stefan Steinegger
Thanks, this is looking about what I was looking for. Now I have to figure out how to get the name out of the first column in there too.
Ryan Smith
this was simple :-)
Stefan Steinegger
+2  A: 

Stefan Steinegger basically has the right answer.

Make it easier on yourself with functional decomposition: write a view that gives, for each student, the row with the most recent date.

Then outer join that to the student table (table 1 in your question), and take the test score in table 1 where there is no row in the view, using isnull or coalesce.
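
A minimal sketch of that decomposition, using the table and column names from the question. The view name latest_test is made up for illustration, and the code is untested:

```sql
-- Hypothetical view: for each student, the test row(s) with the latest date.
create view latest_test as
select t.Student_ID, t.Test_Score, t.Test_Date
from table2 t
where t.Test_Date = (
  select max(x.Test_Date)
  from table2 x
  where x.Student_ID = t.Student_ID)
go

-- Students with no row in the view fall back to the score in table1.
select s.ID, s.Name,
  coalesce(v.Test_Score, s.Test_Score) as Score
from table1 s
  left outer join latest_test v
    on v.Student_ID = s.ID
```

If two tests for the same student share the latest Test_Date, the view returns both rows, so the final query would still repeat that student.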

tpdi
+1  A: 

If you are using SQL Server 2005 or later, Common Table Expressions (CTEs) provide an elegant solution. You create a CTE with the most recent test score for each student and then left join the student table to it. Where a result exists in the test table, it is used; otherwise the score from the student table is used.

I've assumed that your tables are called Student and TestResult respectively, and that test_id is an auto-incrementing ID:

WITH RecentResults as (
  SELECT student_id,
     test_score
  FROM TestResult tr
  WHERE tr.test_id = (SELECT MAX(test_id) FROM TestResult WHERE student_id = tr.student_id)
)
SELECT s.ID as 'Student ID',
  isnull(rr.test_score, s.test_score)
FROM Student s LEFT JOIN RecentResults rr ON s.ID = rr.student_id

I'm unable to test the code on this machine. If you provide the DB schema, it will be easier to refine.

Macros
this assumes that max test_id is the latest test ... which may or may not be the case
Sam Saffron
Agreed - I'm assuming that test_id is an auto inc ID and there is no prior import so the latest would be the highest id
Macros
+1  A: 

Here you go: a solution tested on SQL Server 2005. The view can be avoided, but I agree that it improves clarity.

create table students (id int primary key, name nvarchar(50), score int) 
create table scores (test_id int, student_id int, score int, date datetime)

insert students values (1, 'bob', 1)
insert students values (2,'bill', 55)
insert scores values (22,1,88,getdate())
insert scores values (23,1,88,getdate() + 1 )
insert scores values (24,1,89,getdate() + 2 )

go

create view latest_scores
as
select scores.student_id, scores.date, score
from scores
join
(
  select student_id, date = max(date) from scores
  group by student_id 
) maxDates on maxDates.student_id = scores.student_id and maxDates.date = scores.date

go 

select s.id, isnull(l.score, s.score) as score
from students s
left join latest_scores l on l.student_id = s.id
Sam Saffron
This assumes that date will have a time element or that students will not take more than 1 test in a day which may not be the case ;)
Macros
True, but it can be worked around if that's not the case.
Sam Saffron
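
One way such a workaround might look, as an untested sketch against the students and scores tables defined above: outer apply with top 1 (available from SQL Server 2005) returns at most one score per student even when dates tie. The tie-breaker on test_id and the index name are assumptions added for illustration.

```sql
-- Sketch only: pick the single most recent score per student.
-- Assumption: among tests with the same date, the higher test_id is the later one.
select s.id, isnull(l.score, s.score) as score
from students s
outer apply (
  select top 1 sc.score
  from scores sc
  where sc.student_id = s.id
  order by sc.date desc, sc.test_id desc
) l

-- Hypothetical index that may help the per-student lookup on large tables;
-- check the actual execution plan before relying on it.
create index ix_scores_student_date
  on scores (student_id, date desc, test_id desc)
```

Students with no rows in scores still appear, because outer apply keeps the left-hand row and isnull falls back to the score in students.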