ansaurus

Question

Answer 1

A:

I think this may do it--don't have data set up to test the query with. Check the comments for rationale.

WITH Source AS ( 
    SELECT key1, key2, datafield, date1, 
        ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder 
        FROM table 
) 
SELECT L.key1, L.key2, L.datafield, DC.datafield2 
FROM Source AS L 
LEFT JOIN AllowedDataCombinationsTable DC 
    ON DC.datafield1 = L.datafield   --  DC Alias
LEFT JOIN Source AS R 
    ON R.Key1 = L.Key1 
    AND R.Key2 = L.Key2 
    AND DC.datafield2 = R.datafield   --  Changed alias from L to R
    AND R.dateorder = 1               --  Pick out lowest one
    AND R.dateorder < L.dateorder     --  Make sure it's not the same one

Philip Kelley 2010-10-13 18:04:07

The L was a typo indeed. But this doesn't do it because R.dateorder = 1 doesn't necessarily exist, it might be filtered out by the criterium DC.datafield2 = R.datafield.

littlegreen 2010-10-13 18:18:53

True. Then the fix to that is to move that filtering criteria (along with requisite table joins) up into the WITH clause, so that only valid "candidate rows" are selected and ordered, and filter with RowNumber = 1 in the SELECT clause. Very hard to code right without data to test against, so I'm leaving this as a comment.

Philip Kelley 2010-10-13 19:24:45

Everybody is asking for data, but what would you suggest as a handy medium to transfer my data?

littlegreen 2010-10-13 20:27:09

CREATE TABLE statements (dummy table and column names are fine), and INSERT statements to populate just enough data to provide a valid test. (What you have listed looks fine.)

Philip Kelley 2010-10-13 22:44:18

Thanks, I'll keep it in mind for next time. When I have time I'll polish up this one. As I said at the other answer I solved my problem now using a nesting around my original code and choosing row_number = 1 from that, which is similar to what you are suggesting. It works (300000 rows in 9 seconds on our server) but I'm still thinking there might be an even faster way. Can't think of it though.

littlegreen 2010-10-14 21:28:48

Answer 2

A:

Well, I don't use WITH or OVER, so this is a different approach.. I might be over-simplifying something, but without having the data in front of me this is what I came up with:

SELECT distinct a.Key1, a.Key2, a.Datafield, 
       ISNULL(b.Datafield,'') as Datafield1, 
       ISNULL(b.Date,a.Date) as `Date`, 
       MIN(a.DateOrder) as DateOrder
FROM Source a
LEFT JOIN Source b 
     ON a.Key1 = b.Key1
     AND a.Key2 = b.Key2
     AND a.Dateorder <> b.Dateorder
LEFT JOIN AllowedDataCombinationsTable c
     ON a.Datafield = c.Datafield
     AND b.Datafield = c.Datafield1
GROUP BY a.Key1, a.Key2, a.Datafield, ISNULL(b.Datafield,''), ISNULL(b.Date,a.Date)

Fosco 2010-10-13 18:48:57

Hi, thanks. This is nice but I think it won't work in my case either, since I have a bunch of other datafields that I need to join as well. I'll test it and get back to you. Meanwhile I solved it using an extra nesting with row_number() around my original code in order to select the highest-ranked joined row per group, which gets the job done, but looks inefficient to me.

littlegreen 2010-10-14 21:25:31

ansaurus

tags:

views:

answers:

Self join to lowest occurrence of group

related questions