views:

80

answers:

2

Given this data on SQL Server 2005:

SectionID Name  
1         Dan  
2         Dan  
4         Dan  
5         Dan  
2         Tom  
7         Tom  
9         Tom  
10        Tom  

How would I select records where the sectionID must be +-2 or more from another section for the same name.

The result would be:

1 Dan  
4 Dan  
2 Tom  
7 Tom  
9 Tom

Thanks for reading!

+3  A: 
SELECT *
FROM mytable a
WHERE NOT EXISTS
  (SELECT *
  FROM mytable b
  WHERE a.Name = b.Name
  AND a.SectionID = b.SectionID + 1)
Anthony Faull
+1 Straight-forward and correct solution. A composite index on `(SectionID, Name)` will be greatly beneficial to this query.
Tomalak
so if you had data:`1 Tom2 Tom3 Tom4 Tom`you would get zero records back. Which I suspect is not correct.
ninesided
@ninesided, not really - the query returns all the records from table a where for the sectionid there is no record in the same table with sectionid that is less by one (so in your counter example 1 Tom would be returned).
Unreason
fair point, but it's still wrong
ninesided
@ninesided - how so? can you provide a case when it does not work or works contrary to what was requested?
Unreason
my understanding of the question as stated was that it should return all SectionID's that are +/-2 of other records with the same Name. That would imply that for my example the correct result would be "1 Tom", "3 Tom" (or maybe "2 Tom", "4 Tom" depending on the ordering).
ninesided
Anthony's solution works! I was overthinking this problem. I tried to break it too with different sorts and it's still solid.SELECT * FROM (select top 100 percent * from #tmp order by NewID()) a WHERE NOT EXISTS (SELECT * FROM (select top 100 percent * from #tmp order by NewID()) b WHERE a.Name = b.Name AND a.SectionID = b.SectionID + 1)
Daniel
@Daniel, I don't understand how this solves your problem. Given "1 Tom", "2 Tom", "3 Tom", Anthony's solution will give a result of "1 Tom" whereas your question indicates the correct solution would be "1 Tom", "3 Tom". Can you clarify this for me please?
ninesided
@ninesided - yes that is correct, but as long as it's +-2 or more, it won't break my application. this is more of a very strange bit of business logic than it is an sql riddle. i should have given more detail at first.does your partition solution solve this for sequential problem, thereby including the maximum number of records? i couldn't get it to work, i'll look at it again. thanks for your good thoughts.
Daniel
@daniel, partition is mine and sorry if it does not work, i tested it on postgres; it does not return strict maximum, but it returns more records then the above (i could check for a maximum if you are interested, it might be possible with straight sql); also since it was not mentioned earlier - this kind of task might be better served at client side or as stored procedure, but i guess you know that.
Unreason
A: 

Here's LEFT JOIN variant of Anthony's answer (removes consecutive id's from the results)

SELECT a.*
FROM mytable a 
     LEFT JOIN mytable b ON a.Name = b.Name AND a.SectionID = b.SectionID + 1
WHERE b.SectionID IS NULL

EDIT: Since there is another interpretation of the question (simply getting results where id's are more than 1 number apart) here is another attempt at an answer:

WITH alternate AS (
SELECT sectionid,
       name,
       EXISTS(SELECT a.sectionid 
              FROM mytable b 
              WHERE a.name = b.name AND 
                    (a.sectionid = b.sectionid-1 or a.sectionid = b.sectionid+1)) as has_neighbour,
       row_number() OVER (PARTITION by a.name ORDER BY a.name, a.sectionid) as row_no
FROM mytable a
)
SELECT sectionid, name 
FROM alternate
WHERE row_no % 2 = 1 OR NOT(has_neighbour) 
ORDER BY name, sectionid;

gives:

 sectionid | name 
-----------+------
         1 | Dan
         4 | Dan
         2 | Tom
         7 | Tom
         9 | Tom

Logic: if a record has neighbors with same name and id+/-1 then every odd row is taken, if it has no such neighbors then it gets the row regardless if it is even or odd.

As stated in the comment the condition is ambiguous - on start of each new sequence you might start with odd or even rows and the criteria will still be satisfied with different results (even with different number of results).

Unreason
Different, but still has the same issue that @ninesided mentioned.
PaulG
Bad idea since SQL server will not be able to convert this to an antijoin.
erikkallen
@PaulG, can you explain the issue; I asked for more details on ninesided comment under Anthony's answer but none were provided (as far as i can see both queries return the requested data)@erikkallen, I agree it would not, but the performance should be tested, the NOT EXISTS of correlated subquery performance is not obviously superior to LEFT JOIN performance, even with such high selectivity.
Unreason
My understanding of the original question was that any row with an Id +/-2 from another Id with the same name should be returned. Therefore if your data was 1-Tom, 2-Tom, 3-Tom you should expect 1-Tom and 3-Tom to be returned (given that 3 is +/- 2 away from 1)
PaulG
@PaulG, ah I see your logic. mine went into "+-2 or more from another **section**"; where I presume section to be non-interrupted sequence. Better go back to OP to clarify.
Unreason
Yes you are right PaulG ID +-2 inclusive.
Daniel
This solution works also. Thanks!
Daniel