I have two columns, senderUserID and recieverUserID, in the Messages table. I select senderUserID and recieverUserID where the current receiver sent a message to the current sender in the past. I select only 10 rows each time, but sometimes senderUserID appears more than once in the result. I need senderUserID to be unique, while recieverUserID can repeat as many times as it occurs.

This is the sample data (senderUserID, then recieverUserID):

66622   61350
90166   79222
90176   79222
86727   80452
10888   47305
66560   79219
66622   80452
89548   14452
66622   69177
52081   79223

As you can see, 66622 appears more than once in senderUserID. How do I limit it to appear only once?

thanks

+4  A: 

I'd say just ignore any duplicates. That will give you fewer than ten results for each of your batches, but I'd prefer that to enforcing uniqueness via grouping in the database (could be expensive). The objective here is to maximize message throughput, right?
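A minimal sketch of what "ignore any duplicates" could look like on the client side, assuming the batch arrives as (senderUserID, recieverUserID) tuples. The helper name `unique_senders` is hypothetical and the data is the sample from the question:

```python
def unique_senders(rows):
    """Keep the first row seen for each sender and drop later duplicates."""
    seen = set()
    kept = []
    for sender_id, receiver_id in rows:
        if sender_id in seen:
            continue  # this sender already has a row in the batch
        seen.add(sender_id)
        kept.append((sender_id, receiver_id))
    return kept


# Sample batch from the question: (senderUserID, recieverUserID)
batch = [
    (66622, 61350), (90166, 79222), (90176, 79222), (86727, 80452),
    (10888, 47305), (66560, 79219), (66622, 80452), (89548, 14452),
    (66622, 69177), (52081, 79223),
]

result = unique_senders(batch)
print(len(result))  # 8 rows remain out of the original 10
```

The batch shrinks because the repeated 66622 rows are dropped rather than replaced, which is the trade-off described above.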

If you still want to do it in SQL:

 select senderUserID, max(recieverUserID) from Messages group by senderUserID
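To try the grouping query against the sample data from the question, here is a small sketch using Python's built-in `sqlite3` module. The in-memory SQLite database is an assumption made only for illustration; the asker's actual database may behave differently in details such as performance.

```python
import sqlite3

# Sample rows from the question: (senderUserID, recieverUserID)
rows = [
    (66622, 61350), (90166, 79222), (90176, 79222), (86727, 80452),
    (10888, 47305), (66560, 79219), (66622, 80452), (89548, 14452),
    (66622, 69177), (52081, 79223),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Messages (senderUserID INTEGER, recieverUserID INTEGER)"
)
conn.executemany("INSERT INTO Messages VALUES (?, ?)", rows)

# One row per sender; MAX picks the highest receiver ID in each group.
grouped = conn.execute(
    "SELECT senderUserID, MAX(recieverUserID) "
    "FROM Messages GROUP BY senderUserID ORDER BY senderUserID"
).fetchall()

for sender_id, receiver_id in grouped:
    print(sender_id, receiver_id)
```

For sender 66622 the query keeps receiver 80452, the largest of its three receiver IDs, so each sender appears exactly once.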
Thilo
thanks, does what it should do...
eugeneK
your answer is better than mine :)
Alexander
Note that it is "unfair" against receivers with low ids...
Thilo
@Thilo, this query runs on request a few times a day, so I don't worry about I/O or memory usage that much.
eugeneK
@Thilo, i'm a racist against low IDs
eugeneK
Instead of max you can use the LIST function: select senderUserID, list(recieverUserID) from Messages group by 1
Andrei K.
+5  A: 
;WITH cte AS
(
SELECT senderUserID, 
       recieverUserID,
       ROW_NUMBER() OVER (PARTITION BY senderUserID ORDER BY recieverUserID) AS RN
FROM YourTable
)
SELECT senderUserID,recieverUserID FROM cte 
WHERE RN=1
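The same ROW_NUMBER idea can be tried against the question's sample data with Python's built-in `sqlite3` module. This is only a sketch under assumptions: it uses an in-memory SQLite database (window functions need SQLite 3.25 or later) and the table name Messages from the question in place of YourTable.

```python
import sqlite3

# Sample rows from the question: (senderUserID, recieverUserID)
rows = [
    (66622, 61350), (90166, 79222), (90176, 79222), (86727, 80452),
    (10888, 47305), (66560, 79219), (66622, 80452), (89548, 14452),
    (66622, 69177), (52081, 79223),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Messages (senderUserID INTEGER, recieverUserID INTEGER)"
)
conn.executemany("INSERT INTO Messages VALUES (?, ?)", rows)

# Number the rows within each sender, ordered by receiver, and keep row 1.
query = """
WITH cte AS (
    SELECT senderUserID,
           recieverUserID,
           ROW_NUMBER() OVER (
               PARTITION BY senderUserID ORDER BY recieverUserID
           ) AS RN
    FROM Messages
)
SELECT senderUserID, recieverUserID
FROM cte
WHERE RN = 1
ORDER BY senderUserID
"""
first_per_sender = conn.execute(query).fetchall()

for sender_id, receiver_id in first_per_sender:
    print(sender_id, receiver_id)
```

Because the window is ordered by recieverUserID ascending, sender 66622 keeps receiver 61350, the lowest of its three. Changing the ORDER BY inside OVER changes which row survives.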
Martin Smith
Seems to work. If eugeneK wants to select only 10 records, he should change the last lines to: SELECT TOP 10 senderUserID,recieverUserID FROM cte WHERE RN=1
devmake
Thanks @Martin Smith. Thilo's query runs faster on my dataset, even though your solution works just fine.
eugeneK