views:

144

answers:

2

I'm working with a Sybase 12.5 server and I have a table defined as such:

CREATE TABLE SomeTable(
    [GroupID] [int] NOT NULL,
    [DateStamp] [datetime] NOT NULL,
    [SomeName] varchar(100),
    PRIMARY KEY CLUSTERED (GroupID,DateStamp)
)

I want to be able to list, per [GroupID], only the latest X records by [DateStamp]. The kicker is X > 1, so plain old MAX() won't cut it. I'm assuming there's a wonderfully nasty way to do this with cursors and what-not, but I'm wondering if there is a simpler way without that stuff.

I know I'm missing something blatantly obvious and I'm gonna kick myself for not getting it, but .... I'm not getting it. Please help.

Is there a way to find TOP X records, but with grouped data?

+1  A: 

Here's quite an unscalable way!

SELECT GroupID, DateStamp, SomeName
FROM SomeTable ST1
WHERE X < 
    (SELECT COUNT(*) 
     FROM SomeTable ST2 
     WHERE ST1.GroupID=ST2.GroupID AND ST2.DateStamp > ST1.DateStamp)

Edit Bill's solution is vastly preferable though.

Martin Smith
+3  A: 

According to the online manual, Sybase 12.5 supports WINDOW functions and ROW_NUMBER(), though their syntax differs from standard SQL slightly.

Try something like this:

SELECT SP.*
FROM (
    SELECT *, ROW_NUMBER() OVER (windowA ORDER BY [DateStamp] DESC) AS RowNum
    FROM SomeTable
    WINDOW windowA AS (PARTITION BY [GroupID])
) AS SP
WHERE SP.RowNum <= 3
ORDER BY RowNum DESC;

I don't have an instance of Sybase, so I haven't tested this. I'm just synthesizing this example from the doc.


I made a mistake. The doc I was looking at was Sybase SQL Anywhere 11. It seems that Sybase ASA does not support the WINDOW clause at all, even in the most recent version.

Here's another query that could accomplish the same thing. You can use a self-join to match each row of SomeTable to all rows with the same GroupID and a later DateStamp. If there are three or fewer later rows, then we've got one of the top three.

SELECT s1.[GroupID], s1.[Foo], s1.[Bar], s1.[Baz]
FROM SomeTable s1
LEFT OUTER JOIN SomeTable s2
  ON s1.[GroupID] = s2.[GroupID] AND s1.[DateStamp] < s2.[DateStamp]
GROUP BY s1.[GroupID], s1.[Foo], s1.[Bar], s1.[Baz]
HAVING COUNT(*) < 3
ORDER BY s1.[DateStamp] DESC;

Note that you must list the same columns in the SELECT list as you list in the GROUP BY clause. Basically, all columns from s1 that you want this query to return.

Bill Karwin
If this works, it seems best since its a single query.
Justin
Many thanks. I'll try this when I get into work Friday and see if it works.
Chris Simmons
This doesn't seem to be working: "Incorrect syntax near the keyword 'OVER'." I'll keep trying, though.
Chris Simmons
My apologies, I think I led you on a wild goose chase. I've added a different solution above.
Bill Karwin
Ugh, not your doing, but the table I need to query is relatively large (~500K rows) and appears not to have the necessary indexes. Queries simply timeout. However, what you've posted makes sense. I think this is what I was looking for. Unfortunately, I won't be able to use it. Thanks so much for your answer!
Chris Simmons
Yes, without indexes appropriate to the queries you need to run, a database of any nontrivial size is practically unusable. You should use `set showplan on` to guide you to create the best indexes. See chapter 36 of *ASE Performance and Tuning Guide*.
Bill Karwin