tags:

views:

330

answers:

5

Hey,

I need to make a rather complex query, and I need help bad. Below is an example I made.

Basically, I need a query that will return one row for each case_id where the type is support, status start, and date meaning the very first one created (so that in the example below, only the 2/1/2009 John's case gets returned, not the 3/1/2009). The search needs to be dynamic to the point of being able to return all similar rows with different case_id's etc from a table with thousands of rows.

There's more after that but I don't know all the details yet, and I think I can figure it out if you guys (an gals) can help me out here. :)

ID   |  Case_ID   |   Name   |   Date   |  Status  |  Type   

48   |    450     |   John   | 6/1/2009 |  Fixed   | Support   
47   |    450     |   John   | 4/1/2009 |  Moved   | Support   
46   |    451     |   Sarah  | 3/1/2009 |          |           
45   |    432     |   John   | 3/1/2009 |  Fixed   | Critical  
44   |    450     |   John   | 3/1/2009 |  Start   | Support 
42   |    450     |   John   | 2/1/2009 |  Start   | Support   
41   |    440     |   Ben    | 2/1/2009 |          |           
40   |    432     |   John   | 1/1/2009 |  Start   | Critical  
...

Thanks a bunch!

Edit:

To answer some people's questions, I'm using SQL Server 2005. And the date is just plain date, not string.

Ok so now I got further in the problem. I ended up with Bliek's solution which worked like a charm. But now I ran into the problem that sometimes the status never starts, as it's solved immediately. I need to include this in as well. But only for a certain time period.

I imagine I'm going to have to check for the case table referenced by FK Case_ID here. So I'd need a way to check for each Case_ID created in the CaseTable within the past month, and then run a search for these in the same table and same manner as posted above, returning only the first result as before. How can I use the other table like that?

As usual I'll try to find the answer myself while waiting, thanks again!

Edit 2:

Seems this is the answer. I don't have access to the full DB yet so I can't fully test it, but it seems to be working with the dummy tables I created, to continue from Bliek's code's WHERE clause:

WHERE RowNumber = 1 AND Case_ID IN (SELECT Case_ID FROM CaseTable
      WHERE (Date BETWEEN '2007/11/1' AND '2007/11/30'))

The date's screwed again but you get the idea I'm sure. Thanks for the help everyone! I'll get back if there're more problems, but I think with this info I can improvise my way through most of the SQL problems I currently have to deal with. :)

+4  A: 

Maybe something like:

select Case_ID, Name, MIN(date), Status, Type 
from table
where type = 'Support'
 and status = 'Start'
group by Case_ID, Name, Status, Type

EDIT: You haven't provided a lot of details about what you really want, so I'd suggest that you read all the answers and choose one that suits your problem best. So far I'd say that Tomalak's answer is closest to what you're looking for...

Mr. Brownstone
+1 for being fast typer... In 22 years with a computer, I would think that I could type like lightning, but no... Hunt and Peck, Hunt and Peck.
StingyJack
I think I it's not the typing; I probably just saw it first, it was asked 15 mins ago...
Mr. Brownstone
-1 for not actually selecting the row with the minimum date. This would break if the wrong name got entered since it would then have two rows for the same case number.
tvanfosson
Or even if two names on the one case was a valid case.
cletus
I wondered about the selection- you need to update this with a subquery to ensure that only the min(date) for each ID/Name is included in the query.
Dave Swersky
@Mr. Brownstone: This is not returning the earliest record fulfilling the given conditions. What if I was interested in the record ID that you conveniently leave out in your query?
Tomalak
@Tomalak: it just says 'return one row for each Case_ID'; it does not say anything about wanting the ID number. In fact, Case_ID and Date alone would satisfy the question.
Jonathan Leffler
@Jonathan Leffler: Hm... You may be right. I read it as "the one row that has the earliest date", because usually this is the desired result for this type of question. Especially since the OP said something about "a rather complex query", but a trivial group by isn't complex.
Tomalak
@Tomalak: yes, my answer assumes there can't be more than one Name for the same Case_ID. But the question doesn't state clearly what kind of columns the result should have and what are the constraints the data in the table might have.
Mr. Brownstone
+3  A: 
SELECT
  c.ID,
  c.Case_ID,
  c.Name,
  c.Date,
  c.Status,
  c.Type
FROM
  CaseTable c
WHERE
  c.Type = 'Support'
  AND c.Status = 'Start'
  AND c.Date = (
    SELECT MIN(Date) 
    FROM CaseTable
    WHERE Case_ID = c.Case_ID AND Type = c.Type AND Status = c.Status)
/* GROUP BY only needed when for a given Case_ID several rows 
   exist that fulfill the WHERE clause */
GROUP BY
  c.ID,
  c.Case_ID,
  c.Name,
  c.Date,
  c.Status,
  c.Type

This query benefits greatly from indexes on the Case_ID, Date, Status and Type columns.

Added value though the fact that the filter on Support and Status only needs to be set in one place.

As an alternative to the GROUP BY clause, you can do SELECT DISTINCT, which would increase readability (this may or may not affect overall performance, I suggest you measure both variants against each other). If you are sure that for no Case_ID in your table two rows exist that have the same Date, you won't need GROUP BY or SELECT DISTINCT at all.

Tomalak
A: 

Don't apologize for your date formatting, it makes more sense that way.

    SELECT ID, Case_ID, Name, MIN(Date), Status, Type
    FROM caseTable
    WHERE Type = 'Support'
       AND status = 'Start'
    GROUP BY ID, Case_ID, Name, Status, Type
StingyJack
Adding an "ID" column to a group by statement doesn't help if the ID column is unique
Tawani
@StingyJack: "it makes more sense that way"? Why?
Tomalak
@Tomalak - The ascending chronology of Euro dates: smallest unit to largest unit makes more sense to me than the US Style, but I still like Imperial measure better than metric. Go figure. @Tawani, Brownstone had a better answer, so I upvoted it and left this. Probably should have deleted it. Thanks
StingyJack
@StingyJack: I tend to look at date formats (in data processing, at least) more in a "Does it sort?" way. European dates don't, that was the reason for the my comment. (For day to day usage outside of the programming, I prefer the European format as well.)
Tomalak
+1  A: 

In SQL Server 2005 and beyond I would use Common Table Expressions (CTE). This offers lots of possibilities like so:

With ResultTable (RowNumber
                 ,ID
                 ,Case_ID
                 ,Name
                 ,Date
                 ,Status
                 ,Type)
AS
(
    SELECT Row_Number() OVER (PARTITION BY Case_ID
                                  ORDER BY Date ASC)

          ,ID
          ,Case_ID
          ,Name
          ,Date
          ,Status
          ,Type
      FROM CaseTable
     WHERE Type   = 'Support'
       AND Status = 'Start'
)
    SELECT ID
          ,Case_ID
          ,Name
          ,Date
          ,Status
          ,Type
      FROM ResultTable
     WHERE RowNumber = 1
Bliek
Works like a charm! I'm going to save this query up, I can definitely see a need for it in the future. Thanks a bunch!
A: 

Deleted

Please, can you move extra info like this directly to the question. The lower section is for actual answers only. ;-)
Tomalak
Ohm BTW: I think you can actually delete your own stuff, there should be a link for this. Sorry to be nagging. :-)
Tomalak
No need to apologize, I'm new to the place so I appreciate all the tips I can get. Trust me, I'm thankful, not offended. :) Haven't registered yet so maybe that's the reason I can't find a way to delete this.
Yeah, this may be it. I wasn't sure which conditions need to be met for a new user to actually see the "delete" link, but I thought everyone was in control of their own stuff. Welcome to SO, by the way. :)
Tomalak