ansaurus

Question

Answer 1

A:

The first query does a SELECT with a WHERE to return only the rows that have a matching verdict and then counts those rows. The second query counts all rows every time.

You might try:

select 
  dbresultsid, TestCase, BuildID, Analyzed, Verdict, 
  Sum(case when Verdict='PASS' then 1 else 0 end) as PASS,
  Sum(case when Verdict='FAIL' then 1 else 0 end) as FAIL,
  Sum(case when Verdict='INCONC' then 1 else 0 end) as INCONC,
  Sum(case when Verdict='TIMEOUT' then 1 else 0 end) as TIMEOUT
from 
  results
where 
  BuildID = 'Beta1'
group by 
  TestCase, dbresultsid 
order by 
  Analyzed

Test data:

  CREATE TABLE #Test (BuildID Integer, Verdict char(7))
  INSERT INTO #TEST (Buildid, Verdict)
    VALUES (1, 'PASS')
  INSERT INTO #TEST (Buildid, Verdict)
    VALUES (1, 'PASS')
  INSERT INTO #TEST (BuildID, Verdict)
    VALUES (2, 'FAIL')
  INSERT INTO #TEST (BuildID, Verdict)
    VALUES (3, 'INCONC')
  INSERT INTO #TEST (BuildID, Verdict)
    VALUES(4, 'TIMEOUT')

Query:

select buildid,
  sum(case verdict when 'PASS' then 1 else 0 end) as Pass,
  sum(case verdict when 'FAIL' then 1 else 0 end) as Fail,
  sum(case verdict when 'INCONC' then 1 else 0 end) as Inconc,
  sum(case verdict when 'TIMEOUT' then 1 else 0 end) as TimeOut
FROM #temp
group by buildid

Output:

Item    buildid PASS Fail Inconc TimeOut 
1         1      2       0        0        0
2         2      0       1        0        0
3         3      0       0        1        0
4         4      0       0        0        1

Ken White 2009-11-27 15:10:56

but I want total PASS in every row not just 1

JPro 2009-11-27 15:15:42

can you tell me which one is efficient? and how to see the execution plan in MSSQL?

JPro 2009-11-27 15:18:08

I got the way to see the execution plan. But I am unable to interpret it at the moment

JPro 2009-11-27 15:22:03

Undid it - didn't notice grouping. Sorry

CodeByMoonlight 2009-11-27 15:22:50

@JPro: You get PASS every row. There can only be one Verdict per row, so you're counting it into the appropriate column (PASS, FAIL, INCONC, TIMEOUT). Each row that has a verdict of "PASS" will add a value of 1, and the SUM() (missing from my original post) will add them up. Try it.

Ken White 2009-11-27 15:25:19

@JPro: As far as most efficient, the one with the CASE will be. The first one makes four full selects against the data to return four separate result sets and count them, and then a fifth select to put them all together. The one with the CASE makes exactly one pass through the data.

Ken White 2009-11-27 15:27:06

eventhough I SUM it it displays 1 only and not the total count

JPro 2009-11-27 15:33:48

See my edit for a sample query and output.

Ken White 2009-11-27 15:34:17

It displays 1 because you are grouping by a key column.

Charles Bretana 2009-11-27 15:40:31

Answer 2

A:

Why don't you just use the first one?, In my opinion is a nice way to get your results if the keys for reference and group by are well defined. I only will change the "where" in the subselects for not repeating the key each time, making a direct reference to the main table instead.

(select count .. from results where BuildID = r1.BuildID and Verdict  ..)
from results r1

j.a.estevan 2009-11-27 15:18:52

someone suggested that query 1 is horrible way to do it

JPro 2009-11-27 15:20:31

query 1 is a horrible way to do it. Both of the other examples are correct and faster.

Hogan 2009-11-30 22:26:28

Answer 3

A:

First of all, since you mention in comment that dbresultsid is a key column, Including it in the group by clause is ineffective, you will get one output row for every row in original table (that matches your where clause).

Secondly, because the subqueries in first query are un-correlated, their output is not dependant on the row from the outer query. Therefore they will only be executed once, and the same generated value will be repeated in every outout row.

So, If the output of the first query is really what you want (where there is one row per original 'Beta1' row in results table and every row in output has the same values in the last 4 columns) then what you have is pretty close to best you can do. Just take out the group by clause - you don't need it.

Select 
  dbresultsid, TestCase, BuildID, Analyzed, Verdict,  
  z.PASS, z.FAIL, z.INCONC, z.TIMEOUT
From results r Cross Join 
  (Select 
     Sum(case when Verdict='PASS' then 1 else 0 end) PASS,
     Sum(case when Verdict='FAIL' then 1 else 0 end) FAIL,
     Sum(case when Verdict='INCONC' then 1 else 0 end) INCONC,
     Sum(case when Verdict='TIMEOUT' then 1 else 0 end) TIMEOUT
   From results Where BuildID = 'Beta1') Z
Where BuildID = 'Beta1'
Order By Analyzed

Charles Bretana 2009-11-27 15:19:32

but this query will display 1 in whereever it finds PASS, but I want total PASS count.

JPro 2009-11-27 15:21:26

No, it returns the SUM

CodeByMoonlight 2009-11-27 15:23:50

Query should be displaying the count of Pass rows with 'Beta1', and the same value for dbResultsId, Testcase, and Analyzed... unless, let me guess! Is dbResultsId a Key value ???

Charles Bretana 2009-11-27 15:27:09

HI, but adding '(Select Count(*) From Results Where BuildID = 'Beta1' And Verdict='PASS') TotalPass' is similar to Query 1, which is what I want to void scanning the table many times.

JPro 2009-11-27 15:27:12

You won't be, this is an UN-correlated sub-query, it will only be executed once.

Charles Bretana 2009-11-27 15:28:18

yes dbresultsid is key value, but I want this result:PASS FAIL INCONC20 2 4and not PASS FAIL INCONC1 0 00 1 01 0 0

JPro 2009-11-27 15:29:54

Which brings me to another point... All the oputput in your first query is from uncorrelated subqueries. They have no connection to the results table in the outer query. SO there is actually no need to even have it. except to get the values of the firts 5 columns (dbresultsid, TestCase, BuildID, Analyzed, Verdict) and the group By is doing nothing if dbresultsId is a key field...

Charles Bretana 2009-11-27 15:31:39

Every row in the first query will have the same value in all the output fields from the uncorrelated sub queries. i.e., it will not be dependant on the data in that individual row. Why do you want this data repeated for every row in the output?

Charles Bretana 2009-11-27 15:34:20

so that I can take that value just once and display it to user as total PASS. I need not count how many 1's are there for each PASS

JPro 2009-11-27 15:36:42

If the output of the first query is really what you want (where there is one row per original 'Beta1' row in results table and every row in output has the same values in the last 4 columns) then what you have is pretty close to best you can do. Just take out the group by clause - you don't need it.

Charles Bretana 2009-11-27 15:45:41

if I take out the group by, MSSQL gives an error that table.field is invalid because it is not contained in either aggregate or Groupby clause for all the fields

JPro 2009-11-27 15:53:48

The group by in your first query?? It is not required. There is nop aggregate function in the outer query in your first query.. The inner subqueries son't need a group By. In any event, please try the newly edited query in my answer above.

Charles Bretana 2009-11-27 15:58:36

JPro: your comments show me that you don't quite appreciate how the SQL Server Optimiser works. Where you Think there are multiple scans, there are not. SQL Query != Execution Plan. The oprimiser is cleverer that you give it credit for... The uncorrelated sub query should work perfectly fine.

Dems 2009-11-27 16:31:29

ansaurus

tags:

views:

answers:

difference in these queries

related questions