views:

1562

answers:

4

I have an Oracle table with data that looks like this:

ID   BATCH   STATUS
1    1       0
2    1       0
3    1       1
4    2       0

That is, ID is the primary key, there will be multiple rows for each "batch," and each row will have a status code in the STATUS column. There are a bunch of other columns, but these are the important ones.

I need to write a query which will summarize the status codes for each batch; there are three possible values that can go in the STATUS column, 0, 1, and 2, and I would like output that looks something like this:

BATCH  STATUS0  STATUS1  STATUS2
1      2        1        0
2      1        0        0

Those numbers would be counts; for batch 1, there are

  • 2 records where STATUS is set to 0
  • 1 record where STATUS is set to 1, and
  • no records where STATUS is set to 0.

For batch 2, there is

  • 1 record where STATUS is set to 0, and
  • no records where STATUS is set to 1 or 2.

Is there a way that I can do this in one query, without having to rewrite the query for each status code? i.e. I can easily write a query like this, and run it three times:

SELECT batch, COUNT(status)
FROM table
WHERE status = 0
GROUP BY batch

I could run that, then run it again where status = 1, and again where status = 2, but I'm hoping to do it in one query.

If it makes a difference, aside from the STATUS column there is another column that I might want to summarize the same way--another reason that I don't want to have to execute SELECT statement after SELECT statement and amalgamate all of the results.

+1  A: 
select batch,
sum(select case when status = 0 then 1 else 0 end) status0,
sum(select case when status = 1 then 1 else 0 end) status1,
sum(select case when status = 2 then 1 else 0 end) status2
from table
group by batch
tekBlues
+3  A: 
select batch 
,      count(case when status=1 then 1 end) status1
,      count(case when status=2 then 1 end) status2
,      count(case when status=3 then 1 end) status3
from   table
group by batch;

This is often called a "pivot" query, and I have written an article about how to generate these queries dynamically on my blog.

Version using DECODE (Oracle-specific but less verbose):

select batch 
,      count(decode(status,1,1)) status1
,      count(decode(status,2,1)) status2
,      count(decode(status,3,1)) status3
from   table
group by batch;
Tony Andrews
+1  A: 
select batch,
sum((decode(status,0,1,0)) status0,
sum((decode(status,1,1,0)) status1,
sum((decode(status,2,1,0)) status2,
from table
group by batch
Diederik Hoogenboom
A benefit of the COUNT is that it never returns NULL so your DECODE can be even more compact: count((decode(status,0,1)) status0
Tony Andrews
A: 

OP asks if there's any performance benefit of one approach (SUM) over the other (COUNT). Running a simpleminded test on a table with 26K rows shows that the COUNT approach is significantly faster. YMMV.

DECLARE
  CURSOR B IS
     select batch_id
       FROM batch
      WHERE ROWNUM < 2000;

  v_t1  NUMBER;
  v_t2  NUMBER;
  v_c1  NUMBER;
  v_c2  NUMBER;
  v_opn INTEGER;
  v_cls INTEGER;
  v_btc VARCHAR2(100);
BEGIN
-- Loop using SUM
  v_t1 := dbms_utility.get_time;
  v_c1 := dbms_utility.get_cpu_time;
  FOR R IN B LOOP
     FOR R2 IN (SELECT batch_type_code
                     , SUM(decode(batch_status_code, 'CLOSED', 1, 0)) closed
                     , SUM(decode(batch_status_code, 'OPEN', 1, 0)) OPEN
                     , SUM(decode(batch_status_code, 'REWORK', 1, 0)) rework
                  FROM batch
                 GROUP BY batch_type_code) LOOP 
        v_opn := R2.open;
        v_cls := R2.closed;
     END LOOP;
  END LOOP;
  v_t2 := dbms_utility.get_time;
  v_c2 := dbms_utility.get_cpu_time;
  dbms_output.put_line('For loop using SUM:');
  dbms_output.put_line('CPU seconds used: '||(v_c2 - v_c1)/100);
  dbms_output.put_line('Elapsed time: '||(v_t2 - v_t1)/100);

-- Loop using COUNT
  v_t1 := dbms_utility.get_time;
  v_c1 := dbms_utility.get_cpu_time;
  FOR R IN B LOOP
     FOR R2 IN (SELECT batch_type_code
                     , COUNT(CASE WHEN batch_status_code = 'CLOSED' THEN 1 END) closed
                     , COUNT(CASE WHEN batch_status_code = 'OPEN' THEN 1 END) OPEN
                     , COUNT(CASE WHEN batch_status_code = 'REWORK' THEN 1 END) rework
                  FROM batch
                 GROUP BY batch_type_code) LOOP 
        v_opn := R2.open;
        v_cls := R2.closed;
     END LOOP;
  END LOOP;
  v_t2 := dbms_utility.get_time;
  v_c2 := dbms_utility.get_cpu_time;
  dbms_output.put_line('For loop using COUNT:');
  dbms_output.put_line('CPU seconds used: '||(v_c2 - v_c1)/100);
  dbms_output.put_line('Elapsed time: '||(v_t2 - v_t1)/100);
END;
/

This yielded the following output:

For loop using SUM:
CPU seconds used: 40
Elapsed time: 40.09
For loop using COUNT:
CPU seconds used: 33.26
Elapsed time: 33.34

I repeated the test a couple of times to eliminate any effects of caching. I also swapped the select statements. Results were similar across the board.

EDIT: this is the same test harness I used to answer a similar question with.

DCookie