tags:

views:

117

answers:

3

Here is my Input:

ID  Color
1   green
1   red
1   orange
1   green
1   red
2   red
2   red
2   blue
3   green
3   red

Here is what I want in my output - a count of records by ID for each color:

ID  green  red  orange blue
1   2      2    1      0
2   0      2    0      1
3   1      1    0      0

I know I can get the information using proc freq, but I want to output a dataset exactly like the one I have written above. I can't seem to figure out how to make the colors the columns in this output dataset.

+3  A: 

first, generate the data.

data data;
    format ID 8. Color $8.;
    input id color;
datalines;
1   green
1   red
1   orange
1   green
1   red
2   red
2   red
2   blue
3   green
3   red
run;

next, summarize color counts by id.

proc freq data=data noprint;
    table id*color / out=freq;
run;

make the table flat.

proc transpose data=freq out=freq_trans(drop=_:);
    id color;
    by id;
    var count;
run;

optionally, fill in missing cells with 0.

data freq_trans_filled;
    set freq_trans;
    array c(*) green red orange blue;
    do i = 1 to dim(c);
        if c[i]=. then c[i]=0;
    end;
    drop i;
run;
rkoopmann
+1  A: 

You can fill the missing cells with zero's using the SPARSE option to the PROC FREQ's TABLE statement. This way, you don't need another DATA step. The order of the colors can also be controlled by the ORDER= option to PROC FREQ.

data one;
  input id color :$8.;
datalines;
1   green
1   red
1   orange
1   green
1   red
2   red
2   red
2   blue
3   green
3   red
run;
proc freq data=one noprint order=data;
  table id*color /out=freq sparse;
run;
proc transpose data=freq out=two(drop=_:);
    id color;
    by id;
    var count;
run;
proc print data=two noobs;
run;
/* on lst
id    green    red    orange    blue
 1      2       2        1        0
 2      0       2        0        1
 3      1       1        0        0
*/
Chang Chung
+1  A: 

I've never been a fan of proc transpose because I can never remember the syntax. Here's a way to do it with proc sql and a macro variable.

proc sql noprint;  
    select    sum(color = '" || trim(color) || "') as " || color into: color_list separated by ", "
    from      (select distinct color from one);  
    create table result as  
    select    id,  
              &color_list  
    from      one  
    group by  id;  
quit;



id            blue        green         orange            red  
1              0              2              1              2  
2              1              0              0              2  
3              0              1              0              1  
jgunnink