I've been working mostly in SAS of late, but I don't want to lose what familiarity with R I have, so I'd like to replicate something basic I've done. You'll have to forgive me if my SAS code isn't perfect; I'm doing this from memory since I don't have SAS at home.
In SAS I have a dataset that looks roughly like the following example (. is the SAS equivalent of NA):
A B
1 1
1 3
0 .
0 1
1 0
0 0
If the dataset above were work.foo, then I could do something like the following:
/* create work.bar from dataset work.foo */
data work.bar;
set work.foo;
/* generate a third variable and add it to work.bar */
if a = 0 and b ge 1 then c = 1;
if a = 0 and b = 0 then c = 2;
if a = 1 and b ge 1 then c = 3;
if a = 1 and b = 0 then c = 4;
run;
and I'd get something like:
A B C
1 1 3
1 3 3
0 . .
0 1 1
1 0 4
0 0 2
I could then proc sort by C and perform various operations on the 4 subgroups that C defines. For example, I could get the means of each group with:
proc means noprint data =work.bar;
by c;
var a b;
output out = work.means mean(a b) = a b;
run;
and I'd get a dataset of per-group means called work.means, something like:
C A B
1 0 1
2 0 0
3 1 2
4 1 0
I think I may also get a row for C = ., but I don't care about that for my purposes.
Now, in R, I have the same dataset, which has been read in properly, but I have no idea how to add a variable to the end (like C) or how to do an operation on a subgroup (like the by c statement in proc means). Also, I should note that my real variables aren't named in any sort of order, but according to what they represent.
I figure if somebody can show me how to do the above, I can generalize it to what I need to do.
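For anyone who wants to reproduce this, the example data could be built in R roughly as follows. This is only a sketch of the toy data shown earlier: the data frame name foo is my own choice to mirror work.foo, and NA stands in for the SAS missing value.

```r
# Toy data from the example above; NA plays the role of SAS's "." missing value.
# The name `foo` is an assumption, chosen only to mirror work.foo.
foo <- data.frame(
  A = c(1, 1, 0, 0, 1, 0),
  B = c(1, 3, NA, 1, 0, 0)
)

# Print the data frame to check that it matches the table above.
print(foo)
```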