Hi

I have a large dataset with repeated assessments for each subject. How do I go from:

subj, assessment, test1, test2
A, 1, 10, 20
A, 2, 12, 13
A, 3, 11, 12
B, 1, 14, 14
B, 2, 13, 12

To:

subj, test1_1, test1_2, test1_3
A, 10, 12, 11
B, 14, 13

Thanks,

Jon

+1  A: 

You can do this easily with the reshape (or reshape2) package by Hadley Wickham. Here is the code (in reshape2, use dcast in place of cast):

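library(reshape)
# Stack test1 and test2 into long form: one row per subject, assessment, and test
df = melt(df, id.vars = c('subj', 'assessment'))
# Spread back to wide form: one column per test/assessment pair
df = cast(df, subj ~ variable + assessment)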

Let me know if this works for you.

Ramnath
I believe that should be "variable" instead of "test" in the cast formula, like so: df = cast(df, subj ~ variable + assessment). (Unless you add variable_name = "test" to the previous line).
Fojtasek
That is right. Thanks for pointing it out. I have modified the code accordingly.
Ramnath
+1  A: 

The reshape function (in stats) does this fairly easily:

reshape(data, timevar='assessment', idvar='subj', direction='wide')

Or to just get the results for test1:

reshape(subset(data, select=-test2), timevar='assessment', idvar='subj', direction='wide')
Charles
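
For reference, here is a minimal sketch of the first call run on the sample data from the question. The data frame name `data` and the result name `wide_all` are illustrative choices, not something the thread specifies. It shows that, with no v.names given, every non-id column is widened.

```r
# Sample data copied from the question
data <- data.frame(
  subj       = c("A", "A", "A", "B", "B"),
  assessment = c(1, 2, 3, 1, 2),
  test1      = c(10, 12, 11, 14, 13),
  test2      = c(20, 13, 12, 14, 12),
  stringsAsFactors = FALSE
)

# Widen every measured column: one column per variable/assessment pair
wide_all <- reshape(data, timevar = "assessment", idvar = "subj",
                    direction = "wide")

# Columns: subj, test1.1, test2.1, test1.2, test2.2, test1.3, test2.3
# Subject B has no third assessment, so test1.3 and test2.3 are NA for B
print(wide_all)
```

The default separator between the variable name and the assessment number is "."; pass sep = "_" to get names such as test1_1.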
A: 

Thanks! This is why open source is so great. The first method (reshape{stats}) worked best for me once I figured out how to use "v.names" and "varying" (otherwise you get a redundant all-to-all reshaping, which is bad in a dataset with 200 variables).

Thanks again

Jon

Jon Erik Ween
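
A minimal sketch of restricting the reshape to a single variable with v.names, using the sample data from the question. This is not necessarily the exact call Jon used: the `drop` argument and the name `wide_test1` are assumptions made for illustration, and with 200 variables you would likely drop or subset many columns rather than one.

```r
# Sample data copied from the question
data <- data.frame(
  subj       = c("A", "A", "A", "B", "B"),
  assessment = c(1, 2, 3, 1, 2),
  test1      = c(10, 12, 11, 14, 13),
  test2      = c(20, 13, 12, 14, 12),
  stringsAsFactors = FALSE
)

# Widen only test1; drop test2 so it is not carried along as a
# supposedly constant per-subject column
wide_test1 <- reshape(data,
                      v.names   = "test1",      # the only variable to widen
                      timevar   = "assessment", # values become column suffixes
                      idvar     = "subj",       # one output row per subject
                      drop      = "test2",      # removed before reshaping
                      direction = "wide",
                      sep       = "_")          # gives test1_1, test1_2, ...

# Columns: subj, test1_1, test1_2, test1_3 (test1_3 is NA for subject B)
print(wide_test1)
```

Without `drop` (or a prior subset), columns not listed in v.names are treated as constant within each subject and only their first value is kept.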