+2  Q: 

Rearrange data [R]

I haven't quite got my head around R and how to rearrange data. I have an old SPSS data file that needs rearranging so I can conduct an ANOVA in R.

My current data file has this format:

ONE <- matrix(c(1, 2, 777.75, 609.30, 700.50, 623.45, 701.50, 629.95, 820.06, 651.95,"nofear","nofear"), nr=2,dimnames=list(c("1", "2"), c("SUBJECT","AAYY", "BBYY", "AAZZ", "BBZZ", "XX")))

And I need to rearrange it to this:

TWO <- matrix(c(1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 777.75, 701.5, 700.5, 820.06, 609.3, 629.95, 623.45, 651.95), nr=8, dimnames=list(c("1", "1", "1", "1", "2", "2", "2", "2"), c("SUBJECT","AA", "ZZ", "XX", "RT")))

I am sure that there is an easy way of doing it, rather than hand coding. Thanks for the consideration.

+6  A: 

This should do it. You can tweak it a bit, but this is the idea:

library(reshape)
THREE <- melt(as.data.frame(ONE),id=c("SUBJECT","XX"))
THREE$AA <- grepl("AA",THREE$variable)
THREE$ZZ <- grepl("ZZ",THREE$variable)
THREE$variable <- NULL

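# cleanup
# ONE is a character matrix (it mixes numbers and strings),
# so convert the melted measurements back to numbers
THREE$RT <- as.numeric(as.character(THREE$value))
THREE$value <- NULL
THREE$XX <- as.factor(THREE$XX)
THREE$AA <- as.numeric(THREE$AA)
THREE$ZZ <- as.numeric(THREE$ZZ)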
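
If the reshape package is not available, roughly the same idea can be sketched in base R with stack(). This is only a sketch under assumptions: ONE is rebuilt here as a data frame (so the measurement columns stay numeric), and the names long and THREE_base are made up for illustration.

```r
# Assumption: ONE as a data frame rather than the character matrix from the question
ONE <- data.frame(SUBJECT = 1:2,
                  AAYY = c(777.75, 609.30),
                  BBYY = c(700.50, 623.45),
                  AAZZ = c(701.50, 629.95),
                  BBZZ = c(820.06, 651.95),
                  XX = c("nofear", "nofear"))

# stack() puts the four measurement columns into 'values'
# and the column each value came from into 'ind'
long <- stack(ONE[, c("AAYY", "BBYY", "AAZZ", "BBZZ")])

# Rows come out column by column, so the id columns are repeated once per column
THREE_base <- data.frame(SUBJECT = rep(ONE$SUBJECT, times = 4),
                         XX = factor(rep(ONE$XX, times = 4)),
                         AA = as.numeric(grepl("AA", long$ind)),
                         ZZ = as.numeric(grepl("ZZ", long$ind)),
                         RT = long$values)

# Optional: group the rows by subject
THREE_base <- THREE_base[order(THREE_base$SUBJECT), ]
```

The condition codes are still derived from the original column names with grepl, so the order of the rows does not matter for correctness.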
Joris Meys
This has helped me. Many thanks.
RSoul
+1 smart. Thanks for introducing grepl
Brandon Bertelsen
+2  A: 

The reshape package and base R's reshape() both help with this kind of thing, but in a simple case like this, where you have to generate the variables anyway, hand coding is pretty easy: just take advantage of automatic recycling in R.

TWO <- data.frame(SUBJECT = rep(1:2,each = 4),
                  AA = rep(1:0, each = 2),
                  ZZ = 0:1,
                  XX = 1,
                  # columns taken in the order AAYY, AAZZ, BBYY, BBZZ to match the AA/ZZ codes
                  RT = as.numeric(t(ONE[, c(2, 4, 3, 5)])))

That gives the TWO you asked for, but it doesn't generalize easily to a larger ONE. I think this makes more sense:

n <- nrow(ONE)
TWO <- data.frame(SUBJECT = rep(ONE$SUBJECT, 4),
                  AB = rep(1:0, each = n),
                  YZ = rep(0:1, each = 2*n),
                  fear = ONE$XX,
                  RT = unlist(ONE[,2:5]))

This latter version gives more descriptive variable names and handles the likely case that your data is actually much bigger, with XX (fear) varying and more subjects. Also, given that you're reading it in from an SPSS data file, ONE is actually a data frame with numeric columns and character columns stored as factors (which is why ONE$SUBJECT works here; it would not work on the matrix in the question). The reshaping itself was only this part of the code:

TWO <- data.frame(SUBJECT = rep(ONE$SUBJECT, 4),
                  fear = ONE$XX,
                  RT = unlist(ONE[,2:5]))

You could add in other variables afterward.
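
Adding the condition variables afterward might look like the sketch below. It assumes ONE is a data frame with the values from the question; the helper names measures and cond are made up for illustration.

```r
# Assumption: ONE as a data frame, as it would be after importing from SPSS
ONE <- data.frame(SUBJECT = 1:2,
                  AAYY = c(777.75, 609.30),
                  BBYY = c(700.50, 623.45),
                  AAZZ = c(701.50, 629.95),
                  BBZZ = c(820.06, 651.95),
                  XX = c("nofear", "nofear"))

n <- nrow(ONE)
measures <- c("AAYY", "BBYY", "AAZZ", "BBZZ")

# The reshaping step: SUBJECT is repeated once per measurement column,
# fear is recycled by data.frame(), and unlist() walks the columns one at a time
TWO <- data.frame(SUBJECT = rep(ONE$SUBJECT, 4),
                  fear = ONE$XX,
                  RT = unlist(ONE[, measures]))
rownames(TWO) <- NULL

# Each column name applies to n consecutive rows, in the same order as RT
cond <- rep(measures, each = n)
TWO$AB <- as.numeric(substr(cond, 1, 2) == "AA")
TWO$YZ <- as.numeric(substr(cond, 3, 4) == "ZZ")
```

Deriving AB and YZ from the column names rather than from rep(1:0, ...) patterns keeps the codes correct even if the measurement columns are listed in a different order.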

John
That's very interesting. Thanks.
RSoul