ansaurus

Question

Reshape data frame to convert factors into columns in R

Answer 1

+3 A:

I would add another column called "value" and set it to TRUE.

# rep() recycles the three sectors to 10 rows without the cbind() length warning
df <- data.frame(ID=1:10, DATE=2:11, SECTOR=rep(1:3, length.out=10))
df <- data.frame(df, value=TRUE)

Then do a reshape:

reshape(df, idvar=c("ID","DATE"), timevar="SECTOR", direction="wide")

The problem with using the reshape function is that combinations missing from the data are filled with NA, so you then have to replace those NAs with FALSE yourself.
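The NA replacement does not need an explicit loop. A minimal sketch, assuming the wide result is kept in a variable named `df.wide.base` (a name chosen here only for illustration) and that every NA should become FALSE:

```r
# Rebuild the example data; rep() recycles the sectors to 10 rows.
df <- data.frame(ID = 1:10, DATE = 2:11,
                 SECTOR = rep(1:3, length.out = 10), value = TRUE)

# Base R reshape: one value.<SECTOR> column per sector, NA where absent.
df.wide.base <- reshape(df, idvar = c("ID", "DATE"), timevar = "SECTOR",
                        direction = "wide")

# Replace every NA in a single vectorized assignment.
df.wide.base[is.na(df.wide.base)] <- FALSE
```

Only the value columns contain NA in this example, so ID and DATE are left untouched.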

Otherwise you can use `cast` from the reshape package (see this question for an example), and set the fill value to FALSE.

library(reshape)
df.wide <- cast(df, ID + DATE ~ SECTOR, fill=FALSE)
> df.wide
   ID DATE     1     2     3
1   1    2  TRUE FALSE FALSE
2   2    3 FALSE  TRUE FALSE
3   3    4 FALSE FALSE  TRUE
4   4    5  TRUE FALSE FALSE
5   5    6 FALSE  TRUE FALSE
6   6    7 FALSE FALSE  TRUE
7   7    8  TRUE FALSE FALSE
8   8    9 FALSE  TRUE FALSE
9   9   10 FALSE FALSE  TRUE
10 10   11  TRUE FALSE FALSE

Shane 2010-03-08 19:34:36

Thank you. I should've thought about creating a value column. Interestingly, the value column/reshape approach takes 1.4 seconds on 9,500 rows with 26 factor levels, whereas an iterative approach (looping over the levels) takes only 0.6 seconds.

Alexander L. Belikoff 2010-03-08 20:35:35
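The iterative approach mentioned in the comment above is not shown in the thread. One plausible reading, sketched here as an assumption rather than the commenter's actual code, is to loop over the levels of SECTOR and add one logical column per level (the name `wide` is illustrative):

```r
df <- data.frame(ID = 1:10, DATE = 2:11, SECTOR = rep(1:3, length.out = 10))

# Start from the identifier columns and add one logical column per level.
wide <- df[c("ID", "DATE")]
for (lev in sort(unique(df$SECTOR))) {
  wide[[as.character(lev)]] <- df$SECTOR == lev
}
```

This relies on each ID/DATE pair appearing in a single row, as in the example data. If a pair can occur in several rows, those rows stay separate and would still need to be aggregated.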

Don't be tricked by these functions: if you look at its source, the `reshape` function itself iterates. It also does a good deal of other work, which adds to the overall time. Functions like `reshape` are not intended to be faster; they exist to make data manipulation easier.

Shane 2010-03-08 20:51:00

Answer 2

+1 A:

Here's another approach using xtabs which may or may not be faster (if someone would try and let me know):

df <- data.frame(cbind(1:12, 2:13, 1:3))
colnames(df) <- c("ID","DATE","SECTOR")
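# Cross-tabulate each "ID DATE" key against SECTOR (counts of 0 or 1 here)
foo <- xtabs(~ paste(ID, DATE) + SECTOR, df)
# Split the keys back into numeric ID and DATE columns and bind them to the counts
cbind(t(matrix(as.numeric(unlist(strsplit(rownames(foo), " "))), nrow=2)), foo)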

Jonathan Chang 2010-03-08 23:29:04
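For anyone who wants to try the speed comparison, here is a rough timing sketch using `system.time`. The data are synthetic: 9,500 rows and 26 levels, matching the sizes mentioned in the comments, but the data frame `big`, the seed, and the loop variant are assumptions made for illustration. Timings will vary by machine and R version, so none are quoted here.

```r
set.seed(1)  # arbitrary seed, only so reruns use the same data
n <- 9500
big <- data.frame(ID = seq_len(n), DATE = seq_len(n) + 1L,
                  SECTOR = sample(LETTERS, n, replace = TRUE),
                  value = TRUE, stringsAsFactors = FALSE)

# 1. Base reshape followed by vectorized NA replacement.
t.reshape <- system.time({
  w1 <- reshape(big, idvar = c("ID", "DATE"), timevar = "SECTOR",
                direction = "wide")
  w1[is.na(w1)] <- FALSE
})

# 2. Loop over the levels, adding one logical column per level.
t.loop <- system.time({
  w2 <- big[c("ID", "DATE")]
  for (lev in sort(unique(big$SECTOR))) w2[[lev]] <- big$SECTOR == lev
})

# 3. The xtabs approach from this answer.
t.xtabs <- system.time({
  foo <- xtabs(~ paste(ID, DATE) + SECTOR, big)
  w3 <- cbind(t(matrix(as.numeric(unlist(strsplit(rownames(foo), " "))),
                       nrow = 2)), foo)
})

# Elapsed seconds for each approach.
timings <- c(reshape = t.reshape[["elapsed"]],
             loop    = t.loop[["elapsed"]],
             xtabs   = t.xtabs[["elapsed"]])
print(timings)
```

Note that the three results are not identical objects: the first two are data frames of logicals in the original row order, while the xtabs version is a numeric matrix of counts whose rows are sorted by the pasted key.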
