views:

872

answers:

4

How can a dataframe column type be converted to numeric?

+2  A: 

If x is the name of a column in dataframe dat, and x is of type factor, use as.numeric(as.character(dat$x))

Tim Paine
+3  A: 

Tim is correct and Shane has an omission. Here is another example:

R> df <- data.frame(a = as.character(10:15))
R> df <- data.frame(df, num = as.numeric(df$a), 
                        numchr = as.numeric(as.character(df$a)))
R> df
   a num numchr
1 10   1     10
2 11   2     11
3 12   3     12
4 13   4     13
5 14   5     14
6 15   6     15
R> summary(df)
  a          num           numchr    
 10:1   Min.   :1.00   Min.   :10.0  
 11:1   1st Qu.:2.25   1st Qu.:11.2  
 12:1   Median :3.50   Median :12.5  
 13:1   Mean   :3.50   Mean   :12.5  
 14:1   3rd Qu.:4.75   3rd Qu.:13.8  
 15:1   Max.   :6.00   Max.   :15.0  
R> 

Our data.frame now has three summaries: counts for the factor column; a numeric summary of as.numeric(), which is wrong because it returns the integer codes of the factor levels rather than the values; and the (correct) numeric summary of as.numeric(as.character()).
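For anyone trying this on a recent R: since R 4.0.0, data.frame() no longer turns strings into factors by default, so the session above only behaves as shown on older versions or with stringsAsFactors = TRUE. The sketch below builds the factor explicitly instead. The variable f and its values are made up for illustration, and the last line is an alternative form I believe the R FAQ suggests, not something from the answers here.

```r
# Build the factor explicitly so the example behaves the same on any R version
f <- factor(c("10", "11", "12", "15"))

# A factor stores integer codes that index into its sorted levels
codes <- as.integer(f)

# Going through character recovers the labels and parses them as numbers
via_character <- as.numeric(as.character(f))

# Same result, but each distinct level is parsed only once
via_levels <- as.numeric(levels(f))[f]

print(codes)
print(via_character)
print(via_levels)
```

The first line prints the codes 1 to 4, while the other two print the original values 10, 11, 12 and 15.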

Dirk Eddelbuettel
+1 Thanks for pointing that out. I removed it.
Shane
My pleasure. This is one of the more silly corners of the language, and I think it featured in the older 'R Gotchas' question here.
Dirk Eddelbuettel
+1  A: 

Something that has helped me: if you have ranges of variables to convert, or just more than one, you can use sapply.

A bit nonsensical but just for example:

data(cars)
# lapply() rather than sapply() here: sapply() would simplify the factors into a character matrix
cars[, 1:2] <- lapply(cars[, 1:2], as.factor)

Say columns 3, 6-15 and 37 of your dataframe need to be converted to numeric; one could do:

dat[, c(3,6:15,37)] <- sapply(dat[, c(3,6:15,37)], as.numeric)
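
If some of those columns are factors, a plain as.numeric() returns the level codes, as discussed in the other answers. A minimal sketch that combines the multi-column idea with the as.character() step might look like this; to_numeric_cols and the small dat below are hypothetical names and data made up for the example, not part of the original question.

```r
# Hypothetical helper: convert the chosen columns to numeric, going through
# character first so that factor columns yield their labels, not their codes.
to_numeric_cols <- function(dat, cols) {
  dat[cols] <- lapply(dat[cols], function(x) as.numeric(as.character(x)))
  dat
}

# Made-up data: a factor column, a character column, and one column left alone
dat <- data.frame(
  a = factor(c("10", "20", "30")),
  b = c("1.5", "2.5", "3.5"),
  c = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

dat <- to_numeric_cols(dat, c("a", "b"))
str(dat)
```

lapply() is used here because it always returns a list of columns, which assigns back into the dataframe cleanly even when there is only one row; cols can be names or positions such as c(3, 6:15, 37).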
Jay
+3  A: 
aL3xa