Dear R-experts,

The following R code generates a snippet of the data frame I am working with at the moment:

rep1 <- c("20/02/01","23/03/02")
rep2 <- c(NA, "03/05/02")
rep3 <- c("16/04/01",NA)
rep4 <- c(NA,"12/02/03")
data <- data.frame(rep1 = rep1, rep2 = rep2, rep3 = rep3, rep4 = rep4)

The data frame generated by the code looks like this:

      rep1     rep2     rep3     rep4
1 20/02/01     <NA> 16/04/01     <NA>
2 23/03/02 03/05/02     <NA> 12/02/03

I would like to rearrange this data frame so it looks like this:

      rep1     rep2     rep3     rep4
1 20/02/01 16/04/01     <NA>     <NA>
2 23/03/02 03/05/02 12/02/03     <NA>

That is, for every row I would like to replace every NA with the next entry in the row, until there are only NAs left at the end of the row.

The true data frame consists of many thousand rows, so doing this by hand would mean many late hours in the office.

If anyone could tell me how to do this in R, I would be most grateful!

Thomas

+1  A: 

I'm not sure I understand, but it seems you want to move the NAs to the last columns? Here is one way (done quickly; there may be a cleaner way):

> d <- data.frame(rbind(c(1, 2, NA, 4, NA, 6), c(NA, 2, 3, 4, 5, 6)))
> d
  X1 X2 X3 X4 X5 X6
1  1  2 NA  4 NA  6
2 NA  2  3  4  5  6
> t(apply(d, 1, function(x) c(x[!is.na(x)], rep(NA, sum(is.na(x))))))
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    4    6   NA   NA
[2,]    2    3    4    5    6   NA

On your data:

> t(apply(data, 1, function(x) c(x[!is.na(x)], rep(NA, sum(is.na(x))))))
     [,1]       [,2]       [,3]       [,4]
[1,] "20/02/01" "16/04/01" NA         NA  
[2,] "23/03/02" "03/05/02" "12/02/03" NA  
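
Note that `apply` returns a character matrix here, not a data frame. If the original column names are wanted back, a minimal sketch along the following lines might do; the helper name `shift_left` is made up for illustration, and the data frame is rebuilt from the question so the snippet runs on its own:

```r
# Rebuild the example data frame from the question.
data <- data.frame(
  rep1 = c("20/02/01", "23/03/02"),
  rep2 = c(NA, "03/05/02"),
  rep3 = c("16/04/01", NA),
  rep4 = c(NA, "12/02/03"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the non-NA entries in their original order
# and pad with NA on the right.
shift_left <- function(x) c(x[!is.na(x)], rep(NA, sum(is.na(x))))

# apply() over rows returns the results as columns, so transpose back.
shifted <- t(apply(data, 1, shift_left))

# Convert the matrix to a data frame and restore the column names.
result <- as.data.frame(shifted, stringsAsFactors = FALSE)
names(result) <- names(data)
result
```

All columns come back as character, so any date parsing (for example with `as.Date`) would still have to be done afterwards.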
Vince
Thanks Vince, I have specified the question a bit more, but this was exactly what I was after!
Thomas Jensen
When applied to vectors, `rbind` returns an object of class `matrix`, so there's no need to convert it to a `data.frame` in order to run `apply` over it: `apply` coerces a `data.frame` to a `matrix` anyway and returns a `matrix`. Try `t(apply(data, 1, sort, na.last = TRUE))` if you want to sort the values...
aL3xa
A dataframe was used intentionally, as that is what the OP's problem begins with.
Vince
Oh, right... =) Self-handicapping moments strike again!
aL3xa
A: 

Following Vince's suggestion, but perhaps a little cleaner:

t(apply(d, 1, function(x) x[order(x)]))
Eduardo Leoni
This will change the original order. But if you do `t(apply(data, 1, function(x) x[order(is.na(x))]))`, it should be OK.
Marek
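
The difference between the two `order` calls does not show up on the numeric example `d`, because its rows are already in ascending order. A small sketch on the second row of the question's data (the variable names are only for illustration) suggests how they differ:

```r
# Second row of the question's data frame, as a plain character vector.
x <- c("23/03/02", "03/05/02", NA, "12/02/03")

# order(x) sorts the strings themselves (NA last), so the dates get reshuffled.
sorted_values <- x[order(x)]

# order(is.na(x)) orders only on the TRUE/FALSE flag; ties keep their
# original positions, so the non-NA entries stay in place and NA moves last.
nas_last <- x[order(is.na(x))]

sorted_values  # "03/05/02" "12/02/03" "23/03/02" NA
nas_last       # "23/03/02" "03/05/02" "12/02/03" NA
```

This relies on `order` leaving ties in their original order, which is what its documentation describes.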