Dear R-wizards,

I have the following data frame:

    Date1              Date2            Date3               Date4              Date5 
1    25 April 2005       10 May 2006   28 March 2007   14 November 2007      1 April 2008  
2    25 April 2005       10 May 2006   28 March 2007   14 November 2007      1 April 2008  
3  29 January 2008   4 December 2008    6 April 2009       1 March 2010   NA 
4  29 January 2008   4 December 2008    6 April 2009       1 March 2010   1 February 2010  
5  29 January 2008   4 December 2008    6 April 2009       1 March 2010   1 February 2010  
6  29 January 2008   4 December 2008    6 April 2009       NA             NA 

And the following vector:

   1 01/09/2004 
   2 20/03/2007 
   3 16/09/2009 
   4 16/09/2009 
   5 15/07/2008 
   6 16/09/2009

I would like to count, for each row of the data frame, the dates that are on or before the corresponding date in the vector. For instance, for the first row the count should be zero, as all the dates are after the corresponding date in the vector.

Anyone know how this can be done?

Best, Thomas

P.S:

Here is output from the dput() command so you guys can read the data into R more easily for testing (if you want to):

Dataframe:

structure(c(" 25 April 2005 ", " 25 April 2005 ", " 29 January 2008 ", 
" 29 January 2008 ", " 29 January 2008 ", " 29 January 2008 ", 
" 10 May 2006 ", " 10 May 2006 ", " 4 December 2008 ", " 4 December 2008 ", 
" 4 December 2008 ", " 4 December 2008 ", " 28 March 2007 ", 
" 28 March 2007 ", " 6 April 2009 ", " 6 April 2009 ", " 6 April 2009 ", 
" 6 April 2009 ", " 14 November 2007 ", " 14 November 2007 ", 
" 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ", " 1 March 2010 ", 
" 1 April 2008 ", " 1 April 2008 ", " 1 February 2010 ", " 1 February 2010 ", 
" 1 February 2010 ", " 1 February 2010 "), .Dim = c(6L, 5L), .Dimnames = list(
    c("1", "2", "3", "4", "5", "6"), c("Rep1", "Rep2", "Rep3", 
    "Rep4", "Rep5")))

Vector:

c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009", "15/07/2008", 
"16/09/2009")
A: 

If the data.frame is called m and the vector v, a simple

rowSums(m<=v)

should do. This works because R stores m as one long vector made of its columns joined end to end, so v is recycled down each column. Still, first make sure that all dates are POSIXct or Date objects; see this question for info about the conversion itself.
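
A small sketch of the recycling rule described above, using a toy numeric matrix. The values are made up for illustration and are not taken from the question; the point is only that a vector with one element per row ends up compared row by row.

```r
# Toy data: hypothetical values chosen only to show the recycling rule
m <- matrix(c(1, 5, 9,
              2, 6, 7,
              3, 4, 8), nrow = 3)   # matrix() fills column by column
v <- c(2, 4, 9)                     # one threshold per row

# v is recycled down each column, so row i is always compared with v[i]
cmp <- m <= v
print(cmp)

# Count the TRUE cells in each row
counts <- rowSums(cmp)
print(counts)
```

The same alignment only holds when the length of v equals the number of rows; with any other length the recycling would pair values with the wrong rows.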

mbq
or all dates are `Date`'s
Marek
@Marek thanks, updated.
mbq
This doesn't work with dates. Underneath your solution there is a conversion to "double", which doesn't work on a dataframe of dates.
Joris Meys
@Joris It doesn't work with `data.frame`. With `matrix` of `Date`'s (as in `dput` results) everything works.
Marek
@Marek The matrix is not a matrix of Dates on my computer. It's a matrix of type "character". And then of course the code works, but doesn't give you the correct result. You'll have to explain to me how to get a matrix of class Date. With as.Date, you end up with a vector. With matrix or as.matrix, you end up with class numeric. See also http://stackoverflow.com/questions/3599851/how-to-transform-a-dataframe-of-characters-to-the-respective-dates
Joris Meys
@Joris I was answering your point about `data.frame`. For a matrix you have to use two steps: `Out <- as.Date(Data,format="%d %B %Y");dim(Out)<-dim(Data)`
Marek
@Marek: OK, thx. After the whole conversion thing, the rowSums solution works.
Joris Meys
@Joris I'll check it out; for now I'll just point to your question.
mbq
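
The two-step matrix conversion from the comments can be sketched as follows, on the data from the question's dput() output. This is a sketch under two assumptions: the session uses an English locale (so that `%B` matches "April", "May", and so on), and `trimws()` is added here to strip the padding spaces; it is not part of the original comment.

```r
# Character matrix equivalent to the question's dput() output
Data <- matrix(
  c(rep(" 25 April 2005 ", 2),    rep(" 29 January 2008 ", 4),
    rep(" 10 May 2006 ", 2),      rep(" 4 December 2008 ", 4),
    rep(" 28 March 2007 ", 2),    rep(" 6 April 2009 ", 4),
    rep(" 14 November 2007 ", 2), rep(" 1 March 2010 ", 4),
    rep(" 1 April 2008 ", 2),     rep(" 1 February 2010 ", 4)),
  nrow = 6, dimnames = list(as.character(1:6), paste0("Rep", 1:5)))
v <- as.Date(c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009",
               "15/07/2008", "16/09/2009"), format = "%d/%m/%Y")

# Step 1: as.Date() returns a plain vector, so the matrix shape is lost
Out <- as.Date(trimws(Data), format = "%d %B %Y")
# Step 2: restore the dimensions so Out is laid out like the original matrix
dim(Out) <- dim(Data)

# Row i is compared with v[i]; count the dates on or before it
counts <- rowSums(Out <= v)
print(counts)
```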
A: 

First things first: you really have to convert everything to Dates, and that can be a bit tricky. I read in the matrix as Data and the vector as vect. Then:

vect <- as.Date(vect,format="%d/%m/%Y")

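# Because of the special nature of the Date class, the usual apply solutions
# don't give the result you're looking for, so convert column by column.
Data <- as.data.frame(Data, stringsAsFactors = FALSE)
for (i in seq_len(ncol(Data))){
    # %B matches the full month name in the current locale
    Data[,i] <- as.Date(Data[,i],format="%d %B %Y")
}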
> apply(Data,2,"<=",vect)
      Rep1  Rep2  Rep3  Rep4
[1,] FALSE FALSE FALSE FALSE
[2,]  TRUE  TRUE FALSE FALSE
[3,]  TRUE  TRUE  TRUE FALSE
[4,]  TRUE  TRUE  TRUE FALSE
[5,]  TRUE FALSE FALSE FALSE
[6,]  TRUE  TRUE  TRUE FALSE

> rowSums(apply(Data,2,"<=",vect))
[1] 0 2 3 3 1 3
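
A possible variant of the same idea, sketched here with the data frame as printed in the question, which contains NA cells. It assumes an English locale for `%B`; the `lapply()`/`sapply()` form and the `na.rm = TRUE` argument are additions for illustration, not part of the answer above.

```r
# Data as printed in the question, including the NA cells
Data <- data.frame(
  Date1 = c(rep("25 April 2005", 2),    rep("29 January 2008", 4)),
  Date2 = c(rep("10 May 2006", 2),      rep("4 December 2008", 4)),
  Date3 = c(rep("28 March 2007", 2),    rep("6 April 2009", 4)),
  Date4 = c(rep("14 November 2007", 2), rep("1 March 2010", 3), NA),
  Date5 = c(rep("1 April 2008", 2), NA, rep("1 February 2010", 2), NA),
  stringsAsFactors = FALSE)
vect <- as.Date(c("01/09/2004", "20/03/2007", "16/09/2009", "16/09/2009",
                  "15/07/2008", "16/09/2009"), format = "%d/%m/%Y")

# Convert every column in place; assigning to Data[] keeps the data.frame
Data[] <- lapply(Data, as.Date, format = "%d %B %Y")

# One logical column per date column; comparisons with NA stay NA
cmp <- sapply(Data, function(col) col <= vect)

# na.rm = TRUE means missing dates are simply not counted
counts <- rowSums(cmp, na.rm = TRUE)
print(counts)
```

Without `na.rm = TRUE`, any row containing a missing date would get a count of NA instead of a number.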
Joris Meys
For people with a non-English locale, I recommend looking at the examples section of `strptime` to see how to deal with month names (because `%B` reads the month name in the local language).
Marek
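
One way to sidestep the locale issue altogether is sketched below. The helper `parse_english_date()` is hypothetical (it does not come from the thread or from any package); it maps the English month names to numbers through the built-in `month.name` constant and then parses an ISO date string, which does not depend on the locale.

```r
# Hypothetical helper: parse strings like "25 April 2005" without %B
parse_english_date <- function(x) {
  parts <- strsplit(trimws(x), " +")
  iso <- vapply(parts, function(p) {
    # Anything that is not "day month year" (including NA) becomes NA
    if (length(p) != 3) return(NA_character_)
    month <- match(p[2], month.name)   # month.name holds the English names
    if (is.na(month)) return(NA_character_)
    sprintf("%s-%02d-%02d", p[3], month, as.integer(p[1]))
  }, character(1))
  # Numeric year-month-day strings parse the same way in every locale
  as.Date(iso, format = "%Y-%m-%d")
}

d <- parse_english_date(c(" 25 April 2005 ", "4 December 2008", NA))
print(d)
```

The result is an ordinary Date vector, so it can be used in the comparisons above in place of the `as.Date(..., format="%d %B %Y")` call.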
Thanks Joris, this did the trick :)
Thomas Jensen