




In R, I have a data frame of various statistics recorded throughout the day (for example, heart rate). The timestamps for each measurement are created automatically, and I have already converted them to a POSIXt class object.

The number of observations varies from day to day.

I am wondering how I can calculate summary statistics by day/week/month.

+2  A: 

Use `tapply` for the summary and `format` to build the grouping variable from the date.


> tst <- data.frame(date = as.POSIXct(runif(1000) * 31557600, origin = "2010/8/9"), value = runif(1000))

> tapply(tst$value, format(tst$date, "%a"), summary)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.001545 0.238900 0.499600 0.484700 0.697000 0.996400 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.02029 0.25100 0.49100 0.49910 0.75530 0.99120 

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.003557 0.245600 0.493600 0.499200 0.754600 0.996200 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
0.01867 0.22340 0.52750 0.51260 0.80500 0.97760 

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.003691 0.281200 0.600600 0.546800 0.790800 0.973000 

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.009304 0.253400 0.488900 0.510300 0.772200 0.997100 

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.002854 0.236200 0.400600 0.473500 0.742900 0.988600

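Each block above is the summary for one day of the week, because `%a` is the abbreviated weekday name. You can replace the `%a` in `format` with other codes to suit; see `?strptime`. For example, `%b` is the abbreviated month name and `%U` is the week of the year.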
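
To cover the day/week/month breakdown asked about in the question, the same idea can be sketched with three different format strings. This is only a minimal sketch on made-up data: the fixed seed, the UTC time zone, and the choice of `mean` as the statistic are assumptions for illustration, not part of the original answer.

```r
# Hypothetical sample data: 1000 random timestamps within one year of 2010-08-09
set.seed(1)
tst <- data.frame(
  date  = as.POSIXct(runif(1000) * 31557600, origin = "2010-08-09", tz = "UTC"),
  value = runif(1000)
)

# Grouping keys built from the timestamp; including the year keeps
# the same week or month of different years in separate groups
by_day   <- tapply(tst$value, format(tst$date, "%Y-%m-%d"), mean)
by_week  <- tapply(tst$value, format(tst$date, "%Y-%U"), mean)
by_month <- tapply(tst$value, format(tst$date, "%Y-%m"), mean)

head(by_month)
```

Swapping `mean` for `summary`, `sd`, or `length` gives other per-group statistics; because the number of observations per day varies, `length` is a quick way to see how many readings fall into each group.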

That did the trick. Thank you.
CG Nguyen
No problem. For more advanced breakdowns the `ddply` function of the `plyr` package is useful.
Nice - I like your approach of reformatting the date to get the necessary grouping variable.
Matt Parker
+2  A: 

You could try something like this to get summary statistics by month for the second column of your data frame (`dlply` is from the `plyr` package and `basicStats` is from the `fBasics` package):

dlply(my_dataframe, .(format(date_Column, "%m %y")), function(x) basicStats(x[2]))
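
`dlply` splits the data frame by the grouping expression and returns a list with one result per group. If those packages are not installed, roughly the same split-then-apply shape can be sketched in base R. The data below is made up, the column name `value` is hypothetical, and `summary` is swapped in for `basicStats`, so the statistics reported differ from what the line above would give.

```r
# Hypothetical data frame; `date_Column` follows the answer above, `value` is assumed
set.seed(42)
my_dataframe <- data.frame(
  date_Column = as.POSIXct("2010-08-09", tz = "UTC") + runif(200) * 90 * 86400,
  value       = rnorm(200, mean = 70, sd = 10)
)

# Same "month year" key as in the dlply call
month_key <- format(my_dataframe$date_Column, "%m %y")

# Split the second column by month, then summarise each piece;
# the result is a named list with one element per month
monthly <- lapply(split(my_dataframe[[2]], month_key), summary)

names(monthly)
```

Each element of `monthly` can be pulled out by its key, for example `monthly[["09 10"]]` for September 2010 in this made-up data.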