Suppose I have an R data frame with columns that specify location (lat/long), height, and gender of individuals:

x <- data.frame(
  lat=c(39.5,39.51,38,38.1,38.2),
  long=c(86,86,87,87,87),
  gender=c("M","F","F","M","F"),
  height=c(72,60,61,70,80)
)

I want to bin the data in two dimensions (e.g. into 1000m x 1000m squares) and compute the following (then display on a map):

  1. What percentage of individuals in each bin are female
  2. What is the average height of males in each bin

If possible I'd like to use ggplot2.

A: 

See the cut function for a way to convert lat/long values into bins (the "See Also" section of ?cut lists other options). This assumes that a rectangular grid is good enough; if the area is large enough for the curvature of the Earth to matter, you will need to look at the spatial packages.
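As a rough sketch of the binning step, cut can be applied to each coordinate separately. The 0.5-degree bin width below is an assumption chosen only so that the five sample points share cells; for squares of roughly 1000 m you would need something closer to 0.009 degrees of latitude (one degree is about 111 km), and a degree of longitude shrinks with cos(latitude). The names bin_width, make_breaks, lat_bin and long_bin are made up for this illustration.

```r
# Sample data from the question
x <- data.frame(
  lat    = c(39.5, 39.51, 38, 38.1, 38.2),
  long   = c(86, 86, 87, 87, 87),
  gender = c("M", "F", "F", "M", "F"),
  height = c(72, 60, 61, 70, 80)
)

# Assumed bin width in degrees (illustrative, not 1000 m)
bin_width <- 0.5

# Evenly spaced break points that cover the whole range of a coordinate;
# the extra bin_width guarantees the maximum value falls inside a bin
make_breaks <- function(v, width) {
  seq(floor(min(v)), ceiling(max(v)) + width, by = width)
}

# right = FALSE gives half-open intervals [a, b), so a point sitting
# exactly on a break belongs to the bin that starts there
x$lat_bin  <- cut(x$lat,  breaks = make_breaks(x$lat,  bin_width), right = FALSE)
x$long_bin <- cut(x$long, breaks = make_breaks(x$long, bin_width), right = FALSE)

print(x[, c("lat", "long", "lat_bin", "long_bin")])
```

Each row now carries two factors that together identify its grid cell.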

Then use tapply or the plyr package to compute your summaries within each cell of the grid.
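A minimal base-R sketch of the tapply route might look like the following. It repeats the binning so that it runs on its own, and again assumes an illustrative 0.5-degree grid; bin, cell, pct_female and mean_male_height are hypothetical names, not anything prescribed by the answer.

```r
# Sample data from the question
x <- data.frame(
  lat    = c(39.5, 39.51, 38, 38.1, 38.2),
  long   = c(86, 86, 87, 87, 87),
  gender = c("M", "F", "F", "M", "F"),
  height = c(72, 60, 61, 70, 80)
)

# Bin one coordinate into half-open intervals of the given width (degrees)
bin <- function(v, width) {
  cut(v, seq(floor(min(v)), ceiling(max(v)) + width, by = width), right = FALSE)
}

# One factor per grid cell; drop = TRUE discards cells that contain no data
x$cell <- interaction(bin(x$lat, 0.5), bin(x$long, 0.5), drop = TRUE)

# 1. Percentage of individuals in each cell who are female:
#    the mean of a logical vector is the proportion of TRUE values
pct_female <- tapply(x$gender == "F", x$cell, mean) * 100

# 2. Average height of males in each cell:
#    subset to males first, then average within each cell
males <- x[x$gender == "M", ]
mean_male_height <- tapply(males$height, males$cell, mean)

print(pct_female)
print(mean_male_height)
```

Because the cell factor keeps all of its levels after subsetting, a cell that contains no males shows up as NA in the second summary rather than disappearing.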

Now plot the results however you want.
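Since the question mentions ggplot2, one possible way to draw the result is with geom_tile, sketched below. To get numeric cell centres for plotting, this version bins with floor arithmetic instead of cut, which is a substitution made for convenience; the 0.5-degree width and the names width, lat_c, long_c, female and cells are assumptions for illustration.

```r
library(ggplot2)

# Sample data from the question
x <- data.frame(
  lat    = c(39.5, 39.51, 38, 38.1, 38.2),
  long   = c(86, 86, 87, 87, 87),
  gender = c("M", "F", "F", "M", "F"),
  height = c(72, 60, 61, 70, 80)
)

# Assumed bin width in degrees (illustrative, not 1000 m)
width <- 0.5

# Snap each point to the centre of the grid cell that contains it
x$lat_c  <- floor(x$lat  / width) * width + width / 2
x$long_c <- floor(x$long / width) * width + width / 2

# Proportion of females per cell, converted to a percentage
x$female <- x$gender == "F"
cells <- aggregate(female ~ lat_c + long_c, data = x, FUN = mean)
cells$pct_female <- 100 * cells$female

# One tile per cell, coloured by the summary value
p <- ggplot(cells, aes(x = long_c, y = lat_c, fill = pct_female)) +
  geom_tile(width = width, height = width) +
  coord_fixed() +
  labs(x = "long", y = "lat", fill = "% female")

print(p)
```

The average male height could be mapped to fill in the same way after aggregating height over the male rows only.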

Greg Snow
See also the `round_any` function in reshape/plyr.
hadley