
I have a data frame in R with a column of non-sequential numbers (data$SiteID) that I would like to map to sequential integers (data$site), one for each unique value of data$SiteID. Within each site, I would like to map data$TrtID to 0 where data$TrtID == 'control', and to the next sequential integer for each of the other unique values of data$TrtID:

data <- data.frame( 
    SiteID = c(1,1,1,9,'108','108','15', '15'), 
    TrtID = c('N', 'control', 'N', 'control', 'P', 'control', 'N', 'P'))

1) data$site should be c(1,1,1,2,3,3,4,4).

2) data$trt should be c(1,0,1,0,1,0,0,1)

+2  A: 

Use conversion of factors to integers:

transform(data, site=as.integer(factor(SiteID)), trt=as.integer(factor(TrtID)))
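To see why the ordering can matter, here is a small sketch using the SiteID values from the question. It assumes they are held as character strings, and `site_ids` is just an illustrative name. By default, factor() sorts its levels, so the integer codes follow string order rather than order of appearance.

```r
# SiteID values from the question, assumed to be stored as character strings
site_ids <- c("1", "1", "1", "9", "108", "108", "15", "15")

# Default levels are the sorted unique values (string order, not numeric order)
default_levels <- levels(factor(site_ids))
default_levels
# [1] "1"   "108" "15"  "9"

# So the integer codes do not follow the order in which the sites appear
default_codes <- as.integer(factor(site_ids))
default_codes
# [1] 1 1 1 4 2 2 3 3
```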

If the ordering is important, you can give specific orders to the levels:

transform(data,
  site = as.integer(factor(SiteID, unique(SiteID))),
  trt  = as.integer(factor(TrtID, unique(c('control', as.character(TrtID))))) - 1L)

Modified version grouping trt factor by site:

transform(data,
  site = as.integer(factor(SiteID, unique(SiteID))),
  trt  = unsplit(tapply(TrtID, SiteID, function(x)
         as.integer(factor(x))), SiteID) - 1L)
Charles
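For comparison, the per-site numbering can also be sketched with base R's ave(). This is an alternative to the unsplit()/tapply() version, not the method from the answer. It puts 'control' first explicitly (when a site has one) instead of relying on it sorting before the other labels; the helper name `number_within_site` is made up for illustration.

```r
data <- data.frame(
  SiteID = c(1, 1, 1, 9, '108', '108', '15', '15'),
  TrtID  = c('N', 'control', 'N', 'control', 'P', 'control', 'N', 'P'))

# Sites numbered in order of first appearance
site <- as.integer(factor(data$SiteID, levels = unique(data$SiteID)))

# Hypothetical helper: number the treatments of one site, given its row indices
number_within_site <- function(i) {
  x <- as.character(data$TrtID[i])
  # 'control' first if this site has one, then the other treatments alphabetically
  lv <- c(intersect("control", x), sort(setdiff(unique(x), "control")))
  match(x, lv) - 1L
}

# ave() applies the helper per SiteID group and keeps the original row order
trt <- ave(seq_along(data$TrtID), data$SiteID, FUN = number_within_site)

site
# [1] 1 1 1 2 3 3 4 4
trt
# [1] 1 0 1 0 1 0 0 1
```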
Thanks for introducing me to transform() and unsplit(), but why use '1L' instead of 1?
David
No probs. That's only important if you want to keep it as an integer; otherwise you can just drop the L (and you may as well use as.numeric instead of as.integer).
Charles
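A quick way to check the difference for yourself, as a minimal sketch (the variable names are illustrative only): subtracting 1L from an integer vector keeps it an integer, while subtracting 1 promotes it to double.

```r
# Integer codes from a small factor: levels are "control", "N"
codes <- as.integer(factor(c("N", "control", "N"), levels = c("control", "N")))

int_result <- codes - 1L  # integer minus integer stays integer
dbl_result <- codes - 1   # integer minus double becomes double

typeof(int_result)
# [1] "integer"
typeof(dbl_result)
# [1] "double"

# The values themselves compare equal either way
all(int_result == dbl_result)
# [1] TRUE
```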
+2  A: 

Just treat them as factors:

as.numeric(factor(data$SiteID, levels = unique(data$SiteID)))
[1] 1 1 1 2 3 3 4 4

and for the Trt, since you want a 0-based value, subtract one.

as.numeric(factor(data$TrtID, levels = sort(unique(data$TrtID))))-1
[1] 1 0 1 0 2 0 1 2

Notice that the levels arguments are different - Trt sorts first, which is convenient since control is alphabetically before N or P. If you want a non-standard sorting, you can just explicitly specify the levels in the order you want them.
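As a sketch of what specifying the levels explicitly might look like, the ordering below (control, then P, then N) is made up purely for illustration:

```r
data <- data.frame(
  SiteID = c(1, 1, 1, 9, '108', '108', '15', '15'),
  TrtID  = c('N', 'control', 'N', 'control', 'P', 'control', 'N', 'P'))

# Hypothetical non-alphabetical ordering: control = 0, P = 1, N = 2
trt_levels <- c('control', 'P', 'N')

trt <- as.numeric(factor(data$TrtID, levels = trt_levels)) - 1
trt
# [1] 2 0 2 0 1 0 2 1
```

Any value of TrtID that is not listed in the levels would come out as NA, so the vector of levels needs to cover every treatment.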

Greg
Thanks, that is simple and helpful, but I am not sure what the 'levels' argument does or how to find this information on my own, since ?as.numeric and ?setMethod do not cover it.
David
?factor has what you need
John
Sorry, I read that too quickly.
David
Also, I just changed the question (sorry) because I realized that what I really need is to restart the ordering of the treatments within each site.
David