I've got a dataset that looks like this...

mine tonnes week
AA   112    41
AA   114    41
AA   119    41
BB   108    41 
BB   112    41
AA   110    42
AA   109    42
AA   102    43
AA   101    43

And I want to create a boxplot in ggplot2 to show the distribution of tonnes for each week. But I only want results from mine AA.

I thought it would work like this....

qplot(factor(week), tonnes[mine == "AA"], data = sql_results, geom = "boxplot")

But instead, I get this error.

Error in data.frame(x = c(13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L,  :

  arguments imply differing number of rows: 423100, 109436

It's probably dead simple, but I'm not having much luck figuring the right way to do this.

+4  A: 

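Close. In your example you subset tonnes but not week, so qplot() received x and y vectors of different lengths, which is what the "differing number of rows" error is reporting. Pass a subset of the whole data frame as the data argument instead.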
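To make the mismatch concrete, here is a minimal base R sketch. The data frame `df` is a hypothetical copy of the nine sample rows from the question, not the real sql_results, and the variable names are only for illustration.

```r
# Hypothetical copy of the sample rows shown in the question
df <- data.frame(
  mine   = c("AA", "AA", "AA", "BB", "BB", "AA", "AA", "AA", "AA"),
  tonnes = c(112, 114, 119, 108, 112, 110, 109, 102, 101),
  week   = c(41, 41, 41, 41, 41, 42, 42, 43, 43)
)

# x in the original call: every row of week
n_week <- length(df$week)

# y in the original call: only the tonnes values where mine is "AA"
n_tonnes <- length(df$tonnes[df$mine == "AA"])

# The two lengths differ, so they cannot be combined into one data frame
cat("week:", n_week, "tonnes (AA only):", n_tonnes, "\n")

# Subsetting the data frame itself keeps both columns the same length
aa <- subset(df, mine == "AA")
cat("rows after subset:", nrow(aa), "\n")
```

Filtering rows once, on the data frame, means every column used in the plot is filtered the same way.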

# Sample data from the question, recreated with dput()-style output
sql_results <- structure(list(mine = structure(c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 
1L, 1L), .Label = c("AA", "BB"), class = "factor"), tonnes = c(112, 
114, 119, 108, 112, 110, 109, 102, 101), week = c(41, 41, 41, 
41, 41, 42, 42, 43, 43)), row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9"), .Names = c("mine", "tonnes", "week"), class = "data.frame")

# Subset the whole data frame so week and tonnes stay aligned
qplot(factor(week), tonnes, data = subset(sql_results, mine == "AA"), geom = "boxplot")
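For readers on a recent ggplot2, where qplot() is marked as deprecated, the same plot can be sketched with ggplot() and geom_boxplot(). This is one possible equivalent rather than the answer's own code; the data frame `df` is again a hypothetical copy of the sample rows, and the axis labels are my own choice.

```r
library(ggplot2)

# Hypothetical copy of the sample rows shown in the question
df <- data.frame(
  mine   = c("AA", "AA", "AA", "BB", "BB", "AA", "AA", "AA", "AA"),
  tonnes = c(112, 114, 119, 108, 112, 110, 109, 102, 101),
  week   = c(41, 41, 41, 41, 41, 42, 42, 43, 43)
)

# Keep only mine AA; both week and tonnes are filtered together
aa <- subset(df, mine == "AA")

# One box per week; factor() makes week a discrete x axis
p <- ggplot(aa, aes(x = factor(week), y = tonnes)) +
  geom_boxplot() +
  labs(x = "week", y = "tonnes")

print(p)
```

The subset can also be written inline as `ggplot(subset(df, mine == "AA"), ...)` if a separate object is not wanted.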
Ian Fellows
brilliant. thanks :)
Tommy O'Dell