ansaurus

Question

Answer 1

+2 A:

Since melting puts all measure variables into the same column, they must all be of the same type. In your case, "team" is character and "goals" is numeric, which is why you got that error.
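To illustrate the point, here is a minimal base R sketch of what happens when values of different types are forced into one column. The tiny `per` data frame is made up for illustration and is not the asker's data; the factor step is only a guess at why the message mentions an invalid factor level.

```r
# Hypothetical two-row sample (assumption, not the original data)
per <- data.frame(team = c("BOS", "NYR"), goals = c(2, 1),
                  stringsAsFactors = FALSE)

# Stacking a character column and a numeric column into one vector
# coerces everything to character
stacked <- c(per$team, per$goals)
class(stacked)

# If team is a factor instead, a number is not one of its levels, so
# assigning it produces NA with an "invalid factor level" warning
# (suppressed here so the script runs quietly)
team_factor <- factor(per$team)
suppressWarnings(team_factor[2] <- per$goals[2])
is.na(team_factor[2])
```

Melting only the numeric columns (goals, shots) and keeping team as an id variable avoids mixing types.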

Gary Lee 2010-09-18 04:01:22

Answer 2

+2 A:

I think you'd be better off using ddply from the plyr package for this problem. You didn't say how you wanted to summarise the data, but check out the summarise function if you want to use a different summary function for each variable, or the colwise function if you want to summarise all variables the same way.
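As a rough sketch of both options, assuming a small made-up `per` data frame with the column names used elsewhere in this thread (the values and the choice of `sum` as the summary are assumptions):

```r
library(plyr)

# Hypothetical sample: one game, two periods, two teams
per <- data.frame(
  gameid   = c(1, 1, 1, 1),
  period   = c(1, 1, 2, 2),
  team     = c("BOS", "NYR", "BOS", "NYR"),
  home_ind = c("H", "V", "H", "V"),
  goals    = c(1, 0, 2, 1),
  shots    = c(10, 8, 12, 9),
  stringsAsFactors = FALSE
)

# summarise: name each output column and pick its summary function
by_team <- ddply(per, .(gameid, team), summarise,
                 total_goals = sum(goals),
                 max_shots   = max(shots))

# colwise: apply the same function to each of the listed columns
by_team_all <- ddply(per, .(gameid, team), colwise(sum, c("goals", "shots")))
```

Here `by_team` has one row per game and team, and `by_team_all` sums goals and shots the same way.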

hadley 2010-09-18 12:15:26

As always, thanks for the advice, hadley. I can't quite get my head around what I would summarize. I edited the post above to show what I am hoping the new data frame will look like. I had previously tried sqldf and almost got there, but I figure there must be an easier way with some of your packages.

Btibert3 2010-09-18 15:20:19

Answer 3

A:

Thanks for the help. I ended up going a different route and broke the problem into little pieces. I am sure there is a quicker, more elegant way, but I got to where I needed to be and wanted to share the code in case it helps someone else.

## load libraries
library(sqldf)

## assume that the dataset is loaded as a data frame named per
## restructure the data: one query for the visitors, one for the home team
## (sep=" " keeps a space between the column list and FROM)
sql.1 <- "SELECT gameid, period, team `vis_team`, goals `vis_goals`, shots `vis_shots`"
sql.2 <- "FROM per WHERE home_ind='V' GROUP BY gameid, period"
sql.cmd <- paste(sql.1, sql.2, sep=" ")
vis <- sqldf(sql.cmd)

sql.1 <- "SELECT gameid, period, team `home_team`, goals `home_goals`, shots `home_shots`"
sql.2 <- "FROM per WHERE home_ind='H' GROUP BY gameid, period"
sql.cmd <- paste(sql.1, sql.2, sep=" ")
home <- sqldf(sql.cmd)

## merge the two halves on their shared columns (gameid, period)
my.dataset <- merge(vis, home)

Btibert3 2010-09-18 23:33:35

Answer 4

+1 A:

Now I see what you're trying to do, here's an approach using summarise from plyr:

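library(plyr)

# visiting-team rows (home_ind == "V"), renamed with a vis_ prefix
vis <- summarise(subset(per, home_ind == "V"),
  gameid = gameid, period = period,
  vis_team = team, vis_goals = goals, vis_shots = shots)

# home-team rows (home_ind == "H"), renamed with a home_ prefix
home <- summarise(subset(per, home_ind == "H"),
  gameid = gameid, period = period,
  home_team = team, home_goals = goals, home_shots = shots)

# join on the shared columns (gameid, period)
join(vis, home)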

There are also a number of ways to do it using just base functions (e.g. by subsetting and then modifying names).
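One possible base-only version of that idea, sketched against a made-up `per` data frame (the sample values and the helper vectors `key` and `stats` are assumptions for illustration):

```r
# Hypothetical sample: one game, two periods, visitor and home rows
per <- data.frame(
  gameid   = c(1, 1, 1, 1),
  period   = c(1, 1, 2, 2),
  team     = c("BOS", "NYR", "BOS", "NYR"),
  home_ind = c("H", "V", "H", "V"),
  goals    = c(1, 0, 2, 1),
  shots    = c(10, 8, 12, 9),
  stringsAsFactors = FALSE
)

key   <- c("gameid", "period")        # columns that identify a row pair
stats <- c("team", "goals", "shots")  # columns to prefix per side

# Subset the visitor rows, then rename the stat columns
vis <- per[per$home_ind == "V", c(key, stats)]
names(vis)[match(stats, names(vis))] <- paste("vis", stats, sep = "_")

# Same for the home rows
home <- per[per$home_ind == "H", c(key, stats)]
names(home)[match(stats, names(home))] <- paste("home", stats, sep = "_")

# One row per game and period, visitor and home columns side by side
wide <- merge(vis, home, by = key)
```

The result has one row per gameid and period, with the vis_ columns followed by the home_ columns.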

hadley 2010-09-19 14:20:32


Reshape error - invalid factor
