
R: speeding up "group by" operations

I have a simulation with a huge aggregate-and-combine step right in the middle. I prototyped this process using plyr's ddply() function, which works great for a huge percentage of my needs. But I need this aggregation step to be faster, since I have to run 10,000 simulations. I'm already scaling the simulations in parallel, but if this one...
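For context, the aggregate-and-combine pattern the question describes can be sketched with ddply on toy data (the data frame, column names, and group structure below are hypothetical, not from the question):

```r
library(plyr)

# Hypothetical simulation output: one row per observation, grouped by "group".
sim <- data.frame(group = rep(c("a", "b"), each = 4),
                  x = rnorm(8),
                  y = rnorm(8))

# Split by group, take column-wise medians of the numeric columns, recombine.
agg <- ddply(sim, .(group), numcolwise(median))
```

ddply's convenience comes from this split-apply-combine loop in R code, which is also why it can become the bottleneck when the step is run thousands of times.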

how to speed up this R code

I have a data.frame (link to file) with 18 columns and 11,520 rows that I transform like this:

```r
library(plyr)
df.median <- ddply(data, .(groupname, starttime, fPhase, fCycle),
                   numcolwise(median), na.rm = TRUE)
```

According to system.time(), it takes about this long to run:

```
   user  system elapsed
   5.16    0.00    5.17
```

This...
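A commonly suggested speedup for this kind of grouped median is data.table, which performs grouped aggregation in compiled code rather than looping in R. A sketch assuming the same data frame and column names as the ddply call above:

```r
library(data.table)

# Convert once up front; repeated aggregations then reuse the same table.
dt <- as.data.table(data)

# Aggregate only the numeric columns that are not grouping keys,
# mirroring what numcolwise(median) does in the ddply version.
keys <- c("groupname", "starttime", "fPhase", "fCycle")
num_cols <- setdiff(names(dt)[sapply(dt, is.numeric)], keys)

df.median.dt <- dt[, lapply(.SD, median, na.rm = TRUE),
                   by = keys, .SDcols = num_cols]
```

The result should match the ddply output up to row order; whether it is faster on this particular 11,520-row data set would need to be confirmed with system.time() on the real data.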