ansaurus

Question

Answer 1


I'm confused -- a data.frame is, after all, a list as well. So besides the obvious

R> testdf <- data.frame(t=seq(1,5,1),e=seq(6,10,1))
R> mean(testdf)
t e 
3 8 
R> mean(mean(testdf))
[1] 5.5
R>
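A note for anyone running this on a recent R: the transcript above reflects R as of 2010. As far as the R release notes indicate, `mean()` on a data frame was made defunct in R 3.0.0 and now returns `NA` with a warning. A minimal sketch of the same computation, assuming a current R version, uses `colMeans()` instead:

```r
# Same toy data as in the answer
testdf <- data.frame(t = seq(1, 5, 1), e = seq(6, 10, 1))

# Per-column means as a named numeric vector (t = 3, e = 8)
col_means <- colMeans(testdf)

# Overall mean of the column means (5.5)
overall <- mean(col_means)

print(col_means)
print(overall)
```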

you could also do

R> lapply(testdf, mean)
$t
[1] 3

$e
[1] 8

R> mean(unlist(lapply(testdf, mean)))
[1] 5.5
R>

So for the inner lapply there, you could use mclapply as desired, no?
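The substitution can be sketched as follows. This assumes the `parallel` package that ships with current R (in 2010 `mclapply` lived in the `multicore` package), and the core count of 2 is an arbitrary choice for illustration. `mclapply` relies on forking, so on Windows only `mc.cores = 1` is supported and the call then runs serially.

```r
library(parallel)

# Same toy data as in the answer
testdf <- data.frame(t = seq(1, 5, 1), e = seq(6, 10, 1))

# Assumption: 2 workers where forking is available, 1 on Windows
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Drop-in replacement for lapply(testdf, mean); returns a named list
col_means <- mclapply(testdf, mean, mc.cores = n_cores)

# Flatten the list and take the overall mean, as in the serial version
overall <- mean(unlist(col_means))
print(overall)
```

Because `mclapply` returns a list just as `lapply` does, the `unlist()` step is unchanged.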

Dirk Eddelbuettel 2010-08-31 21:34:00

The purpose of using mclapply would be to turn a 6-hour simulation into a 3-hour simulation, so `mean(mean(testdf))`, while elegant, does not speed up the simulation. The unlist solution is precisely what I need! Thanks so much! Now I can just substitute mclapply for lapply and cut my simulation time in half!

ProbablePattern 2010-08-31 21:42:46

"Now I can just substitute mclapply for lapply and cut my simulation time in half!" Maybe. Remember there are fixed costs to parallelizing something; worker processes need to be started, etc.
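One way to see the fixed cost is to time both versions on a task that is deliberately too cheap to benefit. This is only a sketch: `cheap_task`, the input size, and the core count are made up for illustration, and the timings will differ from machine to machine, so no expected numbers are shown.

```r
library(parallel)

# Assumption: 2 workers where forking is available, 1 on Windows
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical task that does almost no work per element
cheap_task <- function(i) sqrt(i)
inputs <- 1:1000

# Time the serial and the forked version of the same computation
serial_time <- system.time(serial <- lapply(inputs, cheap_task))
forked_time <- system.time(forked <- mclapply(inputs, cheap_task, mc.cores = n_cores))

# Both approaches return the same list in the same order
stopifnot(identical(serial, forked))

print(serial_time["elapsed"])
print(forked_time["elapsed"])
```

With so little work per element, the cost of forking workers and collecting their results can easily outweigh any gain; a long-running simulation step is a much better candidate.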

Vince 2010-08-31 21:56:05

Yes, the `mean(mean(testdf))` was merely to establish the overall mean which you had not shown. I understand it was a stylized example. Glad to have been of help.

Dirk Eddelbuettel 2010-08-31 21:57:56

True, true. Burst my bubble, why don't you :) I do understand that it doesn't work precisely like that, but 4 cores should be substantially faster than 1 core on a 6-hour simulation.

ProbablePattern 2010-08-31 21:58:44

It all depends. For some things you may get near-linear speed-ups, for others you will not. That's what follow-up questions are for :)

Dirk Eddelbuettel 2010-08-31 22:02:42


Extracting lapply or mclapply results
