This question came at the right time, as I'm struggling with optimization as well. I am aware of the different "normal" optimization routines in R, and of parallel packages like snow, snowfall, Rmpi and the like. Yet I haven't managed to get an optimization running in parallel on my computer.

Some toy code to illustrate:

f <- function(x) sum((x - 1:length(x))^2)  # minimised at x = 1:length(x)
a <- 1:5                                   # starting values
optim(a, f)
nlm(f, a)

What I want to do is parallelize the optim() function (or the nlm() function, which does basically the same thing). My real function f() is a lot more complicated, and one optimization round lasts about half an hour. If I want to run a simulation of 100 samples, it takes ages. I'd like to avoid writing my own Newton-like algorithm for parallel computing, so I hope somebody can give me some hints on how to use parallel computing for complex optimization problems in R.
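For illustration, the closest I've come is parallelizing over the independent samples rather than inside optim() itself, along these lines with snowfall (fitOne and the dummy samples here are placeholders for my real model, and the cpus count is arbitrary):

library(snowfall)

## placeholder objective: in reality f() is much more complicated
f <- function(x, sample) sum((x - sample)^2)

## one independent optimization per simulated sample
fitOne <- function(sample) optim(rep(0, length(sample)), f, sample = sample)

## dummy data standing in for the 100 simulated samples
samples <- replicate(100, rnorm(5, mean = 1:5), simplify = FALSE)

sfInit(parallel = TRUE, cpus = 4)   # or point this at the cluster's cores
sfExport("f")
results <- sfLapply(samples, fitOne)
sfStop()

That only helps when there are many independent runs, though; a single half-hour optimization still runs on one core.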


I reckon this problem is of a different nature than the one in the related question. My request is specifically directed towards parallel computing, not some faster alternative for optim.

A: 

If I were after speed, the first thing I would do is go from R to C or C++. (Also Fortran, I hate to say.) When I had squeezed out every possible cycle using this technique, I would introduce MPI to take advantage of parallel hardware.

Mike Dunlavey
I have a cluster with 20+ cores available, and I want to use them. I have about 5 PhD students who churn out exotic new models at a furious rate, and all they know is R. So C / Fortran is not an option (unless I write the engine myself and call that code from within R, which is basically how many R functions work anyway). Bottom line: I really need **parallel** optimization thingies in **R**.
Joris Meys
Switching to C before going parallel in R is naive, and assumes that hiring C developers is cheaper than going parallel in R. This might be the case for @mike but clearly not the case for @joris. Any development choice is an optimization subject to a budget constraint. It's very important to be realistic about that constraint.
JD Long
@Joris: @JD: Just trying to help.
Mike Dunlavey
@Mike : for the record, I found the provided link very interesting, and I appreciate the effort you took to give me an answer. It just wasn't an answer to my question ;-).
Joris Meys
@mike, that's totally fair. I was probably a bit terse because it's very common, when an R person asks a performance question, for the first response from the CS crowd to be "well, you have to get out of R and into a 'real language' like C", which isn't particularly helpful ;) Thank you for giving some input.
JD Long
@JD: Yeah, it's always a risk on SO that you don't know how literally an OP means his/her question. The thing about interpreted languages like R is that they bring people in with ease of use, and later people want performance, but you can't *really* have both very easily. Matlab has some sort of compiler, FWIW.
Mike Dunlavey
@Mike Dunlavey: You shouldn't underestimate R, but indeed you have to know how to use it to make it perform. The main reason for using R in my case is the huge number of statistical techniques that are readily available. And I'm not talking about calculating means ;-)
Joris Meys
@Joris: I certainly don't wish to underestimate R. I work with people who use it heavily. The value is in what it makes easy for you, the things you can get done well in minimal code, not in its performance. In fact our experts use it as a scripting language to fire up other tools and manipulate the results, tools like NONMEM, WinBUGS, and our own language for pharmacometric mixed-effect modeling.
Mike Dunlavey
@Mike: have you taken a look at the inline package?
Joshua Ulrich
@Joshua: I'm afraid I don't know what that is.
Mike Dunlavey
@Mike: [inline](http://cran.r-project.org/web/packages/inline/) contains "Functionality to dynamically define R functions and S4 methods with in-lined C, C++ or Fortran code supporting .C and .Call calling conventions." Dirk Eddelbuettel has [an example](http://dirk.eddelbuettel.com/blog/2009/12/20/#rcpp_inline_example) on his blog.
Joshua Ulrich
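To make that concrete, here is a minimal sketch of what inline allows, assuming the default .Call convention and rewriting the toy objective from the question in C (the name fC is arbitrary):

library(inline)

## C version of the toy objective f(x) = sum((x - 1:length(x))^2),
## compiled on the fly the first time this runs
code <- "
  int n = LENGTH(x);
  double s = 0.0;
  for (int i = 0; i < n; i++) {
    double d = REAL(x)[i] - (i + 1);
    s += d * d;
  }
  return ScalarReal(s);
"
fC <- cfunction(sig = c(x = "numeric"), body = code, language = "C")

fC(as.numeric(1:5))          # should return 0
optim(as.numeric(1:5) + 0.5, fC)

The compiled objective can then be handed straight to optim() or nlm(), which is where the speed-up would come from.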
A: 

Sprint might be of interest. I know nothing about it but stumbled across it recently.

High Performance Mark
Thanks for the pointer, but I already knew about it. There are more frameworks for parallel computing in R, depending on the protocol you want to use. Yet I couldn't find a non-beta optimization function that uses the power of parallel computing.
Joris Meys
A: 

To answer my own question:

There is a package in development that looks promising. It has particle swarm optimization methods and builds on the Rmpi package for parallel computing. It can be found on RForge:

http://www.rforge.net/ppso/index.html

It's still in beta AFAIK, but it looks promising. I'm going to take a look at it later on and will report back when I know more. Still, I'll leave the question open, so if anybody else has another option...

Joris Meys
If you're considering PSO, have you thought about differential evolution (via the DEoptim package)? Parallel computing support is on the package's to-do list and shouldn't take more than a few hours of work (for me, not you :-).
Joshua Ulrich
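For reference, a minimal sketch of what that would look like on the toy problem from the question, assuming the DEoptim() interface, with arbitrarily chosen bounds and control settings (serial for now, since parallel support is still on the to-do list):

library(DEoptim)

f <- function(x) sum((x - 1:length(x))^2)

## wide, arbitrary box constraints for the 5-parameter toy problem
lower <- rep(-100, 5)
upper <- rep( 100, 5)

set.seed(1)
res <- DEoptim(f, lower, upper,
               control = DEoptim.control(NP = 50, itermax = 200, trace = FALSE))

res$optim$bestmem   # should be close to 1:5
res$optim$bestval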
@Joshua Thanks for the tip, I didn't know about DEoptim yet. It looks promising, but for the problem I'm working on now it's actually quite a bit slower than nlm(). I have 13 parameters and no clear lower and upper limits, so I have to set them rather wide to avoid missing a parameter...
Joris Meys
I tried the beta out and it seems to work. On my problem it still doesn't provide the same improvement as parallelizing the function itself, but I can see that in other cases this would really be a useful tool. I'm looking forward to the first stable release.
Joris Meys