I am starting to study machine learning, and it seems that R and MATLAB are both widely used in this field. They seem very similar, so how would one decide which is better to use?

+1  A: 

There are many discussions about which is better, but in the case of ML the answer is simple: R is a language that was designed for such tasks; there is also more, and better documented, ML software for R.

mbq
I find MATLAB's documentation to be excellent and very accessible
Amro
@Amro: He's speaking of ML in particular, not of documentation in general
nico
Right. I mean MATLAB has a few ML toolboxes which are documented, but most of the useful methods are either absent altogether or implemented as bare user-made scripts. Even with the toolboxes I always have the impression that they're "GUIsh"; I mean they perform some method, plot nice graphs and provide results that are randomly formatted and hard to reuse. R is much more flexible in this regard.
mbq
I agree that MATLAB has no single "ML Toolbox", but you can find many useful functions spread over core MATLAB and multiple supported toolboxes: Statistics, Bioinformatics, Neural Networks ... I made the remark about the documentation because it is well organized; it usually has a user guide, a function reference and a number of use cases and examples, which makes it easy for a newcomer to get started compared to R (I admit I sometimes get lost learning R as a complete beginner). One last thing R lacks is a good IDE, which is a must nowadays.
Amro
It is the same in R, but the overall MATLAB ML library is only a fraction of R's. On the other hand, if you like GUIs and don't see the restrictions they introduce, you should use Minitab, Statistica, Weka...
mbq
@Amro: R has several good IDE options: many people like Emacs with ESS or Eclipse with StatET. JGR, Tinn-R and Revolution are other options. See http://www.sciviews.org/_rgui/
Jeromy Anglim
+4  A: 

"Which is better"-questions usually depend heavily on the context. This is no exception.

What do you want to be able to achieve with machine learning? If you want to learn it just for the sake of understanding machine learning, then it is best to choose the language in which you can get most support from your immediate environment. Your friends know R inside out? Choose R. Anyway, both languages allow easy enough experimentation with machine learning for you to be able to get the general idea.

If you want to get into machine learning in order to do something more specific, there will be differences. Does your machine learning task involve images? Go with Matlab, because you might want to use image processing as well. Do you want to get deep into the theory behind machine learning and use fancy statistical methods for your novel algorithm? Choose R if you want to use its wealth of functions, or choose Matlab if its programming environment suits you better.

Jonas
Also, don't forget that the high license price for Matlab is sometimes one of the points to take into consideration
nico
@nico: In my experience, Matlab is usually "free" as well, i.e. someone else pays for it. Of course, if price plays a role, R wins out.
Jonas
@Jonas: still, someone has to pay! :) The lab where I work spent 8000 euros on Matlab and various toolkit licences, plus something like 800 euros/year to renew them. It's not a huge expense overall but still, nothing beats free :D :D :D
nico
Yes, but the context is well defined here. In the case of image processing for machine learning, it usually boils down to user-defined operations on raw pixmaps, so MATLAB is not offering anything extra here.
mbq
@mbq: MATLAB is a very nice language if you need to perform user-defined operations on N-D arrays. And maybe all you want to do is apply standard image processing filters, which all come with the image processing toolbox. So I would think that MATLAB has an advantage there.
Jonas
No, rather things like computing by-sector histograms or image kernels. And general BLAS capabilities are equal in MATLAB and R.
mbq
I agree on BLAS capabilities. But there are many ways to extract features for ML.
Jonas
@nico: Sure, nothing beats truly free. Again, it's a matter of context, though. An experimental biology lab like mine, for example, easily spends 800 euros per week on reagents (and if you work with mice, you spend that much per day), so 800 euros per year don't even register.
Jonas
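The "by-sector histograms" mentioned in the comments above are the kind of user-defined operation on a raw pixmap that needs no image processing toolbox. Below is a minimal sketch, written in Python purely for illustration; the function name, the square-sector layout, the toy image and the 0-255 intensity range are all assumptions, not anything specified in the thread.

```python
def sector_histograms(image, sector_size, bins, max_value=256):
    """Split a 2-D grid of pixel intensities into square sectors and
    return one intensity histogram (a list of `bins` counts) per sector.

    Hypothetical helper: assumes integer intensities in [0, max_value).
    """
    height, width = len(image), len(image[0])
    histograms = {}
    for top in range(0, height, sector_size):
        for left in range(0, width, sector_size):
            counts = [0] * bins
            # Visit only the pixels that fall inside this sector.
            for row in image[top:top + sector_size]:
                for pixel in row[left:left + sector_size]:
                    counts[pixel * bins // max_value] += 1
            # Key each histogram by its (sector_row, sector_column) index.
            histograms[(top // sector_size, left // sector_size)] = counts
    return histograms


# Toy 4x4 greyscale image (made-up values).
image = [
    [0,   10,  200, 250],
    [5,   20,  210, 255],
    [130, 140, 60,  70],
    [150, 160, 80,  90],
]

features = sector_histograms(image, sector_size=2, bins=2)
for sector, counts in sorted(features.items()):
    print(sector, counts)
```

The per-sector counts can then be concatenated into a feature vector for whichever learner is used.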
+1  A: 

I'd also say that R is better, for a number of reasons. I say this having used Matlab for a number of years before switching to R, and I wish I had learned R in the first place. There is a blog, Abandon Matlab, that lists a number of reasons why working with Matlab is sometimes very annoying. Here are the main points why R is more productive for me:

  • Matlab functions are called with inconsistent syntax across (and within) toolboxes. E.g. if I want to change the classifier in a model in R, I usually only need to change the name of the function and can keep the call and the data intact. In Matlab this usually involves reformatting the data and a totally different function call that I have to look up in the docs.

  • R has better data structures. I think the only workable construct in Matlab is the basic array, and working with anything other than numeric variables is awkward. Further, you can't refer to columns by name; you have to use the index of the variable (hmm, was it column 33 or 34 that I wanted to plot ...). You can't beat the data.frame in R!

  • R has a lot of useful packages for ML

  • Matlab has no named arguments to functions
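
The points above (a uniform calling convention across classifiers, columns addressed by name, and named arguments) are about R, but the idea is language-neutral. Here is a minimal sketch in plain Python, used purely for illustration; the two toy classifiers, the `evaluate` helper and the data are all hypothetical and are not taken from any R or Matlab package.

```python
from collections import Counter


class MajorityClass:
    """Toy classifier: always predicts the most frequent training label."""

    def fit(self, rows, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.label_ for _ in rows]


class NearestCentroid:
    """Toy classifier: predicts the label whose feature mean is closest."""

    def fit(self, rows, labels):
        sums, counts = {}, Counter(labels)
        for row, label in zip(rows, labels):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids_ = {
            label: [s / counts[label] for s in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, rows):
        def sq_dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(self.centroids_,
                key=lambda lab: sq_dist(row, self.centroids_[lab]))
            for row in rows
        ]


# Columns are addressed by name rather than by position (made-up data).
data = {
    "height": [1.0, 1.2, 3.0, 3.2, 3.1],
    "width": [1.0, 0.8, 3.0, 2.9, 3.3],
    "species": ["a", "a", "b", "b", "b"],
}


def evaluate(model, data, features, target):
    """Fit `model` on the named columns and return its training accuracy."""
    rows = list(zip(*(data[name] for name in features)))
    labels = data[target]
    predicted = model.fit(rows, labels).predict(rows)
    return sum(p == t for p, t in zip(predicted, labels)) / len(labels)


# Swapping the classifier changes one name; the call and the data stay the same.
for model in (MajorityClass(), NearestCentroid()):
    acc = evaluate(model, data, features=["height", "width"], target="species")
    print(type(model).__name__, acc)
```

Because both classes expose the same `fit`/`predict` pair, the loop never has to reformat the data, and the named `features` and `target` arguments make it hard to pick the wrong column by accident.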

Finally, if you work a lot with matrices and find the Matlab syntax nicer, then check out Python with NumPy and SciPy. Python also has some nice ML libraries such as PyBrain. I'm not going to compare R and Python here, because that's an entirely different question :)

Matti Pastell
