I am beginning some studies into machine learning and it seems these two are often used in this field. They seem very similar, so how would one decide which is best to use?
There are many discussions about which is better, but in the case of ML the answer is simple: R is a language that was designed for such tasks; there is also more and better-documented ML software for R.
"Which is better"-questions usually depend heavily on the context. This is no exception.
What do you want to be able to achieve with machine learning? If you want to learn it just for the sake of understanding machine learning, then it is best to choose the language for which you can get the most support from your immediate environment. Do your friends know R inside out? Choose R. Either way, both languages allow easy enough experimentation with machine learning for you to get the general idea.
If you want to get into machine learning in order to do something more specific, there will be differences. Does your machine learning task involve images? Go with Matlab, because you might want to use image processing as well. Do you want to get deep into the theory behind machine learning and use fancy statistical methods for your novel algorithm? Choose R if you want to use its wealth of functions, or choose Matlab if its programming environment suits you better.
I'd also say that R is better, for a number of reasons. I say this having used Matlab for a number of years before switching to R, and I wish I had learned R in the first place. There is a blog, Abandon Matlab, that lists a number of reasons why working with Matlab is sometimes very annoying. Here are the main points why R is more productive for me:
Matlab functions are called with inconsistent syntax across (and within) toolboxes. E.g., if I want to change the classifier in a model in R, I usually only need to change the name of the function and can keep the call and data intact. In Matlab this usually involves reformatting the data and a totally different function call that I have to look up in the docs.
R has better data structures. I think the only workable construct in Matlab is the basic array, and working with anything other than numeric variables is awkward. Furthermore, you can't refer to columns by name; you have to use the index of the variable (hmm, was it column 33 or 34 that I wanted to plot ...). You can't beat the data.frame in R!
R has a lot of useful packages for ML
Matlab has no named arguments to functions
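To make the first and last points concrete, here is a rough sketch. It is written in Python rather than R or Matlab, purely for illustration, and every class and function name in it is made up for this example (none comes from a real library). Two toy classifiers share the same `fit`/`predict` interface, so swapping one for the other only changes the name, and the keyword-only arguments make the call readable without looking up argument order.

```python
from collections import Counter


class MajorityClassifier:
    """Toy classifier (hypothetical): always predicts the most common training label."""

    def fit(self, rows, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.label_ for _ in rows]


class NearestCentroidClassifier:
    """Toy classifier (hypothetical): predicts the label whose feature mean is closest."""

    def fit(self, rows, labels):
        grouped = {}
        for row, label in zip(rows, labels):
            grouped.setdefault(label, []).append(row)
        # One centroid per label: the column-wise mean of that label's rows.
        self.centroids_ = {
            label: [sum(col) / len(col) for col in zip(*members)]
            for label, members in grouped.items()
        }
        return self

    def predict(self, rows):
        def sq_dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(self.centroids_, key=lambda lab: sq_dist(row, self.centroids_[lab]))
            for row in rows
        ]


def evaluate(model, *, train_rows, train_labels, test_rows, test_labels):
    """Fit on the training data and return accuracy on the test data.

    The bare * forces every data argument to be passed by name.
    """
    predictions = model.fit(train_rows, train_labels).predict(test_rows)
    hits = sum(p == t for p, t in zip(predictions, test_labels))
    return hits / len(test_labels)


# Made-up data, only there to exercise the two classifiers.
data = dict(
    train_rows=[[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1]],
    train_labels=["a", "a", "a", "b", "b"],
    test_rows=[[0.1, 0.1], [1.0, 1.0]],
    test_labels=["a", "b"],
)

# Swapping the classifier changes only the class name; the call and data stay intact.
majority_accuracy = evaluate(MajorityClassifier(), **data)
centroid_accuracy = evaluate(NearestCentroidClassifier(), **data)
```

The same idea is what makes switching models cheap in practice: the data is prepared once, and only the model name changes.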
Finally, if you work a lot with matrices and find the Matlab syntax nicer, then check out Python with NumPy and SciPy. Python also has some nice ML libraries such as PyBrain. I'm not going to compare R and Python here, because that's an entirely different question :)
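For a feel of how close NumPy is to Matlab for matrix work, here is a small sketch, assuming NumPy is installed. The matrix and vector values are arbitrary, and the Matlab equivalents in the comments are only approximate counterparts.

```python
import numpy as np

# Matlab: A = [1 2; 3 4];  b = [5; 6];
A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])   # 1-D array here, not an explicit column vector

product = A @ b              # matrix-vector product; Matlab: A * b
elementwise = A * A          # element-wise product;  Matlab: A .* A
transposed = A.T             # transpose;             Matlab: A'
x = np.linalg.solve(A, b)    # solve A x = b;         Matlab: A \ b
first_column = A[:, 0]       # Matlab: A(:, 1); NumPy indices start at 0
```

The main things to get used to are zero-based indexing and that `*` is element-wise, with `@` reserved for matrix multiplication.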