I have been using MATLAB for my work, but I have recently started learning Python. My work involves statistical analysis, more precisely geostatistics. In your view, which of the two languages is better suited to statistical analysis? What are the pros and cons of each, other than accessibility?
The SciPy and NumPy libraries for Python add a ton of MATLAB-equivalent functionality, to the point where Python may well have surpassed MATLAB as a scientific-computing resource.
As a language, Python is (in my opinion) far superior: function definitions, imports, et cetera are all a lot nicer to work with than MATLAB's more primitive equivalents.
That said, there is a lot of pre-written MATLAB code out there for analysis, given that it was a mainstay for such a long time.
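As a rough illustration of the NumPy/SciPy point above, here is a minimal sketch of a few everyday statistics calls with their approximate MATLAB counterparts noted in comments. The sample values are made up for the example, and the MATLAB mappings are only approximate.

```python
import numpy as np
from scipy import stats

# Hypothetical sample values, made up purely for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

mean_x = x.mean()       # roughly MATLAB's mean(x)
std_x = x.std(ddof=1)   # sample standard deviation, roughly MATLAB's std(x)

# Ordinary least-squares line through (x, y); the result also carries
# the correlation coefficient (rvalue) and a p-value.
fit = stats.linregress(x, y)

print("mean:", mean_x, "std:", std_x)
print("slope:", fit.slope, "intercept:", fit.intercept, "r:", fit.rvalue)
```

Note that NumPy's `std` defaults to the population form (`ddof=0`), so `ddof=1` is needed to get the sample standard deviation.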
I would pick Python because it can be as powerful as MATLAB but is free. You can also distribute your applications for free, with no licensing restrictions.
MATLAB is awesome but expensive (it has a great statistical package), and it will go more smoothly than Python in the beginning, but not in the long run.
Now, if you really want the best solution, check out R, the statistical package that is the de facto standard in the statistics community. There is even a Python interface to it. R is also free software.
MATLAB
- Good for beginners
- Good for interactive sessions
Python (with SciPy)
- Good for slightly experienced programmers
- Good for creating reusable applications
- Good for reading and exporting data files
- Free of cost
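To make the point about reading and exporting data files concrete, here is a small sketch using NumPy's `loadtxt` and `savetxt`. The CSV content and column names are hypothetical, and an in-memory buffer stands in for a real file so the example is self-contained.

```python
import io
import numpy as np

# Hypothetical CSV content standing in for a file on disk: x, y, measured value.
raw = io.StringIO("x,y,value\n0.0,0.0,1.2\n1.0,0.0,1.5\n0.0,1.0,0.9\n")

# skiprows=1 skips the header line; each remaining line becomes one array row.
data = np.loadtxt(raw, delimiter=",", skiprows=1)

# Write the array back out as CSV. A file name could be passed instead of a buffer.
out = io.StringIO()
np.savetxt(out, data, delimiter=",", header="x,y,value", comments="", fmt="%.2f")
exported = out.getvalue()

print(data.shape)
print(exported)
```

Passing `comments=""` stops `savetxt` from prefixing the header line with `# `, which keeps the output a plain CSV.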
If SciPy doesn't provide all the functionality you need out of the box, you may have to go searching on the Internet. I am not an expert on geostatistics, but here is a mailing-list post with some starting pointers: http://mail.scipy.org/pipermail/scipy-user/2007-November/014434.html
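As an example of how a missing geostatistics piece can be filled in with plain NumPy, here is a minimal sketch of an empirical semivariogram using the classical (Matheron) estimator: half the mean squared difference between values of point pairs, grouped by separation distance. The function name `empirical_semivariogram`, the lag bins, and the sample data are all hypothetical, and this is not taken from the linked post.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Classical estimator: half the mean squared difference per lag bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)

    # Indices of every unordered pair (i < j) of sample points.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    # One semivariance per lag bin; bins with no pairs stay NaN.
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * sq_diff[in_bin].mean()
    return gamma

# Hypothetical samples at unit spacing along a line.
coords = [[0, 0], [1, 0], [2, 0], [3, 0]]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(coords, values, bin_edges=[0.5, 1.5, 2.5, 3.5])
print(gamma)
```

The all-pairs approach needs memory proportional to the square of the number of samples, so it is only reasonable for small data sets.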
I have also heard that Python + R is a good combination, but I haven't tried it.
EDIT: Added a link for Python + R: http://rpy.sourceforge.net/