I will be teaching an econometrics course to master's students in the fall. I think it is important for them to learn programming with data as an essential applied research skill. What suggestions do you have for the programming language? I am leaning mostly towards R. What others should I consider?
Since you're interested in R, you could also take a look at Incanter. Since it's built with Clojure - a Lisp dialect for the JVM - you'd be able to leverage the vast array of existing Java libraries.
R is a very good choice. Go for it.
The number of resources on the web keeps increasing. One nice set of slides is provided by the UCLA Stat Consulting Center.
And as you are into econometrics, make sure you look at Grant Farnsworth's Econometrics in R on CRAN; the Applied Econometrics with R book by Zeileis and Kleiber is also very good.
I prefer R but other free options to consider would be:
a combination of Octave with gnuplot (Octave is a free language that is largely compatible with MATLAB)
Python with NumPy, SciPy and matplotlib
I'm surprised no-one else has mentioned Excel. As Brian Ripley once said (see slide 7):
Let’s not kid ourselves: the most widely used piece of software for statistics is Excel.
Indeed, Excel is an excellent tool for adding up columns of numbers. Having said that, if the analysis you are doing is any more complicated than that, you should definitely use a proper programming language.
Of the three obvious data manipulation languages (R, MATLAB and Python), R has the best data manipulation tools. See this other SO question for a more detailed comparison.
EDIT: Upon rereading this, I sound rather pro-Excel. I'd like to expand my answer to save my reputation.
Excel causes me many more problems than benefits. Its widespread use in my organisation is mostly detrimental. It makes it very hard to trace where data has come from, and how your computations work. Debugging Excel models is near impossible. It encourages local data stores instead of central databases. It doesn't work with diff tools and it makes reproducibility of your science hard. From a semantic point of view, it doesn't separate data and what-is-done-to-the-data. The idea that all your variables need a location distracts from understanding. The plotting capabilities are laughably awful.
All that said, Excel is useful for a few specific things:
As a CSV viewer. Sure, R has the View function, but it's not as pretty.
Really simple exploration of data. Sorting it, filtering it, adding up columns. I find that these can be done slightly quicker with a point and click interface than with code. Of course, you'll have to write code later for reproducibility, but in the initial stages, Excel is quite nice for this.
The graphs are distinctive and easy to spot. If you see someone give a presentation with a graph drawn in Excel, you know not to trust the results.
That's it. For anything else, it's a mess.
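For what it's worth, the "write code later for reproducibility" step for that simple exploration (sorting, filtering, adding up columns) is not much work. Here is a minimal sketch in Python using only the standard library; the column names and the inline sample data are made up for illustration and stand in for a real CSV file.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
RAW_CSV = """country,year,gdp,population
A,2001,10.5,3
B,2001,20.0,8
C,2001,5.25,1
A,2002,11.0,3
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "country": row["country"],
            "year": int(row["year"]),
            "gdp": float(row["gdp"]),
            "population": int(row["population"]),
        })
    return rows


rows = load_rows(RAW_CSV)

# Sort by GDP, largest first.
by_gdp = sorted(rows, key=lambda r: r["gdp"], reverse=True)

# Filter to a single year.
rows_2001 = [r for r in rows if r["year"] == 2001]

# Add up a column for the filtered rows.
total_gdp_2001 = sum(r["gdp"] for r in rows_2001)

print([r["country"] for r in by_gdp])
print(total_gdp_2001)
```

Unlike the point-and-click version, every step here is written down, so it can be rerun on new data and compared with a diff tool.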