views:

1196

answers:

15

Hi all,

I am working to become a scientific programmer. I have enough background in math and statistics but lack a programming background. I find it very hard to learn how to use a language for scientific programming (SP) because most of the references for SP are close to trivial.

My work involves statistical/financial modelling and no physics models. Currently, I use Python extensively with numpy and scipy. I have also used R and Mathematica. I know enough C/C++ to read code. I have no experience in Fortran.

I don't know if this is a good list of languages for a scientific programmer. If it is, what is a good reading list for learning the syntax and design patterns of these languages in scientific settings?

+7  A: 

In terms of languages, I think you have a good coverage. Python is great for experimentation and prototyping, Mathematica is good for helping with the theoretical stuff, and C/C++ are there if you need to do serious number crunching.

I might also suggest you develop an appreciation of an assembly language and also a functional language (such as Haskell), not really to use, but rather because of the effect they have on your programming skills and style, and of the concepts they bring home to you. They might also come in handy one day.

I would also consider it vital to learn about parallel programming (concurrent/distributed) as this is the only way to access the sort of computing power that sometimes is necessary for scientific problems. Exposure to functional programming would be quite helpful in this regard, whether or not you actually use a functional language to solve the problem.
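To illustrate why a functional style helps with parallelism, here is a minimal sketch in Python using only the standard library. The names `count_hits`, `estimate_pi` and `estimate_pi_parallel` are made up for this example, and the Monte Carlo estimate of pi is just a stand-in for a real workload. The point is that a pure function applied with a map can be moved from serial to parallel execution by swapping the map.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def count_hits(task):
    """Pure function: count random points that fall inside the unit quarter circle."""
    seed, n_samples = task
    rng = random.Random(seed)  # private generator, so no shared state between tasks
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(n_chunks, samples_per_chunk, mapper=map):
    """Estimate pi; `mapper` is any map-like callable (serial or parallel)."""
    tasks = [(seed, samples_per_chunk) for seed in range(n_chunks)]
    total_hits = sum(mapper(count_hits, tasks))
    return 4.0 * total_hits / (n_chunks * samples_per_chunk)


def estimate_pi_parallel(n_chunks, samples_per_chunk, max_workers=None):
    """Same computation, with the chunks spread over worker processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return estimate_pi(n_chunks, samples_per_chunk, mapper=pool.map)
```

When run as a script, call `estimate_pi_parallel(8, 100_000)` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Because each chunk has its own seed, the serial and parallel versions should return the same estimate.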

Unfortunately I don't have much to suggest in the way of reading, but you may find The Scientist and Engineer's Guide to Digital Signal Processing helpful.

Artelius
I have strong appreciation of Haskell :)
leon
In that case, learn assembly language. IMO the best way to do that is to write a toy kernel in assembly language, because you'll learn a million things besides.
Artelius
Oh yeah, and there's always The Art of Computer Programming (by Knuth)
Artelius
You will learn a million things by learning assembler, but that's something like saying to learn biology, study physics first. Sure you'll learn a ton, but (a) not everyone needs to understand everything about how computers or software work deep down (though more general knowledge is a fine thing to have), and (b) there are other paths more immediately applicable to his field of inquiry that could also provide much insight.
mlimber
@mlimber: it's a matter of opinion. Note that I used "suggest" and "IMO" about this issue. The OP should choose something that suits him.
Artelius
+4  A: 

I would suggest any of the Numerical Recipes books (pick a language); they should be useful.

Depending on the languages you use, or if you will be doing visualization, there may be other suggestions.

Another book I really like is Object-Oriented Implementation of Numerical Methods, by Didier Besset. He shows how to implement many equations in Java and Smalltalk, but more importantly, he does a fantastic job of showing how to optimize equations for use on a computer and how to deal with the errors that arise from the computer's limitations.

James Black
+1 for Besset. NR books need to be taken with a grain of salt--code is awful, though usually functional.
Drew Hall
I will never forgive NR (even 3rd ed, 2007) for advising people to pad signals with zeroes up to a power of two. So much work ruined... :-(
Jon Harrop
+1  A: 

For generic C++ in scientific environments, Modern C++ Design by Andrei Alexandrescu is probably the standard book about the common design patterns.

Georg Fritzsche
MC++D is a fantastic book, but it's not for C++ beginners like the OP, nor is it any more useful for specifically scientific applications than is the GoF's original _Design Patterns_. If you don't know how to write your own template classes and functions and partially specialize them, for instance, you'll need a firmer grounding in the language before picking up this book.
mlimber
I don't know about the specific needs of the OP, but for "design patterns in [some] scientific environments" it's a valuable foundation IMO. Some lab teams here see it as the initial must-read, that's why I brought it up.
Georg Fritzsche
A: 

This might be useful: The Nature of Mathematical Modeling.

David Lehavi
+7  A: 

My first suggestion is that you look at the top 5 universities for your specific field, look at what they're teaching and what the professors are using for research. That's how you can discover the relevant language/approach.

Also have a look at this stackoverflow question ("practices-for-programming-in-a-scientific-environment").

You're doing statistical/finance modeling? I use R in that field myself, and it is quickly becoming the standard for statistical analysis, especially in the social sciences, but in finance as well (see, for instance, http://rinfinance.com). Matlab is probably still more widely used in industry, but I have the sense that this may be changing. I would only fall back to C++ as a last resort if performance is a major factor.

Look at these related questions for help finding reading materials related to R:

In terms of book recommendations related to statistics and finance, I still think that the best general option is David Ruppert's "Statistics and Finance" (you can find most of the R code here and the author's website has matlab code).

Lastly, if your scientific computing isn't statistical, then I actually think that Mathematica is the best tool. It seems to get very little mention amongst programmers, but it is the best tool for pure scientific research in my view. It has much better support for things like integration and partial differential equations than Matlab. They have a nice list of books on the Wolfram website.

Shane
+16  A: 

At some stage you're going to need floating point arithmetic. It's hard to do it well, less hard to do it competently, and easy to do it badly. This paper is a must read:

What Every Computer Scientist Should Know About Floating-Point Arithmetic
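As a small taste of what the paper covers, here is a minimal Python sketch (standard library only, assuming the usual IEEE 754 double precision floats) of three common surprises: inexact decimal fractions, accumulated rounding error, and absorption of a small term by a large one.

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation, so equality fails.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True: compare with a tolerance instead

# Rounding errors accumulate in a naive running sum.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)                  # False

# math.fsum keeps track of the low-order bits that a plain loop loses.
values = [0.1] * 10
print(math.fsum(values) == 1.0)      # True

# Absorption: a small addend can vanish next to a huge one.
print((1e16 + 1.0) - 1e16)           # 0.0, not 1.0
```

The practical habits that follow are to compare floats with a tolerance, to use a compensated summation where accuracy matters, and to be wary of mixing magnitudes that differ by many orders.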

Tim
+1, this is probably one of the most fundamental things in scientific computing
Artelius
+15  A: 

Hi

I thoroughly recommend

Scientific and Engineering C++: An Introduction with Advanced Techniques and Examples by Barton and Nackman

Don't be put off by its age, it's excellent. Numerical Recipes in your favourite language (so long as it is C, C++ or Fortran) is compendious and excellent for learning from, though it does not always give the best algorithm for each problem.

I also like

Parallel Scientific Computing in C++ and MPI: A Seamless Approach to Parallel Algorithms and their Implementation by Karniadakis

The sooner you start parallel computing, the better.

Regards

Mark

High Performance Mark
Do not, under any circumstances, use Numerical Recipes to try to learn a programming language.
Graphics Noob
Shit, too late, by about 25 years. Oh, what a wasted life. And I stand by my comment that NR is an excellent text for learning scientific programming, which is about a lot more than a programming language.
High Performance Mark
Numerical Recipes was ok 25 years ago but it is a joke today.
Jon Harrop
+1  A: 

Once you are up and running, I would strongly recommend reading this blog.

It describes how you use C++ templates to provide type safe units. So for example, if you multiply velocity by time you get a distance etc.
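The blog's approach relies on C++ templates, so unit errors are caught at compile time. As a rough illustration of the idea only, here is a runtime analogue sketched in Python. The `Quantity` class is hypothetical and tracks just metres and seconds; it is not the blog's technique, and its checks happen when the code runs rather than when it compiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value tagged with integer exponents of metre and second (hypothetical helper)."""
    value: float
    metre: int = 0
    second: int = 0

    def __mul__(self, other):
        # Multiplying quantities adds the unit exponents.
        return Quantity(self.value * other.value,
                        self.metre + other.metre,
                        self.second + other.second)

    def __add__(self, other):
        # Addition only makes sense for identical units.
        if (self.metre, self.second) != (other.metre, other.second):
            raise TypeError("cannot add quantities with different units")
        return Quantity(self.value + other.value, self.metre, self.second)


velocity = Quantity(3.0, metre=1, second=-1)  # 3 m/s
duration = Quantity(4.0, second=1)            # 4 s
distance = velocity * duration                # seconds cancel, leaving metres
print(distance)
```

Here `velocity * duration` yields a quantity in metres, while `velocity + duration` raises `TypeError`. The C++ template version rejects the same mistake before the program ever runs.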

Richard Corden
You might also be interested in "units of measure" in Microsoft's new F# programming language.
Jon Harrop
+3  A: 

Donald Knuth's book on seminumerical algorithms.

Kinopiko
+2  A: 

MATLAB is widely used in engineering for design, rapid development, and even production applications (my current project has a MATLAB-generated DLL for doing some advanced number crunching that was easier to do than in our native C++, and our FPGAs use MATLAB-generated cores for signal processing too, which is much easier than coding the same by hand in VHDL). There's also a financial toolbox for MATLAB that may be of interest to you.

This is not to say that MATLAB is the best choice for your field, but at least in engineering, it's widely used and not going anywhere soon.

mlimber
+2  A: 

One issue scientific programmers face is maintaining a repository of code (and data) that others can use to reproduce your experiments. In my experience this is a skill not required in commercial development.

Here are some readings on this:

These are in the context of computational biology, but I assume they apply to most scientific programming.

Also, look at Python Scripting for Computational Science.

pufferfish
Nice link to the Quick Guide to Organizing Computational Biology Projects. Professor Noble does wonderful work. :)
James Thompson
A: 

Donald Knuth: Seminumerical Algorithms, Volume 2 of The Art of Computer Programming

Press, Teukolsky, Vetterling, Flannery: Numerical Recipes in C++ (the book is great, just beware of the license)

Modern C++ Design

and have a gander at the source code for the GNU Scientific Library.

Jason
The license... and the awful code and advice.
Jon Harrop
+4  A: 

I'm a scientific programmer who just entered the field in the past 2 years. I'm into more biology and physics modeling, but I bet what you're looking for is pretty similar. While I was applying to jobs and internships there were two things that I didn't think would be that important to know, but caused me to end up missing out on opportunities. One was MATLAB, which has already been mentioned. The other was database design -- no matter what area of SP you're in, there's probably going to be a lot of data that has to be managed somehow.

The book Database Design for Mere Mortals by Michael Hernandez was recommended to me as being a good start and helped me out a lot in my preparation. I would also make sure you at least understand some basic SQL if you don't already.
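For a feel of what "basic SQL" means in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The `measurement` table and its columns are invented for the example; the pattern of create, insert, then aggregate with `GROUP BY` is the part that carries over to real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE measurement (sample TEXT, value REAL)")

# Parameterised inserts keep data separate from the SQL text.
conn.executemany(
    "INSERT INTO measurement (sample, value) VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0)],
)

# Count and average the values for each sample.
rows = conn.execute(
    "SELECT sample, COUNT(*), AVG(value) FROM measurement "
    "GROUP BY sample ORDER BY sample"
).fetchall()
print(rows)  # [('a', 2, 2.0), ('b', 1, 10.0)]
conn.close()
```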

Jelly
A: 

Reading source code helps a lot, too. Python is great in this sense. I have learnt a great deal just by digging through the source code of scientific Python tools. On top of this, following your favourite tools' mailing lists and forums can enhance your skills further.

Gökhan Sever
A: 

Writing Scientific Software: A Guide to Good Style is a good book with overall advice for modern scientific programming.

e.tadeu