I have done a couple of research jobs in bioinformatics and used Matlab for them. Matlab had a lot of powerful tools and was easy to use. I did things with genome sequencing and predicting metabolic pathways. I am wondering what other people think is best. Or perhaps there is no single best language, but a few that lend themselves well to bioinformatics work that is math-heavy and deals with large amounts of data?

+3  A: 

Python + scipy are decent (and FREE).

http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/

http://www.google.com/search?hl=en&source=hp&q=python+bioinformatics&aq=0&aqi=g9g-m1&aql=&oq=python+bio&gs_rfai=CeE1nPpMNTN2IJZ-yMZX6pcIKAAAAqgQFT9DLSgo

You do not really even need to learn new syntax when dropping Matlab for SciPy.

Hamish Grubijan
My university uses Python for all of the bioinformatics stuff.
Brendan Long
+3  A: 

Best or not, SAS is the de facto programming environment in biopharmas. If you were to work for the Pfizers, Mercks and Bayers of the world in bioinformatics, you had better have SAS skills. SAS programmers are in great demand.

Mikos
Mikos lives in Cambridge; he knows what he is talking about.
Hamish Grubijan
This might change, however. The best tools do win in the end. The downside of Python is that it is supported by volunteers.
Hamish Grubijan
De facto or not, this depends on which university you come from. Some newer schools love Python.
J-16 SDiZ
I have also worked extensively with biopharmas as a consultant, so I am speaking from over a decade's experience. This industry is highly regulated, changes at a glacial pace, and is not known for switching technologies. @Sdiz - Python might be popular in academia, but industry is a different ballgame.
Mikos
@Mikos: You're right of course, but I can't bring myself to upvote an answer that recommends using SAS.
Richie Cotton
I was neither soliciting an upvote nor "recommending" SAS. My post was a 'sic stat' on the industry, should the OP choose to work there. Of course, academia offers much more freedom in the choice of toolkits etc.
Mikos
My experience in bioinformatics at small and large biopharma suggests that SAS was the de facto standard for *statistics*, whereas R was overwhelmingly more common in *bioinformatics*.
bubaker
@mikos, are you sure about that? The SAS tag has only 200 instances on Stack Overflow; I don't think you are right.
i am a girl
@mikos: please respond to this. I really don't think it's worth learning SAS unless you are going into something highly specialized. The statistics on SO show it.
i am a girl
@I am a girl - you are entitled to your viewpoint. But I also do not think that posts on SO are representative of SAS' usage in biopharma. I stand by my statement. If you need to validate this, talk to statisticians who work in pharma.
Mikos
+11  A: 

You'll likely be interested in this thread over at BioStar:

For most of us bioinformaticians, this includes Python, R, Perl, and bash command line utilities (like sed, awk, cut, sort, etc). There are also people who code in Java, Ruby, C++, and Matlab.

So the bottom line? Whichever language lets you get the work done most easily is the right one for you. Answering this question should include a careful survey of the libraries and other code that you can pull from, as well as your own preferences and experience. If you're doing microarray analysis, it's hard to beat the R/Bioconductor libraries, but R is absolutely the wrong language for someone wrangling most types of large sequencing data sets.

chrisamiller
Very good point. It is sometimes not so much about the language as the library. Doing matrix manipulation in Perl seems crazy; in C++/Java it is not bad; Python has SciPy; and it is native in Matlab. If regular expressions are heavily used, then Perl can be a good candidate, as can Ruby, Python, Java and even C++. It all depends. I am biased towards Python :)
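To make the regular-expression point concrete, here is a minimal Python sketch (an illustration, not code from this thread). The helper name find_motif and the example sequence are hypothetical; the pattern is the commonly cited N-glycosylation motif N-{P}-[ST]-{P}, used only as an example.

```python
import re

# Illustrative pattern (assumption): N, then any residue except P,
# then S or T, then any residue except P.
N_GLYCOSYLATION = "N[^P][ST][^P]"


def find_motif(sequence, pattern):
    """Return (start, match) pairs for every occurrence, including overlaps.

    `find_motif` is a hypothetical helper written for this example.
    Wrapping the pattern in a lookahead lets the regex engine report
    overlapping occurrences, which a plain finditer would skip.
    """
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


if __name__ == "__main__":
    # Made-up protein fragment, not real data.
    for start, hit in find_motif("MNGTANPSANVSK", N_GLYCOSYLATION):
        print(start, hit)
```

The same pattern would work almost unchanged in Perl or Ruby, which is roughly the point: for this kind of task the choice comes down to the surrounding libraries rather than the regex syntax.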
Hamish Grubijan
+4  A: 

There's no one right language for bioinformatics.

  • The important BLAST sequence-search tool is written in C++

  • The MATT tool for aligning protein structures is written in C

  • Some of my colleagues in computational biology use Ruby.

In general, I see a lot of C and C++ for performance-critical code and a lot of scripting languages otherwise.

Norman Ramsey
A: 

What's the "best" language is both subjective and potentially different from task to task, but for bioinformatic work, I personally use R, Perl, Delphi and C (quite frequently a combination of several of these).

PhiS
A: 

I work mainly with HMMs and protein sequences. I started out writing in C, but have since switched to Python, which I'm happy with. I find it's easier to prototype something quickly, and it results in easier-to-maintain code.
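As a rough illustration of how little code a quick HMM prototype needs in Python, here is a minimal Viterbi decoder using only the standard library. Everything about the model is an assumption made up for the example: the two states ("core" and "loop"), the four-residue alphabet, and all probabilities are toy values, not a real protein model.

```python
import math

# Toy model (all values are illustrative assumptions, not real parameters).
STATES = ("core", "loop")
START_P = {"core": 0.5, "loop": 0.5}
TRANS_P = {
    "core": {"core": 0.8, "loop": 0.2},
    "loop": {"core": 0.2, "loop": 0.8},
}
EMIT_P = {
    "core": {"L": 0.4, "V": 0.4, "G": 0.1, "S": 0.1},
    "loop": {"L": 0.1, "V": 0.1, "G": 0.4, "S": 0.4},
}


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a non-empty sequence."""
    # Log probabilities avoid numerical underflow on long sequences.
    first = observations[0]
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][first]) for s in states}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1

    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + math.log(emit_p[s][obs])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    print(viterbi(list("LVVLGSSG"), STATES, START_P, TRANS_P, EMIT_P))
```

For real work, a performance-critical inner loop like this is the part that might be moved to C or to an array library later; the prototype is mainly useful for checking the model logic first.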

Colin
