Scientific computing is algorithm intensive and can also be data intensive. It often needs a lot of memory to run one analysis and then releases it before continuing with the next. Sometimes it also uses a memory pool to recycle memory across analyses. A managed language is interesting here because it lets the developer concentrate on the application logic. Since the application might need to deal with huge datasets, performance is important too. But how can we control memory and performance with a managed language?
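
To make the memory-pool idea concrete, here is a minimal Python sketch of the pattern, using only the standard library. `BufferPool` and `run_analysis` are hypothetical names made up for illustration, not part of any library, and the "analysis" is just a sum of squares standing in for real work.

```python
from array import array


class BufferPool:
    """Hypothetical pool that recycles fixed-size buffers of doubles."""

    def __init__(self, buffer_len, count):
        self.buffer_len = buffer_len
        # Preallocate every buffer once, up front.
        self._free = [array("d", [0.0]) * buffer_len for _ in range(count)]

    def acquire(self):
        # Reuse a free buffer if there is one, otherwise allocate a new one.
        if self._free:
            return self._free.pop()
        return array("d", [0.0]) * self.buffer_len

    def release(self, buf):
        # Hand the buffer back so the next analysis can reuse it.
        self._free.append(buf)


def run_analysis(pool, samples):
    """Hypothetical analysis: sum of squares computed in a pooled scratch buffer.

    Assumes len(samples) <= pool.buffer_len.
    """
    buf = pool.acquire()
    try:
        for i, x in enumerate(samples):
            buf[i] = x * x
        return sum(buf[: len(samples)])
    finally:
        # Always return the buffer, even if the analysis raises.
        pool.release(buf)
```

Calling `run_analysis(pool, data)` repeatedly with the same `pool` reuses the same scratch buffers instead of allocating fresh ones on every run, which is the part a garbage collector will not do for you.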

+1  A: 

Not exactly sure what the question is, but you might want to check out Fortress.

Kevin Wong
+5  A: 

You are asking a fundamentally flawed question. The entire point of managed languages is that you don't handle memory. That is handled by the garbage collector. While you can take certain actions to help it do its job efficiently, it is not your job to do its job.

The things you can do to improve performance in a world where performance is not controlled by you are simple. Make sure you don't hold onto references you don't need. And use stack-based variables if you need more control over the situation.
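
As a rough illustration of the "don't hold onto references" advice, here is a small Python sketch. `Dataset` and `is_reclaimed_after_release` are hypothetical names, and the behaviour described is what I would expect on CPython; other runtimes may reclaim the object later.

```python
import gc
import weakref


class Dataset:
    """Hypothetical stand-in for a large intermediate result."""

    def __init__(self, n):
        self.values = [0.0] * n


def is_reclaimed_after_release():
    data = Dataset(1_000_000)
    probe = weakref.ref(data)  # weak reference: does not keep `data` alive
    assert probe() is data     # still alive while we hold a strong reference

    data = None                # drop the only strong reference
    gc.collect()               # ask the collector to run (a hint, not a hard guarantee on every runtime)
    return probe() is None     # True once the object has been reclaimed
```

The weak reference is only there to observe what happens. The point is that once the last strong reference is gone, the runtime is free to reclaim the memory before the next analysis starts.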

Guvante
Garbage collected languages do often give you control over memory layout though, e.g. structs in .NET.
Jon Harrop
+5  A: 

F# seems to be somewhat targeted at this audience. There is actually a book called F# for Scientists.

Also, this question was asked over at Lambda the Ultimate.

Steve Steiner
+1  A: 

I would think that functional languages would be best suited to this type of task.

Ed Swangren
A: 

With a managed language you don't get that control as easily. The whole point of these languages is to handle allocation, garbage collection, and so on for you. Each managed language will handle that differently.

With Perl, running out of memory is considered a fatal error. You can save the day to a small degree with $^M, but only if your perl has been compiled with that feature and you add code provisions for it.

J.J.
+5  A: 

You might be surprised at the number of people who use Matlab for this. It can be considered a programming language and certainly manages its own memory (with support for huge data sets, etc.), so it should be seriously considered as a solution here.

Further, it can generate program code (this may require a separate plugin?), so once you arrive at an algorithm you want to package up, you can have it generate the C code to perform the work you originally had in your M script or Simulink model.

Adam Davis
+1  A: 

I think I would paraphrase the question as: is the .NET memory manager capable of handling memory management for scientific computing, where hand-tuned routines have traditionally been used to improve memory performance, especially for very large (gigabyte-sized) matrices?

The author of this article certainly believes that it is: Harness the Features of C# to Power Your Scientific Computing Projects

As others have pointed out, a major point of managed code is that you don't need to deal with memory management tasks yourself. This is a major advantage as it allows you to concentrate on the algorithms.

David Dibben
A: 

Because of its overhead, a .NET application will incur a performance penalty relative to an unmanaged application. However, because this overhead is more-or-less a constant unrelated to the overall size of the application (WARNING: over-simplification), it becomes relatively less of a penalty the larger the application.

So I would go with .NET (so long as it provides you with the libraries you need). Managing memory is a pain, and you have to do it a lot to be good at it. Within .NET, choose whatever language you're most comfortable with, so long as it's not J# or VB.NET and is C#.

MusiGenesis
Wow, I don't know *what* the heck I was talking about in that first paragraph. I must have been drunk. I still think C# is an excellent choice for scientific computing software, but geez.
MusiGenesis
+9  A: 

Python has become pretty big in scientific computing lately. It is a managed language, so you don't have to remember to free your memory. At the same time, it has packages for scientific and numerical computing (NumPy, SciPy), which give you performance similar to compiled languages. Also, Python can be pretty easily integrated with C code.

Python is a very expressive language, making it easier to write and read than many traditional languages. It also resembles Matlab in some ways, making it easier to use for scientists than, say, C++ or Fortran.

The University of Oslo has recently started teaching Python as the default language for all science students outside the department of informatics (who still learn Java).

Simula Research Laboratory, which is heavily into scientific computing, partial differential equations etc., uses Python extensively.

knatten
Python may be adequate for calling existing code written in performant languages but it is obviously wholly inadequate for writing performant code, which was the subject of this question.
Jon Harrop
Which is exactly what's going on with NumPy and SciPy. They are implemented in C, wrapped in Python modules, and are roughly as efficient as the equivalent C code. So I would argue that my answer is still relevant, and not at all inadequate.
knatten
I think the OP wants to write new performant code himself and not just call someone else's code from Python.
Jon Harrop
Jon, NumPy and SciPy are Python libraries that can be used to write performant scientific applications entirely in Python.
knatten
+2  A: 

BlackBox Component Builder, developed by Oberon microsystems, is the component-based development environment for the programming language "Component Pascal".

Due to its stability, performance and simplicity, BlackBox is perfectly suited for science and engineering applications.

http://www.oberon.ch/blackbox.html

(Disclosure: I work for Oberon microsystems)

Regards, tamberg

tamberg
A: 

The best option is Python with NumPy/SciPy/IPython. It has excellent performance because the core math is happening in libraries written in highly optimized C and Fortran. Since you interact with it using Python, everything from your perspective is clean and managed with extremely succinct, readable code and garbage collection.

indentation
+1  A: 

The short answer is that you can control the memory and performance of programs written in managed languages by choosing a suitable language (like OCaml or F#) and learning how to optimize in that language. The long answer requires a book on the specific language you are using, such as OCaml for Scientists or Visual F# 2010 for Technical Computing.

The subjects you need to learn about are algorithmic optimizations, low-level optimizations, data structures and the internal representation of types in your chosen language. If you are writing parallel algorithms then it is also particularly important to learn about caches.
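
To give a feel for why the internal representation of types matters, here is a small sketch in Python (chosen only because it needs nothing beyond the standard library; the same boxed-versus-unboxed distinction exists in other managed languages). `boxed_bytes` and `packed_bytes` are hypothetical helper names, and the measurement relies on `sys.getsizeof`, so it assumes CPython.

```python
import sys
from array import array


def boxed_bytes(values):
    # A list stores pointers; each float is a separate heap object,
    # so count the list itself plus every float it points to.
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


def packed_bytes(values):
    # An array('d') stores the raw doubles contiguously in one buffer.
    return sys.getsizeof(array("d", values))


n = 100_000
values = [float(i) for i in range(n)]
boxed, packed = boxed_bytes(values), packed_bytes(values)
print(f"list of floats: {boxed} bytes, packed array: {packed} bytes")
```

The packed form should come out several times smaller, and because the values sit next to each other in memory it is also much friendlier to the caches mentioned above.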

Jon Harrop