I recently started work on a research project that has to do with evolving cellular automata rules (for a given task, follow the link if you're curious).

I am currently evaluating tools that could be used for the job. Here's what I have so far:

  • MASON + ECJ
  • Swarm
  • MATLAB
  • Mathematica
  • Some ad-hoc Python CA implementation + Pyevolve
  • from scratch, develop my own framework

Considering that:

  1. MASON/Swarm learning curves are not exactly steep
  2. I know close to nothing about MATLAB/Mathematica
  3. I know very little Python
  4. I already developed my own home-grown GA framework in C# before

I am tempted to roll my own CA simulation from scratch and hook it up to my dodgy framework. It could take me a bit of time to crank something out, but I'd be able to estimate the amount of work required better than I could if I had to learn a new language/platform.

The risk is (obviously) reinventing the wheel and running into issues/delays that would make the DIY option worse than the rest. I am mainly thinking of performance problems that could force me to spend a lot of time optimizing my ad-hoc code, whereas some of the frameworks mentioned have been developed specifically for heavy computation.

This is a very important choice for this project (given that it is a 6-month project), so I'd like to hear people's opinions/experiences.

IMPORTANT NOTE: this is for an MSc final research project which is supposed to take me a total of 6 months, so I do not have as much time to invest as I would on a PhD effort.

+1  A: 

I've been doing large numerical simulations for some time now, and here is what I've learned:

In general, coding in a tool/language you do not know well will likely be slower and harder to maintain than coding using tools you are very familiar and comfortable with. That said,

Matlab:

This is my first choice for something quick and dirty, or something that requires plotting. Sometimes I will write the simulation code in C++ and import the results (as a text file) into Matlab afterwards for post-processing. The learning curve is not very steep, but it will take some days to get familiar with the way Matlab goes about things. At the beginning, it helps to have other people around who have been using it for a while. There are also good online resources.

Matlab is very powerful for prototyping, relatively straightforward, and has good debugging support. The syntax is not complex, but optimizing and vectorizing code can be tricky. Octave is a free, open-source alternative. Usually fast, but it is a bit of a resource hog. Large projects may get difficult to manage.

Mathematica

This is what Wolfram used for 'A New Kind of Science', which deals with Cellular Automata. You may like the book or not, but Mathematica is likely a good tool for your problem. While Matlab leans towards the numerical end, Mathematica leans towards the symbolic end. It has very good visualization tools.

That said, I used it for one of my grad school courses, and found Mathematica's syntax so frustrating (coming from a Matlab background) that I could not use it.

C++

This is my tool of choice whenever I can put the right amount of time into a problem. It just feels right, and it can be compiled/executed on pretty much anything. Performance-wise it is very good, and you can really trim and optimize code if you know what you are doing. Tons of libraries are freely available, including multiple Cellular Automata ones. The learning curve is steep beyond basic usage. But it is my first choice.

Python

I have very little experience with Python, but people who use it swear by it. There is a collection of tools called SciPy you should check. There is at least one Cellular Automata toolkit (Google for 'Python CAGE').

C#

What I do not like about C# is that it can be killed or modified at the whim of a company from one day to the next (same with Matlab and Mathematica) - if your research will span years, this is risky, as 3 years from now you may need to go through old code again and fix things you did not break.

But if you have a C# library that works, why not just use C# for all of it? Matlab will be fast at the beginning, but if the project gets too complex, it will end up being a pain to maintain. C++ will take some time to learn, but will pay back big time for you in the long term. Same with Python. Not sure what to think about Mathematica.

cjcela
Thanks a lot for your point of view - it helps to get an idea of what's out there. I have decent skills in C#, C++ and Java, but at the moment I am leaning towards Matlab: I quickly knocked together a CA simulation, and the GA toolkit is just too powerful to overlook. You basically have to write just the fitness function (that's where the CA will run) and everything else comes pretty much out of the box. I am still experimenting with Swarm and MASON as well, but I am thinking of abandoning the DIY option: much as I'd like to do it, I'd rather focus on the actual topic than on implementation details.
JohnIdol
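To make the "you only write the fitness function, and the CA runs inside it" idea concrete, here is a minimal Python sketch. It is not tied to any toolkit mentioned in this thread. Since the actual task is not stated here, it assumes a majority (density classification) task purely for illustration, and the names `rule_table_from_number`, `step` and `fitness` are hypothetical.

```python
import random


def rule_table_from_number(number, radius=1):
    """Expand a rule number into a lookup table indexed by neighbourhood value."""
    size = 2 ** (2 * radius + 1)
    return [(number >> index) & 1 for index in range(size)]


def step(cells, rule_table, radius=1):
    """Apply one synchronous update of a 1D binary CA with periodic boundaries."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        # Read the neighbourhood as a binary number (leftmost cell is the
        # most significant bit) and use it as an index into the rule table.
        index = 0
        for offset in range(-radius, radius + 1):
            index = (index << 1) | cells[(i + offset) % n]
        new_cells.append(rule_table[index])
    return new_cells


def fitness(rule_table, radius=1, width=31, steps=62, trials=50, seed=0):
    """Fraction of random start configurations the rule classifies correctly.

    ASSUMPTION: a majority task is used as a stand-in objective. A run counts
    as correct if every cell ends up equal to the initial majority value.
    """
    rng = random.Random(seed)  # fixed seed so candidate rules are comparable
    correct = 0
    for _ in range(trials):
        cells = [rng.randint(0, 1) for _ in range(width)]
        target = 1 if 2 * sum(cells) > width else 0  # width is odd: no ties
        for _ in range(steps):
            cells = step(cells, rule_table, radius)
        if all(cell == target for cell in cells):
            correct += 1
    return correct / trials
```

A GA library would then treat the rule table (a flat list of bits) as the genome and call `fitness` on each candidate; selection, crossover and mutation stay on the library's side.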
+1  A: 

I'm a user of both Matlab and Mathematica. Given your admission that you know close to nothing about either, I'd recommend Mathematica. Once you 'get' its syntax I think it's a better fit for your task than Matlab -- in particular (a) it has much more extensive symbolic capabilities built in (I've not used Matlab's Symbolic Toolbox because I already have Mathematica, so I can't comment on the former), and (b) the latest releases of Mathematica provide parallelisation out of the box, whereas Matlab offers the Parallel Computing Toolbox at additional cost.

For a research project I wouldn't be too concerned with execution performance (though you may have good reasons for being concerned with it). I'd be more concerned with development performance, i.e. how quickly you can develop and test good CAs. Both Matlab and Mathematica have very good built-in visualisation and data handling facilities.

Since you are only facing a 6-month graduate project, I don't see why you should be too concerned about getting locked in to an individual supplier -- start with one tool, stick with it, and don't upgrade during the project.

I also think that you are right to be concerned about re-inventing the wheel by developing a home-brew C# application. What is the objective of your research? To develop software or to study the behaviour of CAs? If the latter, then you want a working implementation as quickly and as painlessly as possible, so you can get on with your objective.

I don't know how relevant it is to point out, again, that Stephen Wolfram, the 'father' of Mathematica was at one time a major figure in the study of CAs, and that the software he developed is very well suited to that field. Recent releases have built-in CAs -- check the online documentation for details -- which may or may not be useful. There is a host of work already published (in books, in journals, on-line) showing uses of Mathematica in the field of evolutionary algorithms of all descriptions. My impression is that Matlab is much less used in these fields.

High Performance Mark
You're right in suggesting that my concern is not putting together software but running simulations as quickly and painlessly as possible, so I am increasingly ruling out the DIY option. With regard to Matlab vs Mathematica, the reason I was considering Mathematica is, as you say, that it was basically designed by Wolfram for CA. So far, I have been able to quickly put together some CA simulations in Matlab, and I am investigating the Matlab GA toolbox, which seems pretty straightforward to use. I will definitely give Mathematica a shot, though, before I make a final decision.
JohnIdol