views: 545

answers: 4

I am quite excited by the possibility of using languages which have parallelism/concurrency built in, such as Stackless Python and Erlang, and I firmly believe that we'll all have to move in that direction before too long - or will want to, because it will be a good, easy way to achieve scalability and performance.

However, I am so used to thinking about solutions in a linear/serial/OOP/functional way that I am struggling to cast any of my domain problems in a way that merits using concurrency. I suspect I just need to unlearn a lot, but I thought I would ask the following:

  • Have you implemented anything reasonably large in Stackless Python, Erlang, or another such language?
  • Why was it a good choice? Was it a good choice? Would you do it again?
  • What characteristics of your problem meant that concurrent/parallel was right?
  • Did you recast an existing problem to take advantage of concurrency/parallelism, and if so, how?

Does anyone have any experience they are willing to share?

+1  A: 

The problems will stay almost the same in the future, but the underlying hardware for solving them is changing. To take advantage of this, the way of communication between objects (components, processes, services, whatever you call them) will change. Messages will be sent asynchronously, without waiting for a direct response. Instead, after a job is done, the process will call the sender back with the answer. It's like people working together.
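
For illustration, here is a minimal Erlang sketch of that style of interaction - the module name, the message shapes, and the trivial add job are all made up for this example. The client fires off a request and keeps working; the worker calls back with a tagged reply when it is done.

    %% async_demo: fire-and-continue messaging with a callback-style reply.
    %% (Illustrative names; not from any particular library.)
    -module(async_demo).
    -export([start_worker/0, request/2]).

    %% The worker waits for jobs, does the work, then calls the sender back.
    start_worker() ->
        spawn(fun loop/0).

    loop() ->
        receive
            {From, Ref, {add, A, B}} ->
                %% ... potentially long-running work here ...
                From ! {Ref, A + B},   %% call the sender back with the answer
                loop()
        end.

    %% The client sends the request and returns immediately; the reply
    %% arrives later as an ordinary message tagged with Ref.
    request(Worker, Job) ->
        Ref = make_ref(),
        Worker ! {self(), Ref, Job},
        Ref.

A caller could then do Ref = async_demo:request(Worker, {add, 1, 2}), carry on with other work, and only later run receive {Ref, Sum} -> Sum end to pick up the answer.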

I'm currently designing a lightweight event-driven architecture based on Erlang/OTP. It's called Tideland EAS. I describe the ideas and principles here: http://code.google.com/p/tideland-eas/wiki/IdeasAndPrinciples. It's not ready, but maybe you'll understand what I mean.

mue

Mue
It's not limited to the way objects communicate. The whole architectural reasoning must shift - just as it did for OOP - otherwise you'll have concurrently architected serial apps, which is pointless. You are right about the hardware; it's already happened with dual-core CPUs which we don't make use of.
Simon
Yep, it's a paradigm shift. I only used the word 'objects' to make it more familiar. In some ways Erlang processes are a special kind of object. Actor systems have been influenced by the message passing of Smalltalk, and Erlang is like Smalltalk-72 in a number of ways.
Mue
A: 

Erlang makes you think of the problem in parallel. You won't forget it for one second. After a while you adapt. Not a big problem. Except the solution becomes parallel in every little corner. All other languages you have to tweak to be concurrent. And that doesn't feel natural. Then you end up hating your solution. Not fun.

The biggest advantage Erlang has is that it has no global garbage collection. It will never take a break. That is kind of important when you have 10,000 page views a second.

Flinkman
+2  A: 

Before we had operating systems, people building applications would sit down and discuss things like:

  • how will we store data on disks
  • what file system structure will we use
  • what hardware will our application work with
  • etc, etc

Operating systems emerged from collections of 'developer libraries'.

The beauty of an operating system is that your UNWRITTEN software has certain characteristics; it can:

  • talk to permanent storage
  • talk to the network
  • run in a command line
  • be used in batch
  • talk to a GUI
  • etc, etc

Once you have shifted to an operating system - you don't go back to the status quo ante...

Erlang/OTP (i.e. not just plain Erlang) is an application system - it runs on two or more computers.

The beauty of an APPLICATION SYSTEM is that your UNWRITTEN software has certain characteristics; it can:

  • fail over between two machines
  • work in a cluster
  • etc, etc...

Guess what: once you have shifted to an Application System, you don't go back either...
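
As a small taste of where those "for free" characteristics come from, here is a minimal sketch of an OTP supervisor - both the module and the my_worker child it manages are hypothetical names for this example. On a single node it gives you automatic restarts of crashed workers; distributed OTP applications build fail-over between machines on the same foundations.

    %% my_sup: a minimal OTP supervisor (assumes a gen_server-style module
    %% named my_worker exists; both names are made up for illustration).
    -module(my_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        %% one_for_one: if my_worker crashes, restart just it, at most
        %% 5 times in 10 seconds before the supervisor itself gives up.
        {ok, {{one_for_one, 5, 10},
              [{my_worker,
                {my_worker, start_link, []},
                permanent, 5000, worker, [my_worker]}]}}.

The worker code itself does not have to mention any of this - which is exactly the "unwritten software" point above.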

You don't have to use Erlang/OTP - Google have a good Application System in their App Engine - so don't get hung up on the language syntax.

There may well be good business reasons to build on the Erlang/OTP stack rather than the Google App Engine - the biz dev guys in your firm will make that call for you.

Gordon Guthrie
+1 Very interesting response, thank you. However, I feel like there's a difference between algorithmic parallelism and distributed serial processing, which is what I think models like GAE represent. I haven't quite made my mind up about this yet.
Simon
+1  A: 

In the past, when desktop machines had a single CPU, parallelization only applied to "special" parallel hardware. But these days desktops usually have from 2 to 8 cores, so parallel hardware is now the standard. That's a big difference, and therefore it is not just about which problems suggest parallelism, but also about how to apply parallelism to a wider set of problems than before.

In order to take advantage of parallelism, you usually need to recast your problem in some way. Parallelism changes the playground in many ways:

  • You get data coherence and locking problems, so you need to try to organize your problem so that you have semi-independent data structures which can be handled by different threads, processes, and computation nodes.
  • Parallelism can also introduce nondeterminism into your computation, if the relative order in which the parallel components do their jobs affects the results. You may need to protect against that, and define a parallel version of your algorithm which is robust against different scheduling orders (a small sketch of this follows after this list).
  • When you transcend intra-motherboard parallelism and get into networked / cluster / grid computing, you also get the issues of network bandwidth, network going down, and the proper management of failing computational nodes. You may need to modify your problem so that it becomes easier to handle the situations where part of the computation gets lost when a network node goes down.
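
To make the first two points concrete, here is a small Erlang sketch of one common recasting (the module name is made up for illustration): the data is split into independent chunks, each chunk is handled by its own process so there is no shared state to lock, and the replies are collected by tag in the original chunk order so the answer does not depend on which worker finishes first.

    %% par_map: map a function over chunks in parallel, deterministically.
    -module(par_map).
    -export([pmap/2]).

    pmap(Fun, Chunks) ->
        Parent = self(),
        %% One worker process per chunk; each worker gets its own copy of
        %% the data, so there is no locking and no shared mutable state.
        Refs = [begin
                    Ref = make_ref(),
                    spawn(fun() -> Parent ! {Ref, Fun(Chunk)} end),
                    Ref
                end || Chunk <- Chunks],
        %% Collect replies in the order of the original chunks, not in
        %% completion order, so the combined result is deterministic.
        [receive {Ref, Result} -> Result end || Ref <- Refs].

For example, par_map:pmap(fun lists:sum/1, [[1,2],[3,4],[5,6]]) returns [3,7,11] no matter which worker replies first.
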
antti.huima
