I came across the following statement in Trapexit, an Erlang community website:

Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.

Also, I recall reading somewhere that Twitter switched from Ruby to Scala to address a scalability problem.

Hence I wonder: what is the relation between a programming language and scalability?

I would think that scalability depends only on the system design, exception handling etc. Is it because of the way a language is implemented, the libraries, or some other reasons?

Hope for enlightenment. Thanks.

+7  A: 

Erlang is highly optimized for a telecommunications environment, running at 5 9s uptime or so.

It contains a set of libraries called OTP, and it is possible to reload code into the application 'on the fly' without shutting down the application! In addition, there is a framework of supervisor modules and so on, so that when something fails, it gets automatically restarted, or else the failure can gradually work itself up the chain until it gets to a supervisor module that can deal with it.

That would be possible in other languages too, of course. In C++, you can reload DLLs on the fly and load plugins. In Python, you can reload modules. In C#, you can load code on the fly, use reflection and so on.

It's just that that functionality is built in to Erlang, which means that:

  • it's more standard, any Erlang developer knows how it works
  • less stuff to re-implement oneself

That said, there are some fundamental differences between languages, to the extent that some are interpreted, some run off bytecode, some are native compiled, so the performance, and the availability of type information and so on at runtime differs.

Python (at least the standard CPython implementation) has a global interpreter lock around its runtime, so threads within a single process cannot make use of SMP.

Erlang only recently had changes added to take advantage of SMP.

Generally I would agree with you in that I feel that a significant difference is down to the built-in libraries rather than a fundamental difference between the languages themselves.

Ultimately I feel that any project that gets very large risks getting 'bogged down' no matter what language it is written in. As you say, architecture and design are pretty fundamental to scalability, and choosing one language over another will not magically give awesome scalability...

Hugh Perkins
thanks for the rather detailed answer. I am too new to Erlang to really understand but I get the whole argument. thx
idazuwaika
That "five nines of uptime" fact is just a myth. It comes from a silly calculation Joe Armstrong did based on a system that had been running for a number of years and what downtime it had during that window. That is not how reliability is calculated. If it were, then any system running uninterrupted for 5 minutes would have 100% reliability over that window.
Christian
@Christian, silly or not silly, the "five nines" is de facto defined in terms of (one year - downtime) / one year. The silliness comes from the fact that for Erlang people prefer to speak about five nines of "uptime" and not five nines of "service availability". The consumer is probably not satisfied if he gets a "five nines" for the supervisor tree, while worker processes are only there for "nine fives" :).
Zed
@Christian, Zed: In the telco world, Zed is totally right. I am certain network operators would be duped by **your** definition of uptime.
jldupont
I would say that the fundamental properties of the languages are very significant in explaining the differences in building scalable systems. They determine the fundamental ways in which you design such systems and their basic properties. Libraries are of course very important, but their structure and style are determined by the properties of the language. So, yes, you can build a message-passing infrastructure on top of C++/Java, but no, it will not be as safe as in Erlang, as the underlying language does not enforce what is needed to make it safe.
rvirding
+1  A: 

First you have to distinguish between languages and their implementations. For instance, the Ruby language supports threads, but in the official implementation, threads will not make use of multicore chips.

Then, a language/implementation/algorithm is often termed scalable when it supports parallel computation (for instance via multithreading) AND it exhibits a good speedup as the number of CPUs goes up (see Amdahl's law).
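
To make the Amdahl's law reference concrete, here is a minimal Python sketch of the formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of CPUs. The function name and the 95% figure are only illustrative assumptions, not measurements of any real system.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the run time parallelizes perfectly.
    for n in (1, 2, 8, 64, 1024):
        print(f"{n:>5} CPUs -> speedup {amdahl_speedup(0.95, n):.2f}x")
```

With a 5% serial part the speedup can never reach 20x, however many CPUs are added, which is why "supports threads" alone does not make something scalable.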

Some languages like Erlang, Scala, Oz etc. also have syntax (or nice libraries) that helps in writing clear and clean parallel code.

paradigmatic
+2  A: 

Erlang is a language designed with concurrency in mind. While most languages depend on the OS for multi-threading, concurrency is built into Erlang. Erlang programs can be made from thousands to millions of extremely lightweight processes that can run on a single processor, can run on a multicore processor, or can run on a network of processors. Erlang also has language level support for message passing between processes, fault-tolerance etc. The core of Erlang is a functional language and functional programming is the best paradigm for building concurrent systems.

In short, making a distributed, reliable and scalable system in Erlang is easy as it is a language designed specially for that purpose.

Vijay Mathew
Erlang was designed for reliability, not specifically for concurrency. The concurrency was a nice bonus.
Tim
+4  A: 

Erlang comes from another culture in thinking about reliability and how to achieve it. Understanding the culture is important, since Erlang code does not become fault-tolerant by magic just because it's Erlang.

A fundamental idea is that high uptime does not only come from a very long mean-time-between-failures; it also comes from a very short mean-time-to-recovery when a failure does happen.
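
A rough way to see this is the usual steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below is only an illustration of that arithmetic; the function names and the hour figures are made-up assumptions, not data from any real system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime over a 365-day year for a given availability."""
    return (1.0 - avail) * 365 * 24 * 60


if __name__ == "__main__":
    # Hypothetical system that fails about once every 1000 hours.
    slow = availability(mtbf_hours=1000, mttr_hours=1.0)   # manual reboot
    fast = availability(mtbf_hours=1000, mttr_hours=0.01)  # automatic restart
    print(f"slow recovery: {slow:.6f}, {downtime_minutes_per_year(slow):.1f} min/year")
    print(f"fast recovery: {fast:.6f}, {downtime_minutes_per_year(fast):.1f} min/year")
```

The failure rate is the same in both cases; only the recovery time changes, and that alone moves the result by roughly two "nines".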

One then realizes that one needs automatic restarts when a failure is detected, and that at the first sign of something not being quite right one should "crash" to cause a restart. The recovery needs to be optimized, and the possible information losses need to be minimal.

This strategy is followed by many successful software systems, such as journaling filesystems or transaction-logging databases. But overwhelmingly, software tends to consider only the mean-time-between-failures: it sends messages about error indications to the system log and then tries to keep running until that is no longer possible. This typically requires a human to monitor the system and reboot it manually.

Most of these strategies are in the form of libraries in Erlang. The part that is a language feature is that processes can "link" and "monitor" each other. The first is a bi-directional contract: "if you crash, then I get your crash message, which if not trapped will crash me". The second is: "if you crash, I get a message about it".

Linking and monitoring are the mechanisms that the libraries use to make sure that other processes have not crashed (yet). Processes are organized into "supervision" trees. If a worker process in the tree fails, the supervisor will attempt to restart it, or all workers at the same level of that branch in the tree. If that fails, it will escalate up, and so on. If the top-level supervisor gives up, the application crashes and the virtual machine quits, at which point the system operator should make the computer restart.
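
For readers who do not know Erlang, the restart-then-escalate behaviour can be sketched in Python. This is only a toy illustration under heavy assumptions: `supervise`, `RestartLimitExceeded` and `FlakyWorker` are hypothetical names, everything runs in one thread, and a plain restart counter stands in for OTP's real mechanism (separate processes, links, and a restart limit counted over a time window).

```python
from typing import Callable


class RestartLimitExceeded(Exception):
    """Raised when the supervisor gives up and escalates to its own parent."""


def supervise(worker: Callable[[], object], max_restarts: int = 3) -> object:
    """Run worker; if it crashes, restart it, up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker()
        except Exception as crash:
            if restarts >= max_restarts:
                # Give up: let the failure travel up the tree.
                raise RestartLimitExceeded(
                    f"worker crashed {restarts + 1} times"
                ) from crash
            restarts += 1


class FlakyWorker:
    """Hypothetical worker that crashes a fixed number of times, then succeeds."""

    def __init__(self, crashes_before_success: int) -> None:
        self.remaining = crashes_before_success
        self.starts = 0

    def __call__(self) -> str:
        self.starts += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise RuntimeError("simulated crash")
        return "done"
```

Calling `supervise(FlakyWorker(2))` returns `"done"` after two restarts, while a worker that keeps crashing makes `supervise` raise `RestartLimitExceeded`, which a parent supervisor could handle in the same way.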

The complete isolation between process heaps is another reason Erlang fares well. With few exceptions, it is not possible to "share values" between processes. This means that all processes are very self-contained and are often not affected by another process crashing. This property also holds between nodes in an Erlang cluster, so it is low-risk to handle a node failing out of the cluster. Replicate and send out change events rather than have a single point of failure.

The philosophies adopted by Erlang have many names: "fail fast", "crash-only system", "recovery oriented programming", "expose errors", "micro-restarts", "replication", ...

Christian
A: 
John Topley
This doesn't answer the fundamental aspect of the question now does it?
jldupont
I was adding more detail to the Twitter example.
John Topley
+2  A: 

In short, the "language" primarily affects the vertical axis of scaling, but not all aspects, as you already alluded to in your question. Two things here:

1) Scalability needs to be defined in relation to a tangible metric. I propose money.

S = # of users / cost

Without an adequate definition, we will be discussing this point ad vitam aeternam. Using my proposed definition, it becomes easier to compare system implementations. For a system to be scalable (read: profitable), then:

Scalability grows with S

2) A system can be made to scale along two primary axes:

  • a) Vertical
  • b) Horizontal

a) Vertical scaling relates to enhancing nodes in isolation i.e. bigger server, more RAM etc.

b) Horizontal scaling relates to enhancing a system by adding nodes. This process is more involved since it requires dealing with real-world properties such as the speed of light (latency), tolerance to partition, failures of many kinds etc.

(Node => physical separation, different "fate sharing" from another)

The term scalability is too often abused unfortunately.


Too many times folks confuse language with libraries & implementation. These are all different things. What makes a language a good fit for a particular system often has more to do with the support around the said language: libraries, development tools, efficiency of the implementation (i.e. memory footprint, performance of built-in functions etc.)

In the case of Erlang, it just happens to have been designed with real world constraints (e.g. distributed environment, failures, need for availability to meet liquidated damages exposure etc.) as input requirements.

Anyways, I could go on for too long here.

jldupont
+1  A: 

In addition to the points made here about Erlang (which I was not aware of), there is a sense in which some languages are more suited for scripting and smaller tasks.

Languages like Ruby and Python have some features which are great for prototyping and creativity but terrible for large-scale projects. Arguably their best feature is their lack of "formality", which hurts you in large projects.

For example, static typing is a hassle on small script-type things, and makes languages like Java very verbose. But on a project with hundreds or thousands of classes you can easily see variable types. Compare this to maps and arrays that can hold heterogeneous collections, where as a consumer of a class you can't easily tell what kind of data it's holding. This kind of thing gets compounded as systems get larger. For example, you can also do things that are really difficult to trace, like dynamically adding bits to classes at runtime (which can be fun but is a nightmare if you're trying to figure out where a piece of data comes from) or calling methods that raise exceptions without being forced by the compiler to declare the exception. Not that you couldn't solve these kinds of things with good design and disciplined programming - it's just harder to do.
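
A small Python sketch of the difference, with made-up names (`load_user_untyped`, `load_user_typed`, `User`) used purely for illustration. Note that Python itself does not enforce these annotations at runtime; the assumption is that a separate static checker such as mypy is run over the code.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


def load_user_untyped() -> Dict[str, Any]:
    # Heterogeneous map: the consumer cannot tell that "age" is a string
    # or that "tags" mixes strings and integers without reading this code.
    return {"name": "alice", "age": "42", "tags": ["admin", 7]}


@dataclass(frozen=True)
class User:
    # Declared shape: readers and static checkers can see every field's type.
    name: str
    age: int
    tags: Tuple[str, ...]


def load_user_typed() -> User:
    return User(name="alice", age=42, tags=("admin",))


if __name__ == "__main__":
    print(load_user_typed().age + 1)      # fine: age is declared as an int
    print(load_user_untyped()["age"])     # a str here; adding 1 would fail at runtime
```

With the dictionary version, a mistake such as adding 1 to the string age only shows up as a TypeError when that line actually runs, which is the kind of problem that compounds as a codebase grows.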

As an extreme case, you could (performance issues aside) build a large system out of shell scripts, and you could probably deal with some of the issues of the messiness, lack of typing and global variables by being very strict and careful with coding and naming conventions (in which case you'd sort of be creating a static typing system "by convention"), but it wouldn't be a fun exercise.

Steve B.
Erlang is dynamically typed and can certainly build large systems.
Chris Hagan
A: 

You are confusing the language with the runtime platform. For example, Java is a language, but the JVM is the runtime platform. As another example, Python is a language, but there are multiple runtime platforms: CPython, Jython, PyPy, IronPython (.NET). Scala is a language that uses the JVM as its runtime. Erlang is no different: it is a language and a runtime platform. The platform is just highly optimized and designed for concurrency and high availability.

fuzzy lollipop
A: 

Erlang is at its core based on asynchronous communication (both for co-located and distributed interactions), and that is the key to the scalability made possible by the platform. You can program with asynchronous communication on many platforms, but Erlang the language and the Erlang/OTP framework provide the structure to make it manageable, both technically and in your head. For instance: without the isolation provided by Erlang processes, you will shoot yourself in the foot. With the link/monitor mechanism you can react to failures sooner.
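
As a rough illustration of the style, here is a Python sketch of an "actor" that owns its state and is only reachable through a mailbox. The names `counter_actor` and `run_demo` are hypothetical, and this uses the standard `queue` and `threading` modules. The important caveat is that Python threads share memory, so the isolation here holds only by convention, whereas Erlang processes enforce it.

```python
import queue
import threading


def counter_actor(mailbox: queue.Queue, replies: queue.Queue) -> None:
    """Own the counter privately; change it only in response to messages."""
    total = 0
    while True:
        message = mailbox.get()
        if message == "stop":
            replies.put(total)
            return
        total += message


def run_demo() -> int:
    mailbox: queue.Queue = queue.Queue()
    replies: queue.Queue = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox, replies))
    actor.start()
    for n in range(1, 101):
        mailbox.put(n)  # asynchronous send: the sender does not wait for a reply
    mailbox.put("stop")
    result = replies.get(timeout=5)
    actor.join(timeout=5)
    return result


if __name__ == "__main__":
    print(run_demo())
```

Nothing stops other Python code from reaching into shared objects directly, which is exactly the foot-shooting that process isolation is meant to rule out.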

Kresten Krab Thorup