How often is the performance of a programming language a significant issue?

+3 Q:

It seems that I often hear people criticize certain programming languages because they "have poor performance", or because some other language is "faster" in general (not necessarily for a specific application). However, my experience and education have taught me that anytime you have a performance problem, at least one of the following is probably happening:

The bottleneck isn't in the CPU, it's in some other device, such as the network or the hard drive.
The poor performance is caused by your algorithms, not by the language you're using.

My general impression is that the speed of a programming language itself is all but irrelevant in the vast majority of cases, with exceptions for serious data processing problems. Even in those cases, I believe you could use a hybrid approach and use a lower-level language only for the CPU-intensive pieces so that you wouldn't lose the benefits of the more abstract language altogether.

Do you agree? Is programming language speed insignificant most of the time, or do the critics have a right to point out language performance issues?

I hope this question isn't too subjective, but it seems to me that there should be a relatively objective answer to this.

+1 A:

This is impossible to answer so broadly. It's like asking if big engines are a waste in cars. Well, for some people, yes. For others, not at all. And all sorts in between.

There are a myriad of factors that come into play. What is your target environment? End-user deployment or servers? Let's suppose we're talking about web development and coding for a server. RoR is well-known to be (relatively) slow. .NET is pretty fast by comparison. But RoR also has RAD qualities that .NET can't compete with.

Is getting your app up-and-running yesterday more of a priority than scalability?
Does your business model live or die by the milliseconds it takes to serve a page, or by how quickly you get to market?
Does your TCO and application architecture support scaling out or scaling up? Do you even expect to need to scale up?

Those are just a tiny handful of the questions an architect has to answer when making platform/language decisions. Does speed matter? Sometimes. If I am planning to write a LoB service that will eventually need to scale to thousands of transactions per second, and it will be deployed in an enterprise environment, I will probably go with .NET. If I have an idea for a Web 2.0 business like selling Twitter T-shirts, I need to capitalize on that idea yesterday, and I know I probably won't get slammed with enough business to bring the site down before I can prepare for it.

This is honestly over-simplifying a very complex issue, but hopefully it illustrates the point that it's impossible to simply "say" whether it matters or not.

Rex M 2009-05-19 03:35:45

+ I agree, in actual practice there are a lot of factors to consider, and you're right to point them out. However, I feel that if one tries to clear away the "fog", there's an objective reality underneath that can be discussed. Then the fog can be added back in.

Mike Dunlavey 2009-05-19 14:05:59

+3 A:

Performance can be a serious concern in libraries, operating systems, and the like. However, I believe that upwards of 90% of the time raw performance is irrelevant.

What is more important in many cases is TIMING. Any garbage-collected language is going to have some unpredictability in this regard, which makes such languages unsuited to embedded and realtime design spaces.

The overlap of GC'd and "slow" languages is considerable, and so you may see a language discounted for speed reasons when the real problem is inconsistent timing.

There are some allocation/threading/etc. schemes that allow for garbage collection while also guaranteeing the runtime of parts of the system, such as Realtime Java, though I haven't personally seen it in use anywhere.

Short answer: most of the time the speed of the language is irrelevant (within reason); language choices are made based on familiarity and available libraries.

Kevin Montrose 2009-05-19 03:37:10

Why are people downvoting this answer? It isn't necessarily the way I would answer this question but it is a thoughtful contribution to the discussion.

Mark Brittingham 2009-05-19 03:52:28

+ @Mark: I agree, it is a thoughtful answer.

Mike Dunlavey 2009-05-19 13:28:15

+1 A:

Amazingly, the performance of a system is a combination of the programming language, the system it's executing on, the operations that system is performing and the external resources (network, disk, slow line printers, etc.) that it relies upon.

If your system is slow, rather than guessing, test it.

If there is any "Rule" in computing, it's "Test your assumptions". Everything else is a rough guideline.
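As a minimal sketch of what "test it" can look like, here is one way to profile a small job with Python's standard `cProfile` module. The functions `simulated_io`, `compute`, and `job` are hypothetical stand-ins made up for illustration, not anything from a real system; the point is only that the measurement, not a guess, tells you where the time went.

```python
import cProfile
import io
import pstats
import time


def simulated_io():
    # Hypothetical stand-in for a network or disk wait.
    time.sleep(0.05)


def compute():
    # Hypothetical stand-in for CPU-bound work.
    return sum(i * i for i in range(2_000))


def job():
    simulated_io()
    return compute()


def profile_job():
    """Profile one run of job() and return cumulative seconds per function name."""
    profiler = cProfile.Profile()
    profiler.enable()
    job()
    profiler.disable()
    stats = pstats.Stats(profiler, stream=io.StringIO())
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    return {key[2]: value[3] for key, value in stats.stats.items()}


cumulative = profile_job()
```

Printing `sorted(cumulative.items(), key=lambda kv: -kv[1])` lists the functions by cumulative time. In this toy case the sleep should dominate, which is the kind of result that says the bottleneck is not the language.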

Will Hartung 2009-05-19 03:38:21

You're right, Will, but people do say things like: "I assert PredicateX(languageY, languageZ)", and others variously agree or disagree. I think there's no harm in asking if these assertions are more than completely subjective.

Mike Dunlavey 2009-05-19 20:30:58

+1 A:

I think it's a good question. To answer it requires having a general framework for thinking about performance, so let me try to provide one. (Some of this is going to sound really obvious, but bear with me.)

To keep things simple, let's just consider the simple case of applications that have a specific job to do, and that start, and then finish, and what you care about is wall-clock time. Let's assume a standard CPU cycle rate, and a mono-processor.

The time duration consists of a stream of time-slices (nanoseconds, say). To do that job, there is a minimum amount of time required, and it is usually greater than zero. There is no maximum amount of time required. If a program spends longer than the minimum number of nanoseconds, then some of those nanoseconds are being spent, strictly speaking, unnecessarily (i.e. for poor reasons).

So, to optimize a program's execution time, it is necessary to find the nanoseconds it is spending that do not have to be spent (i.e. that do not have good reasons) and remove them.

One way to do this is to, if possible, step through the program and keep track at each step of why it is doing that step. If the reason is not good, there is an opportunity for removing steps.

Another way to do this is to select nanoseconds at random from the program's execution, and inquire into their reasons. For example, the program counter can tell you what the program is doing, but the call stack can tell you why. In order for the nanosecond to be spent for a good reason, every call instruction on the call stack has to have a good reason. If any instruction on the call stack does not have a good reason, then there is an opportunity to optimize. In fact, the amount of time that instruction is on the call stack is the amount of time that would be saved by its removal.

In some kinds of software that are highly asynchronous, message-driven, or interpreted, the call stack may not provide enough information. In that case, to answer why a given nanosecond is being spent may be more difficult. It may require examining more state information than just the call stack. For example, in an interpreter, the stack of the program being interpreted may also need to be examined. However, often the hardware call stack does provide sufficient information, so it is a useful thing to examine.
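The stack-sampling idea above can be sketched in a few lines of Python. This is an illustration under assumptions, not a real profiler: `slow_helper` and `workload` are hypothetical, the samples are taken at roughly regular intervals rather than truly at random, and `sys._current_frames()` is a CPython implementation detail. Since Python is interpreted, what gets sampled here is the stack of the interpreted program, not the hardware stack.

```python
import collections
import sys
import threading
import time


def slow_helper():
    # Hypothetical wasteful step: busy-waits for about 20 ms.
    deadline = time.perf_counter() + 0.02
    while time.perf_counter() < deadline:
        pass


def workload():
    for _ in range(10):
        slow_helper()


def sample_stacks(target, interval=0.002):
    """Run target in a thread; count the samples in which each function is on its stack."""
    counts = collections.Counter()
    total = 0
    worker = threading.Thread(target=target)
    worker.start()
    while worker.is_alive():
        # Grab the worker's current (innermost) frame, if it is still running.
        frame = sys._current_frames().get(worker.ident)
        names = set()
        while frame is not None:
            names.add(frame.f_code.co_name)  # walk outward toward the caller
            frame = frame.f_back
        if names:
            counts.update(names)
            total += 1
        time.sleep(interval)
    worker.join()
    return counts, total


counts, total = sample_stacks(workload)
```

The ratio `counts[name] / total` estimates the fraction of wall-clock time during which `name` was on the stack, which is roughly the time that would be saved if that call could be removed entirely.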

Now, to try to answer your question.

There is such a thing as a "hot spot". This is a small set of addresses that are often at the bottom of the call stack. Nanoseconds spent in that code may or may not have good reasons.

There is such a thing as a "performance problem". This is an instruction that often accounts for why nanoseconds are being spent, but that does not have a good reason. Such an instruction may be in a hot spot. It may also be a subroutine call instruction. (It cannot be both.) It may be an instruction that sends a message to be processed later, without a good reason for sending it. To optimize software, such instructions (not functions) are what are being looked for.

Languages, loosely speaking, are either compiled into machine language or interpreted. Interpreted languages are usually 1 or 2 orders of magnitude slower than compiled, because they are constantly re-determining what they need to do. However, roughly speaking, this is only a performance problem if it occurs in a hot spot. If a program spends all its time calling compiled library functions, or waiting for I/O completions, then its speed of execution probably doesn't matter, because most of the nanoseconds are being spent for other reasons.
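To make the hot-spot point concrete, here is a small sketch that times the same summation twice: once as a loop executed by the interpreter, and once through the builtin `sum`, which in CPython is implemented in compiled C. The function name `interpreted_sum` and the input size are arbitrary choices for illustration, and the actual ratio depends on the interpreter and machine, so no particular figure is claimed here.

```python
import timeit


def interpreted_sum(values):
    # Every iteration of this loop is executed by the interpreter.
    total = 0
    for v in values:
        total += v
    return total


values = list(range(100_000))

# Take the best of a few repeats to reduce noise from other processes.
loop_time = min(timeit.repeat(lambda: interpreted_sum(values), number=20, repeat=3))
builtin_time = min(timeit.repeat(lambda: sum(values), number=20, repeat=3))
```

Comparing `loop_time` with `builtin_time` shows how much the interpreter costs when the loop itself is the hot spot. If the same loop sat next to a disk read or a network call, the difference would likely be lost in the noise.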

Now, certainly, any language or program can in principle be highly non-optimal, but compilers are mostly pretty good for hotspot code, give or take maybe 30%. If there is a background process involved, like garbage collection, that adds an overhead, but how much depends on the rate at which the program generates garbage.

So to sum up, the speed of a language matters in hotspot code, but not much elsewhere. When a program has been optimized by removal of all other performance problems, and if the hotspot code is actually seen by the compiler/interpreter, then speed of language matters.

Mike Dunlavey 2009-05-19 12:47:58

+1 A:

Your question is framed very broadly, so I'll try to give a somewhat narrower answer:

Unless there is some good reason not to do so, the language for a project should always be chosen from among those languages that will help the project team be productive and produce reliable software that can easily be adapted for future needs. The tradeoffs generally favor high-level languages with automatic memory management.

N.B. There are plenty of good reasons to make other choices, such as compatibility with current products and libraries.

It sometimes happens that when a program is too slow, the quickest and easiest way to speed it up is to rewrite the program (or a critical part) in a new language. This happens most often when the implementation language is interpreted and the new language is compiled.

Example: I got about a 4x speedup out of the OSBF-Lua spam filter by rewriting the lexical analysis of the mail headers. By rewriting from Lua to C I not only went from interpreted to compiled but was able to eliminate an array-bounds check for every input character.

To answer your question as stated, it is not very often that language performance per se is an issue.

Norman Ramsey 2009-05-19 23:07:11
