ansaurus

Question

Answer 1

+11 A:

To a degree, any code that avoids processor instructions (i.e. shorter code) will be faster. But how much faster? Not very! Also note that compiler optimisation strategies may result in the smaller code anyway.

These days this is only an optimisation for very specific applications, usually ultra time-critical drivers or microcontroller code.

Preet Sangha 2010-10-17 09:26:44

Exactly. If global variables were faster for most scenarios, an optimizing compiler would convert parameters to globals as needed. It seems easy to do in an automated way.

dbkk 2010-10-17 10:39:07

I get bored saying 'for optimisations get some real numbers'.

Preet Sangha 2010-10-17 11:02:25

Not true. Longer code can be faster too. And especially in this case, where a global variable will typically result in a cache miss, that could've been avoided with non-global data, there's a very real chance that globals will be *slower*.

jalf 2010-10-17 13:12:22

@dbkk: Converting parameters to globals is not possible. A non-reentrant function is not equivalent to a reentrant one, and the compiler cannot tell when compiling one translation unit whether code may need to be called reentrantly from another.

R.. 2010-10-17 13:30:53

what if there is only one translation unit?

Preet Sangha 2010-10-18 09:52:23

Answer 2

+22 A:

When you switch from parameters to global variables, one of three things can happen:

it runs faster
it runs the same
it runs slower

You will have to measure performance to see what's faster in a non-trivial concrete case. This was true in 1996, is true today and is true tomorrow.

Leaving the performance aside for a moment, global variables in a large project introduce dependencies which almost always make maintenance and testing much harder.

When trying to find legitimate uses of global variables for performance reasons today, I very much agree with the examples in Preet's answer: very frequently accessed variables in microcontroller programs or device drivers. The extreme case is a processor register which is exclusively dedicated to the global variable.

When reasoning about the performance of global variables versus parameter passing, the way the compiler implements them is relevant. Global variables typically are stored at fixed locations. Sometimes the compiler generates direct addressing to access the globals. Sometimes however, the compiler uses one more indirection and uses a kind of symbol table for globals. IIRC gcc for AIX did this 15 years ago. In this environment, globals of small types were always slower than locals and parameter passing.

On the other hand, a compiler can pass parameters by pushing them on the stack, by passing them in registers or a mixture of both.

Peter G. 2010-10-17 09:27:26

I would add that the switch will be hazardous to maintenance.

JoshD 2010-10-17 09:35:33

+1 Any question "will *X* be faster" has only one useful answer: *measure*.

Richard 2010-10-17 09:50:15

This is exactly my experience from yesterday. I had a recursive function with 5 parameters, so I thought it would be a performance gain to make them global. I ended up with one global and 4 parameters. In one case there was no difference, in one it was faster and in three it was slower. That wasn't what I expected, so only measuring helps!

s4ms3milia 2010-10-17 09:57:25

@s4ms3milia, what tools did you use to measure that performance?

BeeBand 2010-10-17 10:00:42

I used gprof and simply timed the program. My program easily runs longer than 5 mins, so seconds are accurate enough for me. No other tools.

s4ms3milia 2010-10-17 10:04:26

@Josh done, very much agreed

Peter G. 2010-10-17 10:17:23

Wow. How ever did this get upvoted? It’s completely **useless**. “Is it raining in Paris?” – “Well, either it’s raining, or it’s not, or it rains a little. If you must know, travel to Paris. Leaving Paris aside for a moment, I prefer sunny weather.”

Konrad Rudolph 2010-10-17 11:15:20

@Konrad Obviously, I think it's very useful. It helps establish the necessary mindset when tuning performance. Can you offer some constructive critique?

Peter G. 2010-10-17 11:27:38

@Konrad. "How ever did this get upvoted?" - Probably because my question is an example of one of the traps that those new to performance tuning can fall into. Even if the answers all say "You should rely on real world data rather than theory.", then well I've learnt something and by the looks of it other folk have too.

BeeBand 2010-10-17 11:58:27

@Konrad: I have to agree with @Peter G. There are too many variables that are not exposed in the simplistic question, like how many arguments the function takes, what calling convention you are using, what concrete platform the code will run on... By default a function taking 5 integers as arguments will use the stack in ia32, while it will pass them all in registers in x86_64. With the question as stated, the only sensible answer is **measure**

David Rodríguez - dribeas 2010-10-17 12:01:33

It’s a truism. There’s no relation to the question, aside from a very general (and false!) point about performance tuning: namely, that only measuring can generate results. But micro-benchmarks are hard to get right, are often misleading and results differ drastically depending on compilers and architectures. The more interesting issues around argument-passing are completely left out of the answer, which is a fatal flaw: indeed, efficient argument passing is a sore point in C++. Furthermore, of course globals are evil. But even this point should have been expounded.

Konrad Rudolph 2010-10-17 12:07:20

(cont’d) Were it not for cache locality, argument passing would indeed be always inferior to globals, performance-wise. This is an important observation to make, and doesn’t require *any* benchmarks. Taking cache locality into consideration makes this even more interesting, and *now* we may have reached a point where a careful benchmark would be interesting. But how to write such a benchmark? Which variables to look out for? I am really dissatisfied with almost all performance-related answers that simply cry “measure”. They simply dodge the bullet. Provide a benchmark. Then you have an answer.

Konrad Rudolph 2010-10-17 12:11:30

@Konrad, would you mind posting your comments into an answer? It sounds like you really know a lot of useful information about this.

BeeBand 2010-10-17 12:12:15

@BeeBand: I like the answers of Daniel, Crashworks and Clifford. I have nothing more to add, except a conclusive benchmark suite, which I don’t have the time nor the expertise (I suspect) to write. But for a general point about benchmarks, [Producing Wrong Data Without Doing Anything Obviously Wrong!](http://www-plan.cs.colorado.edu/diwan/asplos09.pdf) is worth reading. After that you will think twice before attempting to write such a micro-benchmark.

Konrad Rudolph 2010-10-17 12:18:23

If I had to optimize a specific game, I would start with already available statistics like achieved frame rate or minimum hardware my game enjoyably runs on. Why use a benchmark when you have a concrete program to optimize?

Peter G. 2010-10-17 12:42:47

@BeeBand, ironically, what Konrad is trying to say is that such comments as his *do not constitute an answer*. So it would be illogical for Konrad to post them as one.

Pavel Shved 2010-10-17 14:29:07

@Pavel - I asked Konrad to post an answer because I thought some of the terms or assertions he introduced in his comment were unexplained, but possibly useful. Namely "micro-benchmark" (I don't know what this means; it wasn't in the answer). And also "efficient argument passing is a sore point in C++" - with no qualification of that statement. And after that I got a bit lost since he introduced the term benchmark, and *kept* using it. So I figured I'd ask him to post an answer and I'd ask him on a proper post, rather than discussing things not relevant to someone else's answer.

BeeBand 2010-10-17 14:46:31

@Konrad It sounds to me like you refuse to (recommend to) measure but advocate to benchmark. In my book benchmarks are a true subset of measurings. I cannot follow your logic here.

Peter G. 2010-10-17 14:58:19

@Pavel - I concur, illogical for him. But possibly useful for me. The paper that Pavel posted is 13 pages of very small font highly technical academic paper. A simply worded answer is best.

BeeBand 2010-10-17 15:02:22

@Peter: I use measure and benchmark synonymously. With that in mind, try reading my comments again. Furthermore, no, I don’t advocate against it in principle. I advocate against it as a blanket answer to performance related questions, and I warn against relying on flawed benchmarks for guidance (and how do you know whether a benchmark is flawed? Even most scientific papers get them wrong!).

Konrad Rudolph 2010-10-17 15:29:30

@Konrad... Are you basically saying "Forget measuring, measuring isn't going to give you a good answer. Give me a proper benchmark. But remember, most benchmarks are flawed, and it's impossible to know if they're flawed." ?

BeeBand 2010-10-17 20:24:40

@BeeBand: I’m saying several things. Firstly, “go measure” is a bad answer because measuring this kind of effect reliably is exceedingly hard. In order to perform a meaningful measurement, you need a lot of up-front information that helps you design the benchmark (for measuring) and interpret its results correctly. Secondly, it is a lazy answer. It’s a cop-out. It’s throwing the question back at the questioner. Thirdly, *after* all the facts have been collected, measuring is of course the only way of knowing *for sure* (for a given compiler/architecture). Theory will only get us so far.

Konrad Rudolph 2010-10-18 08:13:21

Finally, a note on terminology: a *benchmark* is of course a means of obtaining a *measurement*. A *micro-benchmark* is very simply defined as “[d]esigned to measure the performance of a very small and specific piece of code” on Wikipedia. I hope this clears matters up.

Konrad Rudolph 2010-10-18 08:16:43

Answer 3

+13 A:

What do you mean, "faster"?

I know for a fact, that understanding a program with global variables takes me a whole lot more time than one without.

If the extra time it takes the programmer(s) is less than the time gained by the users when they run the program with globals, then I'd say using globals is faster.

But consider that the program is going to be run by 10 people once a day for 2 years, and that it takes 2.84632 secs without globals and 2.84217 secs with globals (a 0.00415 sec decrease). That's 10 × 730 × 0.00415 ≈ 30 seconds less of TOTAL runtime. Gaining half a minute of run time is not worth the introduction of a global as regards programmer time.

pmg 2010-10-17 09:32:30

+1 The only right answer to this kind of question. The rest argue that the difference is small or probably not there; you prove why it's not worth it ;)

delnan 2010-10-17 09:51:27

+1 Especially if the global variable means introducing 3 hours to fix a problem which could otherwise be fixed in 10 mins.

Helper Method 2010-10-17 10:00:43

It's mostly true. If you're writing a game, for instance, where you've got only 0.02 of a second to process input, tick the game clock, do collision detection, calculate the scene content, construct a scenegraph and render it to the screen, you'll take any little speed up you can get! As others have said, measure, and don't micro-optimise like this unnecessarily.

Trevor Tippins 2010-10-17 10:01:47

Answer 4

+1 A:

In general (but it may depend greatly on compiler and platform implementation), passing parameters means writing them onto the stack, which you would not need to do with a global variable.

That said, a global variable may involve a page refresh in the MMU or memory controller, whereas the stack may be located in much faster memory available to the processor...

Sorry, no good answer for a general question like this, just measure it (and try different scenarios too)

Matthieu 2010-10-17 09:57:58

+1, even if with some calling conventions/architectures arguments do not even have to be written to the stack.

David Rodríguez - dribeas 2010-10-17 12:06:47

Answer 5

A:

But you end up with 'spaghetti code' when you often use global variables.

miksayer 2010-10-17 10:04:46

@miksayer I think "spaghetti code" relates to using goto, not global variables.

Tomasz Łazarowicz 2010-10-17 10:59:44

The coupling diagram for a piece of code that uses global data sure does look like spaghetti -- even if the program flow doesn't.

Amardeep 2010-10-17 11:44:34

Answer 6

+4 A:

Putting aside the issues of maintainability and correctness, there are basically two factors that will govern performance with regard to globals vs. parameters.

When you make a global you avoid a copy. That's slightly faster. When you pass a parameter by value, it has to be copied so that a function can work on a local copy of it and not damage the caller's copy of the data. At least in theory. Some modern optimizers do pretty tricky things if they identify that your code is well behaved. A function may get automatically inlined, and the compiler may notice that the function doesn't do anything to the parameters, and just optimise away any copying.

When you make a global, you are lying to the cache. When you have all of your variables neatly contained in your function, and a few parameters, the data will tend to all be in one place. Some of the variables will be in registers, and some will probably be in cache right away because they are right 'next to' each other. Using a lot of global variables is basically pathological behavior for the cache. There is no guarantee that various globals will be used by the same functions. Location has no obvious correlation with usage. Perhaps you have a small enough working set that it makes no difference where anything is, and it all winds up in cache.

All of this just adds up to the point made by a poster above me:

When you switch from parameters to global variables, one of three things can happen:
* it runs faster
* it runs the same
* it runs slower
You will have to measure performance to see what's faster in a non-trivial concrete case. This was true in 1996, is true today and is true tomorrow.

Depending on the specific behavior of your exact compiler, and precise details of the hardware that you use to run your code, it's possible that global variables could be a very slight performance win in some cases. That possibility may be worth trying it on some code that runs too slow as an experiment. It's probably not worth dedicating yourself to, as the answer of your experiment could change tomorrow. So, the right answer is almost always to go with "correct" design patterns and avoid the uglier design. Look for better algorithms, more efficient data structures, etc., before intentionally trying to spaghettify your project. Much better payoff in the long run.

And, aside from the dev time vs. user time argument, I'll add the dev time vs. Moore's time argument. If you assume Moore's law will make computers something like half again as fast every year, then for the sake of a simple round number, we can assume that progress happens at a steady 1% per week. If you are looking at a micro-optimisation that may improve things by about 1%, and it will add a week to the project by complicating things, then just taking the week off will have the same effect on average run times for your users.

wrosecrans 2010-10-17 10:38:42

Great point about chasing a quickly outmoded gain.

Amardeep 2010-10-17 11:42:38

Answer 7

+1 A:

It was faster when we had <100 MHz processors. Now that processors are 100x faster, this 'problem' is 100x less significant. It wasn't a big deal then; it was a big deal when you did it in assembly and had no (good) optimizer.

Says the guy who programmed on a 3 MHz processor. Yes, you read that right, and 64k was NOT enough.

acidzombie24 2010-10-17 10:53:03

64k wouldn't be ... 640k probably would have been.

John Nicholas 2010-10-17 11:17:48

@John Nicholas: Yeah it was. I heard of a few cases where 640k wasn't enough, BUT that's if you don't include attached readable media (like a cartridge or a floppy). In that case everything was fine

acidzombie24 2010-10-17 11:24:08

3MHz? You had it good.

JUST MY correct OPINION 2010-10-17 12:33:48

Answer 8

+2 A:

Well, if you are considering using global variables instead of parameter passing, that could mean that you have a long chain of methods/functions that you have to pass that parameter down. If that is the case, you really WILL save CPU cycles by switching from parameter to global variable.

So, guys that say that it depends, I guess that they are plain wrong. Even with REGISTER parameter passing, there will still be MORE CPU cycles and MORE overhead for pushing the parameters down to the callee.

HOWEVER - I never do that. CPUs are superior now; back when there were 12 MHz 8086s, that could have been an issue. Nowadays, if you don't write embedded or super-turbo-charged performance code, stick to that which looks good in code, which doesn't break code logic, and strives to be modular.

And lastly, leave machine language code generation to compiler - guys who designed it are best at knowing how their baby performs and will make your code run at its best.

Daniel Mošmondor 2010-10-17 11:05:45

Answer 9

+14 A:

Everyone has already given the appropriate caveat answers about this being platform and program specific, needing to actually measure timings, etc. So, with that all said already, let me answer your question directly for the specific case of game programming on x86 and PowerPC.

In 1996, there were certain cases where pushing parameters onto the stack took extra instructions and could cause a brief stall inside the Intel CPU pipeline. In those cases there could be a very small speedup from avoiding parameter passing altogether and reading data from literal addresses.

This isn't true any more on the x86 or on the PowerPC used in most game consoles. Using globals is usually slower than passing parameters for two reasons:

Parameter passing is implemented better now. Modern CPUs pass their parameters in registers, so reading a value from a function's parameter list is as fast as (or even faster than) a memory load operation. The x86 uses register shadowing and store forwarding, so what looks like shuffling data onto the stack and back can actually be a simple register move.
Data cache latency far outweighs CPU clock speed in most performance considerations. The stack, being heavily used, is almost always in cache. Loading from an arbitrary global address can cause a cache miss, which is a huge penalty as the memory controller has to go and fetch the data from main RAM. ("Huge" here is 600 cycles or more.)

Crashworks 2010-10-17 11:23:56

+1 for identifying cache locality as a major culprit when using global variables.

Amardeep 2010-10-17 11:38:47

When optimizing on the PlayStation 2 I almost never had to do more than fix cache misses to get my functions running quickly enough. Instruction count was spare change compared to the latency of a cache miss.

Crashworks 2010-10-17 11:43:59

Answer 10

+3 A:

Perhaps a micro optimisation, and would probably be wiped out by optimisations your compiler could generate without resort to such practices. In fact the use of globals may even inhibit some compiler optimisations. Reliable and maintainable code would generally be of greater value, and globals are not conducive to that.

Using globals to replace function parameters renders all such functions non-reentrant, which may be a problem if multi-threading is used - not a common practice in game development in 1996, but more common with the advent of multi-core processors. It also precludes recursion, although that is probably less of an issue since recursion has its own issues.

In any significant body of code, there is likely to be more mileage in higher-level optimisation of algorithms and data structures. Moreover there are options open to you other than global variables that avoid parameter passing, most especially C++ class-member variables.

If the habitual use of global variables in your code makes a measurable or useful difference to its performance, I would question the design first.

For a discussion of the problems inherent in global variables and some ways to avoid them see A Pox on Globals by Jack Ganssle. The article relates to embedded systems development, but is generally applicable; it's just that some embedded systems developers think they have good reason to use globals, probably for all the same misguided reasons used to justify it in game development.

Clifford 2010-10-17 11:25:38

Answer 11

+27 A:

Short answer - No, good programmers make code go faster by knowing and using the appropriate tools for the job, and then optimizing in a methodical way where their code does not meet their requirements.

Longer answer - This article, which in my opinion is not especially well-written, is not in any case general advice on program speedup but '15 ways to do faster blits'. Extrapolating this to the general case is missing the writer's point, whatever you think of the merits of the article.

If I was looking for performance advice, I would place zero credence in an article that does not identify or show a single concrete code change to support the assertions in the sample code, and without suggesting that measuring the code might be a good idea. If you are not going to show how to make the code better, why include it?

Some of the advice is years out of date - FAR pointers stopped being an issue on the PC a long time ago.

A serious game developer (or any other professional programmer, for that matter) would have a good laugh about advice like this:

You can either take out the assert's completely, or you can just add a #define NDEBUG when you compile the final version.

My advice to you, if you really wish to evaluate the merit of any of these 15 tips, and since the article is 14 years old, would be to compile the code in a modern compiler (Visual C++ 10 say) and try to identify any area where using a global variable (or any of the other tips) would make it faster.

[Just joking - my real advice would be to ignore this article completely and ask specific performance questions on Stack Overflow as you hit issues in your work that you cannot resolve. That way the answers you get will be peer reviewed, supported by example code or good external evidence, and current.]

Steve Townsend 2010-10-17 12:36:31

+1 for actually and literally taking the pain to read and analyse the cited article and addressing the question in the most concrete and structured way so far.

Peter G. 2010-10-17 13:41:42

You're under some bizarre assumption that every computer and compiler nowadays is "modern". Using globals on embedded systems is often required due to incredibly small stack sizes and code space. Speed-wise, it can also help when trying to handle data in real-time (as is the goal of most embedded).

Nick T 2010-10-17 17:07:31

@Nick T, well, and perhaps I should mention this in my question, I am developing for an Android smart phone.

BeeBand 2010-10-17 18:58:31

@Nick T - there is an interesting and very different discussion to be had on that topic. The article here was Windows-specific, c.1996 and my response is intended to be applicable in that context.

Steve Townsend 2010-10-17 19:42:53

Answer 12

+1 A:

I see a lot of theoretical answers, but no practical advice for your scenario. What I'm guessing is that you have a large number of parameters to pass down through a number of function calls, and you're worried about accumulated overhead from many levels of call frames and many parameters at each level. Otherwise your concern is completely unfounded.

If this is your scenario, you should probably put all of the parameters in a "context" structure and pass a pointer to that structure. This will ensure data locality, and makes it so you don't have to pass more than one argument (the pointer) at each function call.

Parameters accessed this way are slightly more expensive to access than true function arguments (you need an extra register to hold the pointer to the base of the structure, as opposed to the frame pointer which would serve this purpose with function arguments), and individually (but probably not with cache effects factored in) more expensive to access than global variables in normal, non-PIC code. However, if your code is in a shared library/DLL using position independent code, the cost of accessing parameters passed by pointer to struct is cheaper than accessing a global variable and identical to accessing static variables, due to GOT and GOT-relative addressing. This is another reason never to use global variables for parameter passing: if you may eventually put your code in a shared library/DLL, any possible performance benefits will suddenly backfire!

R.. 2010-10-17 13:43:51

Answer 13

+1 A:

Like everything else: yes and no. There is no one answer because it depends on context.

Counterpoints:

Imagine programming on Itanium where you have hundreds of registers. You can put quite a few globals into those, which will be faster than the typical way globals are implemented in C (some static address (although they might just hardcode the globals into instructions if they are word length)). Even if the globals are in cache the whole time, registers may still be faster.
In Java, overuse of globals (statics) can decrease performance because of initialization locks that have to be done. If 10 classes want to access some static class, they all have to wait for that class to finish initializing its static fields, which can take anywhere from no time up to forever.

In any case, global state is just bad practice; it raises code complexity. Well designed code is naturally fast enough 99.9% of the time. It seems like newer languages are removing global state altogether. E removes global state because it violates their security model. Haskell removes state altogether. The fact that Haskell exists and has implementations that outperform most other languages is proof enough for me that I will never use globals again.

Also, in the near future, when we all have hundreds of cores, global state isn't really going to help much.

Longpoke 2010-10-17 15:42:56

Answer 14

+1 A:

It might still be true, under some circumstances. A global variable might be as fast as a pointer to a variable, where its pointer is stored in/passed through registers only. So, it is a question of the number of registers you can use.

To speed-optimize a function call, you could do several other things, that might perform better with global-variable-hacks:

Minimize the count of local variables in the function to a few (explicit) register variables.
Minimize the count of parameters of the function, i.e. by using pointers to structures instead of using the same parameter-constellations in functions that call each other.
Make the function "naked", that means that it does not use the stack at all.
Use "proper tail calls" (works with neither Java bytecode nor JavaScript/ECMAScript)
If there is no better way, hack yourself something like TABLES_NEXT_TO_CODE, which locates your global variables next to the function code. In functional languages this is a backend optimization that uses the function pointer as data pointer, too; but as long as you do not program in a functional language, you only need to locate those variables beside those used by the function. Then again, you only want this to remove the stack handling from your function. If your compiler generates assembler code that handles the stack, then there is no point in doing this; you could use pointers instead.

I've found this "gcc attribute overview": http://www.ohse.de/uwe/articles/gcc-attributes.html

and I can give you these tags for googling:
- Proper Tail Call (it is mostly relevant to imperative backends of functional languages)
- TABLES_NEXT_TO_CODE (it is mostly relevant to Haskell and LLVM)

comonad 2010-10-17 19:53:13


Do global variables mean faster code?
