Is declaring or assigning a variable in a high-level language such as C++ an explicit instruction?

e.g. x = 5;

It would be handled by the loader, and treated as state information, correct?

It is not an instruction but a state object, as opposed to something like a for loop, which is an instruction that makes its way to the CPU?

edit: OK, to clarify a bit more. I am not talking about optimisations; assume none. I am talking about the end result of a compiled program, in an executable file format. In cases where the compiler decides not to use the MOV instruction, will the data 5 exist within the executable file's data segment, or somewhere else?

Is it possible for the 5 to exist as data without being an instruction, with that data later being loaded into memory? Or, in general, will x = 5 result in a MOV instruction when the program is executed?

+3  A: 

It depends on how it is used.

Generally, the compiler will try to make every line as low-impact as possible.

If this is used in only one place, you can bet your buttons that it is hard coded in the machine code, rather than wasting space on the stack.

If it is used for math or algorithmic operations, and its value may change, space for the variable may be allocated on the stack. Then again, if it is used frequently enough, the compiler may just leave it in a register.

The answer is: it depends. Compile it and view the result with a debugger's machine code window.

One possible actual translation:

MOV AX, 5
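
To try this yourself, a minimal sketch is a function small enough that its assembly listing is easy to read. The name `assign_local` is made up for illustration, and the instructions mentioned in the comments are what an unoptimized x86 build commonly produces, not something any compiler is obliged to emit.

```cpp
// Hypothetical function for inspecting compiler output.
// With GCC, `g++ -O0 -S file.cpp` writes the assembly listing to file.s.
int assign_local() {
    int x;     // usually emits nothing by itself; stack space is reserved in the function prologue
    x = 5;     // unoptimized builds commonly store the immediate 5 to a stack slot (a MOV on x86)
    return x;  // optimized builds commonly reduce the whole body to loading 5 into the return register
}
```

Comparing the listing at `-O0` and `-O2` shows how much the answer depends on compiler settings.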
John Gietzen
But is the variable assignment itself an instruction?
Joshxtothe4
It depends. But, yes, it *could* turn into a MOV instruction.
John Gietzen
Josh, you could maybe experiment by writing that statement in a listing, compile the listing, and then run a disassembler on the resultant executable, and examine the resulting data.
sheepsimulator
+4  A: 

Are you asking if a variable declaration will translate into an assembly instruction in the same way an add or delete would? If so, then the general answer is there is no direct translation into assembly.

There may be instructions which facilitate the declaration of a variable such as updating the stack pointer to make space for a variable. But there is no x86 assembly instruction which says declare variable of this type on the stack.

Most of my experience is with x86 and amd64 chips so there could be instructions on other processors but I am not aware of them (but would be curious to read about them if they did exist).

Variable assignment is a bit different. The general case of x=5 will translate to some type of assembly instruction (writing the value to a register for instance). However with C++ it's really hard to be specific because there are many optimizations and chip specific settings that could cause a particular line to have no translation in machine code.

JaredPar
I don't mean whether it is a specific CPU instruction; I just mean, after it is compiled, is it a state object or an instruction that will be processed?
Joshxtothe4
+1  A: 

Well, in assembly language the mov instruction will generally be used to load 5 into the register, stack variable or heap variable that represents x.

But the optimizer could decide to just use this as a constant value (like #define), or it could decide to remove it completely if it's not used. It also decides where to put it (register, stack variable, heap).

So, I hope it answers your question but as you can see, assigning a variable in C++ abstracts a ton of things and that's very good!

FYI : I have studied tons of compiler assembly output. In this post, the assignment operation optimized into an assembly OR operation, which is very good : http://stackoverflow.com/questions/931301/which-is-more-readable-c

toto
assuming no optimizer...
Joshxtothe4
Are you into compiler writing? I don't think I can help you with that Josh.
toto
+1  A: 

It depends on the compiler and on its optimizations performed. If it performs dead store elimination, then it may well omit emitting assembler code writing to a variable. Consider

for(.....) i+= n;
i = 1;
return i;

This could easily be optimized to the following, since the writes to i will be overwritten by the later assignment anyway:

i = 1;
return 1;

And if the assignment of 1 to i were absent and the loop ran m times, the compiler could optimize it to

i += n * m;
return i;

It could even optimize away the increment of i entirely (in the previous example too) if i is local and changing it would not affect any global state. If i has a volatile-qualified type, then the compiler must not perform these optimizations. It has to do every step as the language specification describes. But even then, different compilers could generate different assembler instructions depending on the capabilities of the targeted processor.
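
A runnable version of the contrast might look like the sketch below. The function names and the explicit loop bounds are assumptions added for illustration; what a given compiler actually emits for each function still has to be checked in its assembly output.

```cpp
// Hypothetical helpers contrasting an ordinary local with a volatile one.
int plain_sum(int n, int m) {
    int i = 0;
    for (int k = 0; k < m; ++k) i += n;  // dead stores: an optimizer may drop the whole loop
    i = 1;                                // only this value is ever observed
    return i;                             // may be compiled as if it were "return 1"
}

int volatile_sum(int n, int m) {
    volatile int i = 0;
    for (int k = 0; k < m; ++k) i = i + n;  // each read and write of i must really be performed
    i = 1;
    return i;
}
```

Both functions return 1 for any arguments that do not overflow; the difference is only in how much of the source survives as instructions.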

Johannes Schaub - litb
+1  A: 

In general, you can't know how the compiler is going to translate your C++ instructions into assembly instructions. From a C++ perspective, declaring a variable that includes an assignment (like "int x = 5;") is an instruction. If you step through your program in a debugger, it will stop on that line. But who knows what the compiler will do with it. (The entire variable might be optimized away for all you know.)

Mike Kale
If you step through in a debugger, it may or may not stop on that line, depending on which debugger and what sort of optimizations are applied.
David Thornley
+1  A: 

Exactly how much is done at compile time, link time, load time, and execute time will depend on all sorts of things. An implementation isn't even guaranteed to have anything in memory identifiable as x. The C++ standard describes some memory layout and runtime constraints, but it does that only for specificity: it also defines observable behavior, and explicitly says that an implementation may do as it pleases if all observable behavior matches (the "as-if" rule).
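
One way to see that an assignment need not survive as anything in the executable is constant evaluation. In the sketch below (the name `five_times_two` is made up; C++14 or later is assumed), the assignment is carried out by the compiler itself when it checks the `static_assert`.

```cpp
// Since C++14 a constexpr function may contain local variables and assignments.
constexpr int five_times_two() {
    int x = 0;
    x = 5;         // an assignment in the source
    return x * 2;
}

// Checked during translation, so here "x = 5" is executed by the compiler, not by the CPU.
static_assert(five_times_two() == 10, "evaluated at compile time");
```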

This means that the only reasonable answer would be based on what implementations actually do, and the answer is that it depends.

Why do you want to know whether some particular code line translates into some particular machine code? If you make that clear, maybe we can answer a question that's actually useful to you.

David Thornley
+1  A: 

You didn't mention which type x is. If x is not a POD type and it has a constructor which accepts an int, then

x = 5

may have other effects than 'storing 5 somewhere' as the syntax seems to imply.

#include <iostream>

int i;

class X {
  int _y;
public:
  X(int y) : _y(y) {
    i ++;   // changes a global variable
    std::cout << "got " << y << std::endl;  // does IO
  }
};

Now

X x = 5;

has a completely different meaning. So I would say that, in general, in C++ x = 5 is a statement like any other, since it can have almost any side effect (as in my example: changing global state or doing IO).
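
A self-contained variant that makes the side effects countable could look like this. The counter `constructed` and the helper `make_and_count` are hypothetical names added for the sketch; `constructed` plays the role of the global `i` above.

```cpp
#include <iostream>

int constructed = 0;  // hypothetical global counter, standing in for `i` above

class X {
    int y_;
public:
    X(int y) : y_(y) {
        ++constructed;                     // side effect on global state
        std::cout << "got " << y << '\n';  // side effect: I/O
    }
    int value() const { return y_; }
};

int make_and_count() {
    X x = 5;  // initialization: runs X::X(int)
    x = 7;    // assignment: 7 is converted to a temporary X (constructor runs again), then assigned
    return constructed;
}
```

On its first call `make_and_count()` returns 2, because both the initialization and the later assignment go through the constructor.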

beb0s
A: 

If your variable is a primitive type (int, char, etc.):

For a global or static variable, no. This is just an entry in the DATA segment if it is initialized, or in the BSS segment if it is not; no executable code is required. Except, of course, if the initializer has to be evaluated at runtime.
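
As a rough sketch of those three cases (all names here are made up, and the section placement in the comments describes what common toolchains typically do, which the C++ standard does not mandate):

```cpp
#include <cstdlib>

int initialized_global = 5;         // typically the value 5 sits in the data section of the executable
int zeroed_global;                  // zero-initialized; typically only reserves space in BSS
static int initialized_static = 7;  // same idea, with internal linkage

// Hypothetical runtime initializer: the result is not a constant expression,
// so start-up code has to run before main() to compute and store it.
int compute_at_startup() { return std::rand() % 10; }
int runtime_global = compute_at_startup();

int sum_globals() {
    return initialized_global + zeroed_global + initialized_static;
}
```

Tools such as `nm` or `objdump` on the compiled object file can show which section each symbol ended up in.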

For a local variable that is not initialized, usually the first declaration implies an assembly instruction and the others do not. That's because space for locals is usually allocated by adjusting the stack pointer by an offset (subtracting, in fact, since the stack grows downward). When you declare your first int variable, a "SUB SP, 4" is generated; for the second, it's just changed to "SUB SP, 8". This instruction will not be at the place where you declare your variable but at the beginning of the function, because all the stack space for local variables must be allocated there.

If you initialize a local variable at creation, then you will have a MOV instruction to store the value to its location on the stack. This instruction will be at the same place as the declaration, in relation to the rest of the code.

These rules for local variables assume no optimization. One common form of optimization is to use CPU registers as variables; in this case no allocation is needed, but initialization will still generate an instruction. Also, sometimes these registers must have their values preserved, so you'll see a PUSH instruction at the beginning and a POP at the end of the function.

The rules for objects when no constructor is involved (or the constructor is inlined) are a lot more complicated, but a similar logic applies. When you have a non-inlined constructor, of course you need at least an instruction for its call.

Fabio Ceconello