How does the C++ compiler understand the pointer type? As far as I know, a pointer has a size equal to the word size of the OS (32 or 64 bits). So does it store some information about the type in those 32 (or 64) bits? I ask because you cannot take a pointer to one type and assign it to a pointer with a different type.

+10  A: 

The compiler knows what type a pointer is because the source code says what type the pointer is:

int* ip;   // ip is a pointer to an int

float* fp; // fp is a pointer to a float

void* vp;  // vp is a pointer to some unknown type; need to cast it to a pointer
           // to an actual type in order to access the pointed-at object
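To illustrate the void* comment above, here is a minimal sketch. The helper name read_int_through_void is made up for this example, and it assumes the caller really passes the address of an int:

```cpp
// Hypothetical helper: the void* carries only an address, so the code
// has to state the pointed-at type again before it can read the object.
int read_int_through_void(void* vp) {
    // Assumption: vp really holds the address of an int.
    int* ip = static_cast<int*>(vp);
    return *ip;
}
```

Nothing in the pointer itself would stop you from casting to the wrong type here; the type exists only in the source code.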
James McNellis
+1 And, of course, you *can* assign a pointer of one type to a pointer of another type, you just have to cast it.
T.J. Crowder
It's an int! Now it's a float! Now it's an int! Now it's a float!
glowcoder
@glowcoder: Relax, man. You're two pointers! (juvenile joke from WoW)
John Dibling
@John: Looked it up, turns out your joke is funny. +1
Cam
+1  A: 

This is part of the semantic analysis (type checking) the compiler performs while translating your source code into machine code. In the simplest examples, you can think of it as checking the types on both sides of an assignment:

dest = source
// make sure that type of source == type of dest
AraK
+3  A: 

A pointer is usually just a memory address on x86-based architectures (I don't know about other architectures). The compiler enforces type safety between different pointer types at compile time, since it makes no sense to assign a pointer-to-char to a pointer-to-int, for example, especially since the objects pointed to are usually different sizes (so you'd be reading unrelated memory if you accessed them). You can explicitly override this and convert any object pointer to any other object pointer type with reinterpret_cast<T>. The other casts are more restricted: static_cast<T> converts between related pointer types (or to and from void*), and dynamic_cast<T> converts within a polymorphic class hierarchy with a runtime check. The latter two are generally recommended where they apply because they are 'safer', but each has its uses.

So at the machine level a memory address is just a memory address and the CPU will dutifully carry out any accesses or calls on that. However it's dangerous, since you can get types mixed up and possibly not know about it. The compile time checks help avoid that, but there is not usually any information about the actual types stored inside the pointer itself at runtime.
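A rough sketch of the "an address is just an address" point. Both helper names are invented for this example; the code only assumes that the pointers passed in refer to real objects:

```cpp
#include <cstddef>

// Hypothetical helper: casting the pointer changes its static type,
// not the address it holds.
bool same_address_after_cast(int* ip) {
    unsigned char* bp = reinterpret_cast<unsigned char*>(ip);
    return static_cast<void*>(ip) == static_cast<void*>(bp);
}

// Hypothetical helper: how many bytes p + 1 advances. The step size
// comes from the static type T, which only the compiler knows.
template <typename T>
std::ptrdiff_t step_in_bytes(T* p) {
    return reinterpret_cast<unsigned char*>(p + 1) -
           reinterpret_cast<unsigned char*>(p);
}
```

So the type affects the code the compiler generates (for example the stride of pointer arithmetic), while the stored address stays the same.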

An advantage of using iterators (pointer wrappers provided by the STL) is that many implementations have a lot of additional checks which can be enabled at runtime: like checking you're using the right container, that when you compare them they're the same type of iterator, and so on. This is a major reason to use iterators over pointers - but it's not required by the standard, so check your implementation.

AshleysBrain
Use of reinterpret_cast is not unsafe. Its meaning is well defined, and if used correctly it is perfectly safe. If used incorrectly it is unsafe, but the same can be said for anything that is used incorrectly.
Martin York
Also note there is no requirement for iterators to check that they come from the same container when being compared or used in an algorithm, so they provide no more safety than pointers (certain debug implementations of the STL do provide this functionality to help in debugging, but it should not be relied upon).
Martin York
Good points, edited to try to clarify.
AshleysBrain
Historically, "a pointer is a memory address" has been correct. However, on modern OSes, the closest you get is a virtual memory address, and even that is just an implementation detail.
Jurily
@jurily: Unless we go back to the '60s, it's always been a virtual memory address. But this is irrelevant and transparent to any application code (specifically, it's a hardware detail).
Martin York
+3  A: 

As James said, the compiler "knows" what type a pointer is because you tell it.

Less flippantly, however, what happens under the covers (in a grossly-simplified explanation) is that the parser, while reading your code, annotates every significant piece of it with the information it needs to check up on and enforce the rules of the language it is recognizing. So given this example code:

int*    ip;
// do some stuff
double* dp = ip;

The compiler is going to do something like this behind the scenes (again in grossly simplified form):

Hmmm... There's this thing called "ip". I'd better make a note that it's a pointer to an integer. OK, here's this thing called "dp". I'd better make a note that it's a pointer to a double. OK, now, he wants to assign ip to dp. But... Hang on! ip points to an integer and dp points to a double. I can't do that!

...compiler vomit on screen...

The reality is simultaneously far simpler than the above (in that the computer doesn't think anything at all -- it's all very mechanical) and far more complicated (in that I've glossed over about a billion details in that mechanical process).
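You can ask the compiler about these notes directly. The following is only a sketch using the standard <type_traits> header; the constant name is invented for this example:

```cpp
#include <type_traits>

// Any object pointer converts implicitly to void*.
static_assert(std::is_convertible<int*, void*>::value,
              "int* converts implicitly to void*");

// Unrelated pointer types do not convert implicitly.
static_assert(!std::is_convertible<int*, double*>::value,
              "int* does not convert implicitly to double*");

// In C++ (unlike C), void* does not convert back without a cast.
static_assert(!std::is_convertible<void*, int*>::value,
              "void* needs a cast to become int*");

// Hypothetical name: true only if `dp = ip` would compile.
constexpr bool can_assign_int_ptr_to_double_ptr =
    std::is_assignable<double*&, int*>::value;
```

All of this is evaluated during compilation, so none of it costs anything at run time.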

JUST MY correct OPINION
+1. Apologies if my answer sounded flippant; that wasn't my intent.
James McNellis
I didn't see it as maliciously flippant, just as a bit of a joke response.
JUST MY correct OPINION
A: 

size equal to WORD of the OS

Did you mean the CPU word? Better yet, avoid using "word" to describe a size at all: the term is heavily overloaded and, depending on the reader's background, can easily be misunderstood.

Just because you can not have a pointer on one type and assign to that pointer another pointer with a different type.

Depends. Read up on the Von Neumann architecture.

Dummy00001
+1  A: 

A pointer only holds a memory address, nothing more.

Realize that at the level of assembly (which all C / C++ code is translated to), there isn't really any notion of type the way that there is in a high-level language. ASM instructions all operate on binary values (bytes, words, dwords, etc) without caring too much about whether the program thinks that a given set of bits is an int, a char, or something else.

(Excepting of course for the fact that we have different instructions to operate on integer values vs. floating-point values, but that is not really the point of this discussion.)

So the short answer is that type is completely a compile-time construct and is stored in a symbol table inside the compiler program itself, mapping identifiers to types. In the program being compiled, types do not exist.

danben
A: 

On modern architectures, there is no runtime information about whether a word of memory is an instruction, a number, part of a string, or a pointer. All this information is lost once compilation is complete (although it may still be available in debug symbols). If compiled code uses a word as a pointer, then it is treated as a pointer; the CPU will not check for you.

Older, more exotic architectures used to maintain this information at run time. Take a look here: http://wapedia.mobi/en/Burroughs_large_systems?p=2

Arkadiy