views:

396

answers:

12

I've been programming C, mainly in an embedded environment, for years now and have a perfectly good mental model of pointers - I don't have to explicitly think about how to use them, am 100% comfortable with pointer arithmetic, arrays of pointers, pointers-to-pointers etc.

I've written very little C++ and really don't have a good way of thinking about references. I've been advised in the past to "think of them as pointers that can't be NULL" but this question shows that that is far from the full story.

So for more experienced C++ programmers - how do you think of references? Do you think of them as a special sort of pointer, or as their own thing entirely? What's a good way for a C programmer to get their head round the concept?

A: 

I think of it as a pointer container.

Anders Rune Jensen
+2  A: 

How about "pointers that can't be NULL and can't be changed after initialisation". Also, they have no size by themselves (because they have no identity of themselves).
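
A rough sketch of those properties, assuming nothing beyond standard C++. The function name `alias_demo` and the variables are made up for illustration.

```cpp
#include <cassert>

// Hypothetical demo: returns true if the properties described above hold.
bool alias_demo() {
    int a = 1;
    int other = 5;
    int& r = a;        // must be initialised; bound to a from here on

    r = other;         // does NOT rebind r: it copies other's value into a
    assert(&r == &a);  // taking the address of r yields the address of a

    double d = 0.0;
    double& rd = d;
    // sizeof applied to a reference gives the size of the referred-to type
    return a == 5 && sizeof(rd) == sizeof(d);
}
```

Whether the compiler stores a pointer for `r` behind the scenes is an implementation detail that this code cannot observe.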

Greg Hewgill
They do have a size, "under the hood" they are just a pointer (with some restrictions placed on them), hence, they are 4 bytes in size.
Grant Peters
Correction, they are the size of a pointer (too used to working on 32-bit systems :P)
Grant Peters
@Grant: not necessarily. For instance "int a = 1; int &b = a;" probably results in no storage being used on the stack for b. The code will just compile such that all uses of b access the same register or stack slot as uses of a. In this situation b is not "just a pointer under the hood", it's a pure alias. Of course the defined behaviour is the same however it's implemented. So it's fine to imagine that it's a pointer, but it might not actually be one.
Steve Jessop
Can't be NULL? How about: int *p = NULL; int &ref = *p; int myInt = ref; // BOOM!
Jim Buck
Rather than "can't be NULL" I probably should have said "can be assumed to be not NULL". While it's certainly possible to create a reference to NULL, this is not technically legal C++ and a function that is passed a reference argument can reasonably assume that it is *not* NULL.
Greg Hewgill
+11  A: 
Artem Barger
I was going to say different names for the same object. +1 for being faster ;-).
fco.javier.sanz
Fine, but why use an "alias" when a pointer will work, what are the drawbacks of using references? Should they be used only for specific reasons, and if so why? Apologies but your answer is "information lite"
Binary Worrier
Because you can't change the address that a reference points to. Also, it acts just like a regular variable.
the_drow
@BinaryWorrier: By using a reference you also remove the possibility that the reference "won't be set".
Richard Corden
@Richard: Thanks, but personally I know what references are, and I know when/where they should be used. I was trying to get the poster to elucidate their thoughts beyond "references as an alias for main object", a pointer can also be seen "as an alias for main object". Personally I don't find this answer useful, and am surprised that the community at large thinks it's the best answer. Just goes to show . . .
Binary Worrier
@Binary: any answer to this question can only be so useful, since the question asks "what's a good way for me to think of references". For me, a reference is a way to rename/alias a variable: once you have one, any manipulation affects the original in exactly the same way, with no need to dereference it or do the extra work that the pointers you mentioned require.
Artem Barger
@Artem: I see your point, I really do . . . it's just that there's SO MUCH MORE that can be said on the subject. However, who am I to argue with the community :)
Binary Worrier
@Binary: I'm glad you understand my point, and I wouldn't dare argue with the fact that much more could be said here. But that is still my way of thinking about references, and I've posted some elaboration on why I've got used to thinking like that.
Artem Barger
+2  A: 

I think of the reference as being the object it refers to. You access the object using . semantics (as opposed to ->), which reinforces this idea for me.
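
To illustrate the point about member access, here is a small sketch. `Point` and the function names are hypothetical, chosen only for this example.

```cpp
#include <cassert>

struct Point { int x; int y; };  // hypothetical type for illustration

// Through a reference, members are reached with '.', as with the object itself.
int sum_via_reference(const Point& p) { return p.x + p.y; }

// Through a pointer, members are reached with '->', and null is possible.
int sum_via_pointer(const Point* p) {
    assert(p);  // the pointer version has to consider "no object"
    return p->x + p->y;
}

int access_demo() {
    Point pt = {3, 4};
    Point& ref = pt;
    ref.x = 10;  // modifies pt itself
    return sum_via_reference(ref) + sum_via_pointer(&pt);
}
```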

DanDan
+2  A: 

I think your mental model of pointers, and then a list of all the edge cases you've encountered, is the best way.

Those who don't get pointers are going to fare far worse.

Incidentally, they can be NULL or any other non-accessible memory location (it just takes effort):

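const char* test = "aha";   // string literals are const in C++
const char& ok = *test;     // fine: refers to the first 'a' of "aha"
test = NULL;
const char& bad = *test;    // undefined behaviour: dereferences a null pointer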
Will
It's even easier than that: `char `
Greg Hewgill
It's still not legal for a reference to be NULL, though.
Martin B
"legal" as in badly formed, but if you're making an API that is at all sensitive to attack, or just robust, you can't assume that parameters passed by reference cannot be NULL - you really have to check
Will
haha, beat me to it. 2 other ways of getting a NULL reference are: charand union { char *ptr; char } value; value.ptr = NULL; It's fun breaking code!
Grant Peters
I don't know whether to mark this up or down :o
280Z28
Dereferencing a null pointer is supposed to give you undefined behaviour.
Indeera
@Indeera: indeed, you can't really check for 'null' as such, since that's just one possibly illegal value. Any memory address not mapped to accessible RAM will cause the program to crash.
Will
"you can't assume that parameters passed by reference cannot be NULL - you really have check". I think this is wrong. Checking for null catches some errors, but is still fragile. Are you also going to check for `*(char *)(1)`, `*(char*)(2)`, etc? To be robust against attack you need to use handles, not pointers or references, and validate them by lookup at the boundary of trust
Steve Jessop
@onebyone.livejournal.com: yeap, Windows has the flawed IsBadWritePtr http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx - what does mainstream unix have, and is it better?
Will
Don't know, but I doubt there's anything ideal. Trying to use the hardware to tell you whether a pointer you've been given is really a Foo* (and not, say, a reinterpret_cast Bar*, or a null, or `(char *)-12`) is just impossible. The low-efficiency but actually-works solution is to store all valid external handles of a particular type in a set, and look up your arguments in that. And unmap them from the virtual space of untrusted code. If you can't trust code to give you a non-NULL reference, perhaps it should be untrusted. Null checks provide a debugging aid and *some* resilience, but not security.
Steve Jessop
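
The "store valid handles and look them up" idea from the comment above might be sketched roughly like this. All names (`Handle`, `Session`, `SessionTable`) are hypothetical, and a real design would also need to consider handle reuse and threading.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef unsigned int Handle;           // opaque value given to untrusted callers

struct Session { std::string user; };  // hypothetical object behind a handle

class SessionTable {
public:
    SessionTable() : next_(1) {}

    // Hand out a handle instead of a pointer or reference.
    Handle open(const std::string& user) {
        Handle h = next_++;
        sessions_[h].user = user;
        return h;
    }

    // Validate by lookup at the trust boundary; unknown handles yield null.
    const Session* find(Handle h) const {
        auto it = sessions_.find(h);
        return it == sessions_.end() ? nullptr : &it->second;
    }

    void close(Handle h) { sessions_.erase(h); }

private:
    Handle next_;
    std::map<Handle, Session> sessions_;
};

void handle_demo() {
    SessionTable table;
    Handle h = table.open("alice");
    assert(table.find(h) != nullptr);        // known handle: accepted
    assert(table.find(h + 100) == nullptr);  // forged handle: rejected
    table.close(h);
    assert(table.find(h) == nullptr);        // stale handle: rejected
}
```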
yeap I did that when making secure Symbian servers. Only sending POD on external interfaces and having a validate_bytes(start,len) (which didn't validate the page) would be good
Will
The Symbian "process is the unit of trust" model always seemed pretty sound to me. Both Google Chrome and Microsoft's Gazelle apply similar principles to the problem of handling downloaded content using dodgy bug-ridden parsers and plugins. Ultimately it's a question of how big a class of programming error/malice you're trying to make yourself robust against - because on Symbian even the user doesn't have full trust, they aimed high to start with, then APIs do basically no input validation except at process boundaries.
Steve Jessop
the message passing in Symbian would give errors instead of crashing a server if the client tried to send bad pointers. All messages were up to four (or was it five) ints or strings; the kernel did the copying on behalf of the server; pretty robust for clients, but puts servers in a position of authority (a server can panic any client etc)
Will
+5  A: 

I'm not all too fond of the "ever-valid" view, as references can become invalid, e.g.

int* p = new int(100);
int& ref = *p;

delete p; // oops - ref now references garbage

So, I think of references as non-rebindable (that is, you can't change the target of a reference once it's initialized) pointers with syntactic sugar to help me get rid of the "->" pointer syntax.

Kim Gräsman
The common answer to this is that it takes invoking undefined behavior to invalidate a reference while a pointer can be invalidated just so. However, I see your point. In practice, this makes very little difference.
sbi
I do agree. I tend just to _use_ them as ever-valid handles: in a local scope, or as function arguments.
xtofl
@sbi: is the above really undefined behavior? I can see that it's not very clever, but undefined? Thanks.
Kim Gräsman
@Kim: 8.3.2/4 says "A reference shall be initialized to refer to a valid object or function." But I agree that this only seems to cover initialization. Maybe I'm wrong. (I'm probably the opposite of a language lawyer.)
sbi
Good question. 8.3.2 says that a reference must be initialized to a valid object or function, which you do. Can't find a statement whether or not references must always be valid, or if it's good enough to be valid only when an expression is evaluated which contains them.
Steve Jessop
Thanks guys! I guess as soon as we attempt to use ref, it will be undefined behavior, though. Not sure if that's spec:ed, however.
Kim Gräsman
@xtofl: I didn't mean to pick on you specifically -- the reason I brought it up is because I've stumbled over numerous such constructs in real code, where people have been confused about the actual power of references, and used them as a magic pill to solve lifetime issues.
Kim Gräsman
I'd say you also just use references to point to variables on the stack, not the heap.
Neil
+5  A: 

In general you just don't think about references. You use references in every function unless you have a specific need for calling by value or pointer magic.

References are essentially pointers that always point to the same thing. A reference doesn't need to be dereferenced, and can instead be accessed as a normal variable. That's pretty much all that there is to it. You use pointers when you need to do pointer arithmetic or change what the pointer points to, and references for just about everything else.
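
For a C programmer, the difference at the call site may be the quickest way in. A minimal sketch follows; the function names are invented for this example.

```cpp
#include <cassert>

// Pass by reference: the caller writes swap_ref(a, b), no dereferencing inside.
void swap_ref(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// C-style equivalent: the caller writes swap_ptr(&a, &b).
void swap_ptr(int* a, int* b) {
    assert(a && b);  // a pointer may be null, so the function has to care
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

// A pointer where it is genuinely needed: walking a range with arithmetic.
int sum(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}
```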

Markus Koivisto
Except for small types, where copying the address of the referred-to object is more expensive than copying the object. "Small types" here means roughly: everything smaller than 64 bits.
xtofl
Yes. I wouldn't advocate using references for primitive types in functions. But then again, that's not a case where you would usually use a pointer either, and if you are doing so, you should know what you are doing.
Markus Koivisto
Actually, if one is to be pedantic, there are some cases where passing types larger than 64 bits by value is important. For example, a lot of today's specialized hardware has very wide registers (128 bits and above are not uncommon). As an example, on game consoles you usually pass your vectors by value to ensure they stay in registers and then use SIMD instructions to operate on these wide registers.
Ylisar
+1  A: 

One way to think about them is as importing another name for an object from a possibly different scope.

For instance : Obj o; Obj& r = o; There is really little difference between semantics of o and r.

The major one seems to be that the compiler watches the scope of o for calling the destructor.

EFraim
+5  A: 

For me, when I see a pointer in code (as a local variable in a function or a member on a class), I have to think about

  1. Is the pointer null, or is it valid
  2. Who created the object it points to (is it me?, have I done it yet?)
  3. Who is responsible for deleting the object
  4. Does it always point to the same object

I don't have to think about any of that stuff if it's a reference, it's somebody else's problem (i.e. think of a reference as an SEP Field for a pointer)
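
One way to see the first item of the checklist in code, as a sketch with hypothetical function names: a pointer parameter makes "no object" a legal input, while a reference parameter pushes that question back to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pointer parameter: null is a possible input, so the function must decide
// what it means.
std::size_t length_or_zero(const std::string* s) {
    return s ? s->size() : 0;
}

// Reference parameter: the signature says an object is required; supplying a
// valid one is the caller's problem.
std::size_t length(const std::string& s) {
    return s.size();
}

void contract_demo() {
    std::string name = "abc";
    assert(length_or_zero(&name) == 3);
    assert(length_or_zero(nullptr) == 0);
    assert(length(name) == 3);
}
```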

P.S. Yes, it's probably still my problem, just not right now

Binary Worrier
Nice way to put it. +1 :)
jalf
A: 

From a syntactic POV, a reference is an alias for an existing object. From a semantic POV, a reference behaves like a pointer with a few problems (invalidation, ownership etc.) removed and an object-like syntax added. From a practical POV, prefer references unless you have the need to say "no object". (Resource ownership isn't a reason to prefer pointers, as this should be done using smart pointers.)

Update: Here's one additional difference between references and pointers which I forgot about: A temporary object (an rvalue) bound to a const reference will have its life-time extended to the life of the reference:

const std::string& result = function_returning_a_string();

Here, the temporary returned by the function is bound to result and will not cease to exist at the end of the expression, but will exist until result dies. This is nice, because in the absence of rvalue references and overloading based on them (as is to come in C++1x), this allows you to get rid of one unnecessary copy in the above example.

This is a rule introduced especially for const references and there's no way to achieve this with pointers.
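
The lifetime extension can be observed with a small instrumented type. This is only a sketch; `Tracked`, `make_tracked` and `lifetime_demo` are names invented for the example.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

Tracked make_tracked() { return Tracked(); }  // returns a temporary

// Returns the number of live objects seen while the reference is in scope.
int lifetime_demo() {
    int seen = 0;
    {
        const Tracked& t = make_tracked();  // temporary bound to a const reference
        (void)t;
        seen = Tracked::alive;  // the temporary is still alive here
    }                           // it is destroyed when t goes out of scope
    assert(Tracked::alive == 0);
    return seen;
}
```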

sbi
+3  A: 

References are pointer-consts with different syntax. ie. the reference T& is pretty much T * const as in, the pointer cannot be changed. The content of both is identical - a memory address of a T - and neither can be changed.

Then apart from that pretty much the only difference is the syntax: . for references and -> and * for pointer.

That's it really - references ARE pointers, just with different syntax (and they're const).
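
A side-by-side sketch of that comparison, with a made-up function name. It only shows the behaviour the two have in common; `sizeof` and `&` still behave differently for a reference than for a `T * const`.

```cpp
#include <cassert>

int const_pointer_demo() {
    int a = 1;
    int b = 2;

    int& ref = a;         // always refers to a
    int* const ptr = &a;  // always points to a

    ref = 10;             // writes a through the reference
    *ptr += 5;            // writes a through the pointer: a is now 15
    assert(a == 15);

    // ptr = &b;          // would not compile: ptr itself is const
    ref = b;              // compiles, but copies b's value into a

    assert(&ref == ptr);  // both still designate a
    return a;
}
```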

A: 

If you use Linux, you can think of references as hard links and pointers as symbolic links (symlinks). A hard link is just another name for a file. The file gets "deleted" when all hard links to this file are removed.

The same goes for references. Just substitute "hard link" with "reference" and "file" with "value" (or probably "memory location"?).

A variable gets destroyed when all references are gone out of scope.

You can't create a hard link to a nonexistent file. Similarly, it's not possible to create a reference to nothing.

However you can create a symlink to a nonexistent file. Much like an uninitialized pointer. Actually uninitialized pointers do point to some random locations (correct me if I'm wrong). But what I mean is that you are not supposed to use them :)

presario