I'm learning C++ at the moment and though I grasp the concept of pointers and references for the most part, some things are unclear. Say I have the following code (assume Rectangle is valid, the actual code is not important):

#include <iostream>
#include "Rectangle.h"

void changestuff(Rectangle& rec);

int main()
{
    Rectangle rect;
    rect.set_x(50);
    rect.set_y(75);
    std::cout << "x,y: " << rect.get_x() << rect.get_y() << sizeof(rect) << std::endl;
    changestuff(rect);

    std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
    Rectangle* rectTwo = new Rectangle();
    rectTwo->set_x(15);
    rectTwo->set_y(30);
    std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
    changestuff(*rectTwo);
    std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;
    std::cout << rectTwo << std::endl;
}

void changestuff(Rectangle& rec)
{
    rec.set_x(10);
    rec.set_y(11);
}

Now, the actual Rectangle object isn't passed, merely a reference to it: its address. Why should I use the second method over the first one? Why can't I pass rectTwo to changestuff, but have to pass *rectTwo? In what way does rectTwo differ from rect?

+1  A: 

rectTwo differs from rect in that rect is an instance of a Rectangle on the stack and rectTwo is the address of a Rectangle on the heap. If you pass a Rectangle by value, a copy of it is made, and any changes changestuff() makes to that copy are not visible outside of changestuff().

Passing it by reference means that changestuff will have the memory address of the Rectangle instance itself, and changes are not limited to the scope of changestuff (because neither is the Rectangle).
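
As a minimal sketch of the difference between the two paragraphs above: the Rectangle class here is a hypothetical stand-in for the one in the question, and change_copy / change_original are names invented for illustration.

```cpp
// Hypothetical stand-in for the Rectangle class from the question.
class Rectangle {
public:
    void set_x(int x) { x_ = x; }
    int get_x() const { return x_; }
private:
    int x_ = 0;
};

// Pass by value: the function works on its own copy,
// so the caller's object is left untouched.
void change_copy(Rectangle rec) { rec.set_x(10); }

// Pass by reference: the function works on the caller's object.
void change_original(Rectangle& rec) { rec.set_x(10); }
```

Calling change_copy(r) on a Rectangle whose x is 50 leaves it at 50, while change_original(r) sets it to 10.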

Edit: your comment made the question clearer. Generally, a reference is safer than a pointer.

From Wikipedia:

It is not possible to refer directly to a reference object after it is defined; any occurrence of its name refers directly to the object it references.

Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.

References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.

References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor.
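
The quoted rules on reseating and null can be sketched roughly like this. The function names are made up for illustration and are not part of the question's code.

```cpp
// Assigning through a reference never reseats it.
int assign_through_reference() {
    int a = 1;
    int b = 2;
    int& ref = a;   // a reference must be initialized; ref is now another name for a
    ref = b;        // does NOT make ref refer to b; it copies b's value into a
    return a;       // a now holds 2
}

// A pointer can start out null and be pointed at different objects later.
int reseat_pointer() {
    int a = 1;
    int b = 2;
    int* p = nullptr;   // legal: a pointer may refer to nothing
    p = &a;             // p points at a
    p = &b;             // p is reseated to point at b
    *p = 5;             // modifies b only
    return a * 10 + b;  // a is still 1, b is 5
}
```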

Additionally, objects allocated on the heap can lead to memory leaks, whereas objects allocated on the stack will not.

So, use pointers when they are necessary, and references otherwise.

danben
Yes, that's why I pass the address to changestuff in both cases, thus the values at the addresses get changed and no local copy is made. How and why should I choose what method to pick then?
Oxymoron
This is completely correct, but I believe he was asking what would be the difference between passing an object by reference to a function or using a pointer to it.
Casey
@Casey - right, I didn't realize at first. Answer updated.
danben
@Oxymoron: As far as the end result, you have an object of some kind you need to pass into a function to be modified, it doesn't matter which one you pick. In my programming, I try to pass by reference wherever possible. I use pointers where I need to (like interfacing with other api's and such).
Casey
+2  A: 

You need to understand that references are NOT pointers. They may be implemented using them (or they may not) but a reference in C++ is a completely different beast to a pointer.

That being said, any function that takes a reference can be used with pointers simply by dereferencing them (and vice versa). Given:

class A {};
void f1( A & a ) {}     // parameter is reference
void f2( A * a ) {}     // parameter is pointer

you can say:

A a;
f1( a );
f2( &a );

and:

A * p = new A;
f1( *p );
f2( p );

Which should you use when? Well that comes down to experience, but general good practice is:

  • prefer to allocate objects automatically on the stack rather than using new whenever possible
  • pass objects using references (preferably const references) whenever possible
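
A small sketch of both guidelines together. The Rectangle struct and the function names are assumptions made for illustration, not the asker's actual class.

```cpp
// Hypothetical stand-in for the Rectangle class from the question.
struct Rectangle {
    int x = 0;
    int y = 0;
};

// Read-only access: a const reference avoids a copy and the
// compiler rejects any attempt to modify rec.
int sum_xy(const Rectangle& rec) { return rec.x + rec.y; }

// Modification is intended, so the reference is non-const.
void move_to_origin(Rectangle& rec) {
    rec.x = 0;
    rec.y = 0;
}

int demo() {
    Rectangle rect;   // automatic storage: destroyed at end of scope, no delete needed
    rect.x = 50;
    rect.y = 75;
    int before = sum_xy(rect);
    move_to_origin(rect);
    return before + sum_xy(rect);
}
```

The const in sum_xy's signature also documents to the caller that the object will not be changed.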
anon
You have a small typo in your `f2` definition.
Peter Alexander
Automatically on the stack would be the first method (A a), avoiding new?
Oxymoron
@Oxymoron Yes, that's right.
anon
Alright, apart from not having to delete objects/free memory (or rather risking forgetting), is there any physical benefit? Is the stack faster to access than the heap, for example?
Oxymoron
@Oxymoron No there isn't. And when you are programming, performance should be among the last of your concerns - fast code that doesn't work and can't be maintained is of no use to anyone.
anon
When you put an object on the stack, getting the storage for it is just a stack pointer move. When you allocate from the heap, the allocator has to find a suitable block and then allocate it. This can take significantly longer. When a stack frame is cleaned up, there is no bookkeeping involved. Returning a heap block does involve significant overhead. As well, in a multi-threaded program, each thread has its own stack, and there is no synchronization required for a stack variable. But if multiple threads are using the same allocator, it is necessary to synchronize access to that allocator.
Permaquid
+4  A: 

There really isn't any reason you can't. In C, you only had pointers. C++ introduces references, and passing by reference is usually the preferred way in C++. It produces cleaner code that is syntactically simpler.

Let's take your code and add a new function to it:

#include <iostream>
#include "Rectangle.h"

void changestuff(Rectangle& rec);
void changestuffbyPtr(Rectangle* rec);

int main()
{
    Rectangle rect;
    rect.set_x(50);
    rect.set_y(75);
    std::cout << "x,y: " << rect.get_x() << rect.get_y() << sizeof(rect) << std::endl;
    changestuff(rect);
    std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;

    changestuffbyPtr(&rect);
    std::cout << "x,y: " << rect.get_x() << rect.get_y() << std::endl;

    Rectangle* rectTwo = new Rectangle();
    rectTwo->set_x(15);
    rectTwo->set_y(30);
    std::cout << "x,y: " << rectTwo->get_x() << rectTwo->get_y() << std::endl;
    changestuff(*rectTwo);
    std::cout << "x,y: " << rectTwo->get_x() << rectTwo->get_y() << std::endl;

    changestuffbyPtr(rectTwo);
    std::cout << "x,y: " << rectTwo->get_x() << rectTwo->get_y() << std::endl;
    std::cout << rectTwo << std::endl;
}

void changestuff(Rectangle& rec)
{
    rec.set_x(10);
    rec.set_y(11);
}

void changestuffbyPtr(Rectangle* rec)
{
    rec->set_x(10);
    rec->set_y(11);
}

Difference between using the stack and heap:

#include <iostream>
#include "Rectangle.h"

Rectangle* createARect1();
Rectangle* createARect2();

int main()
{
    // this is being created on the stack which because it is being created in main,
    // belongs to the stack for main. This object will be automatically destroyed 
    // when main exits, because the stack that main uses will be destroyed.
    Rectangle rect;

    // rectTwo is being created on the heap. The memory here will *not* be released
    // after main exits (well technically it will be by the operating system)
    Rectangle* rectTwo = new Rectangle();

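    // this is going to create a memory leak unless we explicitly call delete on r1.
    Rectangle* r1 = createARect1();

    // r2 is a dangling pointer: the Rectangle it points to was destroyed when
    // createARect2() returned (most compilers warn about this inside createARect2()):
    Rectangle* r2 = createARect2();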
}

Rectangle* createARect1()
{
    // this will be creating a memory leak unless we remember to explicitly delete it:
    Rectangle* r = new Rectangle;
    return r;
}

Rectangle* createARect2()
{
    // this is not allowed, since when the function returns the rect will no longer
    // exist since its stack was destroyed after the function returns:
    Rectangle r;
    return &r;
}

It is also worth mentioning that a huge difference between pointers and references is that you cannot create a reference that is uninitialized. So this is perfectly legal:

int *b;

while this is not:

int& b;

A reference has to refer to something. This makes references unsuitable when you do not know at the point of declaration which object you will end up with, as often happens in polymorphic code. For instance:

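// let's assume A is some interface:
class A
{
public:
    virtual ~A() {}                  // lets derived objects be deleted through an A*
    virtual void doSomething() = 0;  // pure virtual: A cannot be instantiated
};

class B : public A
{
public:
    void doSomething() {}
};

class C : public A
{
public:
    void doSomething() {}
};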

int main()
{
    // since A contains a pure virtual function, we can't instantiate it. But we can    
    // instantiate B and C
    B* b = new B;
    C* c = new C;

    // or
    A* ab = new B;
    A* ac = new C;

    // but what if we didn't know at compile time which one to create? B or C?
    // we have to use pointers here, since a reference can't point to null or
    // be uninitialized
    A* a1 = 0;
    if (decideWhatToCreate() == CREATE_B)
        a1 = new B;
    else
        a1 = new C;
}
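
The pointer is needed above because a1 starts out null and is assigned later. When the object already exists, a reference to the base class dispatches virtually just as a pointer does, as one of the comments below points out. A rough sketch, where the name() member and the describe() function are hypothetical additions for illustration:

```cpp
#include <string>

// Hypothetical variant of the interface above, with a member that
// returns something observable.
class A {
public:
    virtual ~A() {}
    virtual std::string name() const = 0;
};

class B : public A {
public:
    std::string name() const override { return "B"; }
};

class C : public A {
public:
    std::string name() const override { return "C"; }
};

// Takes the base class by reference; the call is resolved at run time
// according to the dynamic type of the object passed in.
std::string describe(const A& a) { return a.name(); }
```

Passing a B to describe() yields "B" and passing a C yields "C", with no pointer in sight.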
Casey
Alright, but still, when to use Rectangle rect or Rectangle* rect2 = new Rectangle(); Why would I want it to be on the heap as opposed to on the stack, what benefit does it give me?
Oxymoron
Ah! Now this is the big question. Objects and primitives are created in two places, the Heap or the Stack. The difference is that unless you use the new keyword, everything is on the stack. This becomes a problem when you are passing objects around to functions as you need to be careful. See my recently updated code example for explanation.
Casey
Why does placing it on the stack become a problem? The stack can access the same amount of memory as the heap, can't it? And apart from the different syntax I can't see anything different in the method you provided. It's semantically exactly the same (from my point of view atm).
Oxymoron
Alright, thanks for the explanation!
Oxymoron
Aaah, that last edit was especially useful. Much better context to relate to :)
Oxymoron
@Casey: Of course a reference can be used polymorphically. The OP's example of `changestuff(Rectangle&)` could be passed a reference to a `Rectangle`, *or* a reference to an instance of a class *derived* from `Rectangle`.
quamrana
+2  A: 

In C++, objects can be allocated on the heap or on the stack. Stack objects are valid only locally: when you leave the current function, its stack frame and everything in it is destroyed.

In contrast, heap objects (which must be explicitly allocated using new) live until you delete them.

Now the idea is that you, as a caller, should not need to know what a method does internally (encapsulation). Since the method might actually store and keep the reference you have passed to it, this can be dangerous: when the calling method returns, its stack objects are destroyed, but the stored references remain and now refer to objects that no longer exist.

In your simple example, it all doesn't matter too much because the program will end when main() exits anyhow. However, for every program that is just a little more complex, this can lead to serious trouble.
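
One way to make "the callee keeps what it was given" safe is to hand over a heap object together with its ownership. The sketch below uses std::unique_ptr, which this answer does not mention, and the Scene class and fill() function are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Rectangle class from the question.
struct Rectangle {
    int x = 0;
    int y = 0;
};

// Hypothetical class that stores the objects passed to it.
class Scene {
public:
    // Taking a unique_ptr by value makes the ownership transfer explicit:
    // the caller has to std::move the pointer in.
    void add(std::unique_ptr<Rectangle> rec) { items_.push_back(std::move(rec)); }
    std::size_t size() const { return items_.size(); }
    int first_x() const { return items_.front()->x; }
private:
    std::vector<std::unique_ptr<Rectangle>> items_;
};

// The Rectangle created here outlives this function, because the Scene
// owns it. It is deleted automatically when the Scene is destroyed.
void fill(Scene& scene) {
    std::unique_ptr<Rectangle> rec(new Rectangle());
    rec->x = 15;
    scene.add(std::move(rec));
}
```

Had fill() passed the address of a local Rectangle instead, the Scene would be left holding a pointer to a destroyed object.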

mnemosyn
I think this is the most important point about stack vs heap. If you are going to transfer ownership, you have to use the heap. I would add - use a smart pointer, not a raw pointer.
Permaquid
A: 

Quite a few application domains require the use of pointers. Pointers are needed when you have intimate knowledge about how your memory is laid out. This knowledge could be because you intended the memory to be laid out in a certain way, or because the layout is out of your control. When this is the case you need pointers.

Why would you manually structure the memory for a certain problem domain? Well, an optimal memory layout for certain problems can be orders of magnitude faster than if you used traditional techniques.

Example domains:

  1. Enterprise Databases.
  2. Kernel design.
  3. Drivers.
  4. General purpose Linear Algebra.
  5. Binary Data serialization.
  6. Slab Memory allocators for transaction processing (web-servers).
  7. Video game engines.
  8. Embedded real-time programming.
  9. Image processing.
  10. Unicode Utility functions.
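
As a small illustration of the binary serialization case, here is a sketch that walks a raw byte buffer with a pointer. The Record layout (a 2-byte id followed by a 4-byte value, in the host's byte order) and the function names are assumptions made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical record: on the wire it is a 2-byte id followed by a
// 4-byte value, packed with no padding, in the host's byte order.
struct Record {
    std::uint16_t id;
    std::uint32_t value;
};

// Writes the fields back to back into buffer (needs at least 6 bytes).
void write_record(const Record& r, unsigned char* buffer) {
    unsigned char* p = buffer;
    std::memcpy(p, &r.id, sizeof r.id);
    p += sizeof r.id;                       // advance past the id
    std::memcpy(p, &r.value, sizeof r.value);
}

// Reads the fields back out. memcpy is used instead of casting the
// pointer, which avoids alignment and aliasing problems.
Record read_record(const unsigned char* buffer) {
    Record r;
    const unsigned char* p = buffer;
    std::memcpy(&r.id, p, sizeof r.id);
    p += sizeof r.id;
    std::memcpy(&r.value, p, sizeof r.value);
    return r;
}
```

The layout in the buffer is dictated by the format, not by the compiler, which is why the code has to address bytes directly.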
Hassan Syed
A: 

You are right to say that the actual Rectangle object isn't passed, merely a reference to it. In fact you can never 'pass' any object or anything else really. You can only 'pass' a copy of something as a parameter to a function.

The something that you can pass could be a copy of a value, like an int, or a copy of an object, or a copy of a pointer or reference. So, in my mind, passing a copy of either a pointer or a reference is logically the same thing. Syntactically it's different, hence the parameter being either rect or *rectTwo.

References in C++ are a distinct advantage over C, since they allow the programmer to declare and define operators that look syntactically identical to those that are available for integers.
E.g. the form a=b+c can be used for ints or Rectangles.
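
A minimal sketch of such an operator. The Rectangle class below is a hypothetical stand-in, and the meaning given to "+" (adding the coordinates component-wise) is an arbitrary choice for illustration.

```cpp
// Hypothetical stand-in for the Rectangle class from the question.
class Rectangle {
public:
    Rectangle(int x, int y) : x_(x), y_(y) {}
    int get_x() const { return x_; }
    int get_y() const { return y_; }
private:
    int x_;
    int y_;
};

// The operands are taken by const reference, so the call site reads
// a = b + c with no & or * in sight, and no copies of b and c are made.
Rectangle operator+(const Rectangle& lhs, const Rectangle& rhs) {
    return Rectangle(lhs.get_x() + rhs.get_x(), lhs.get_y() + rhs.get_y());
}
```

With pointer parameters the same call would have to be written as something like &b + &c, which is not allowed for built-in pointer types anyway.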

This is why you can have changestuff(rect); because the parameter is a reference, and a reference to (pointer to) rect is taken automatically. When you have the pointer Rectangle* rectTwo; it is an 'object' in its own right and you can operate on it, e.g. reassign it or increment it. C++ has chosen not to convert this to a reference to an object; you have to do this manually by 'dereferencing' the pointer to get to the object, which is then automatically converted to a reference. This is what *rectTwo means: dereferencing a pointer.

So, rectTwo is a pointer to a Rectangle, but rect is a rectangle, or a reference to a Rectangle.

quamrana