views:

2569

answers:

22

Suppose that I define some class:

class Pixel {
    public:
      Pixel(){ x=0; y=0;};
      int x;
      int y;
};

Then write some code using it. Why would I do the following?

Pixel p;
p.x = 2;
p.y = 5;

Coming from a Java world I always write:

Pixel* p = new Pixel();
p->x = 2;
p->y = 5;

They basically do the same thing, right? One is on the stack while the other is on the heap, so I'll have to delete it later on. Is there any fundamental difference between the two? Why should I prefer one over the other?

+5  A: 

My gut reaction is just to tell you that this could lead to serious memory leaks. Some situations in which you might be using pointers could lead to confusion about who should be responsible for deleting them. In simple cases such as your example, it's easy enough to see when and where you should call delete, but when you start passing pointers between classes, things can get a little more difficult.

I'd recommend looking into the boost smart pointers library for your pointers.
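
For example, with a reference-counted smart pointer the delete happens automatically once the last owner goes away. A sketch using boost::shared_ptr (Pixel as in the question):

#include <boost/shared_ptr.hpp>

void example() {
    boost::shared_ptr<Pixel> p(new Pixel());
    p->x = 2;
    p->y = 5;
    // No delete needed: the Pixel is destroyed when the last shared_ptr
    // referring to it goes out of scope, even if copies of p were handed
    // to other classes or stored in containers.
}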

Eric
+21  A: 

I prefer to use the first method whenever I get the chance because:

  • it's faster
  • I don't have to worry about memory deallocation
  • p will be a valid object for the entire current scope
rpg
+5  A: 

The best reason not to new everything is that you get very deterministic cleanup when things are on the stack. In the case of Pixels this is not so obvious, but in the case of, say, a file, it becomes advantageous:

  {   // block of code that uses file
      File aFile("file.txt");
      ...
  }    // File destructor fires when file goes out of scope, closing the file
  aFile // can't access outside of scope (compiler error)

In the case of newing a file, you would have to remember to delete it to get the same behavior. Seems like a simple issue in the above case. Consider more complex code, however, such as storing the pointers in a data structure. What if you pass that data structure to another piece of code? Who is responsible for the cleanup? Who would close all your files?

When you don't new everything, the resources are just cleaned up by the destructor when the variable goes out of scope. So you can have greater confidence that resources are successfully cleaned up.

This concept is known as RAII -- Resource Acquisition Is Initialization -- and it can drastically improve your ability to deal with resource acquisition and disposal.
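
For illustration, a minimal sketch of what such a File wrapper might look like (the File class here is hypothetical, wrapping the C stdio API):

#include <cstdio>
#include <stdexcept>

class File {
public:
    explicit File(const char* path) : handle_(std::fopen(path, "r")) {
        if (!handle_) throw std::runtime_error("could not open file");
    }
    ~File() { std::fclose(handle_); }  // runs on scope exit, even if an exception is propagating
private:
    std::FILE* handle_;

    // Not copyable: two File objects must not close the same handle.
    File(const File&);
    File& operator=(const File&);
};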

Doug T.
+11  A: 

"Why not use pointers for everything in C++"

One simple answer - because it becomes a huge problem managing the memory - allocating and deleting/freeing.

Automatic/stack objects remove some of the busy work of that.

That is just the first thing I would say about the question.

Tim
+3  A: 

I'd say it's a lot about a matter of taste. If you create an interface allowing methods to take pointers instead of references, you are allowing the caller to pass in nil. Since you allow the user to pass in nil, the user will pass in nil.

Since you have to ask yourself "What happens if this parameter is nil?", you have to code more defensively, taking care of null checks all the time. This speaks for using references.

However, sometimes you really want to be able to pass in nil, and then references are out of the question :) Pointers give you greater flexibility and allow you to be more lazy, which is really good. Never allocate until you know you have to!
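
A sketch of the difference (hypothetical functions, using the Pixel class from the question):

#include <cstddef>  // for NULL

// Takes a reference: the caller has to supply an object, so no null check is needed.
void moveRight(Pixel& p) {
    p.x += 1;
}

// Takes a pointer: the caller may pass NULL, so we have to code defensively.
void moveRightChecked(Pixel* p) {
    if (p == NULL) {
        return;  // or signal an error; either way it is extra handling
    }
    p->x += 1;
}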

Magnus Skog
He wasn't referring to function arguments but instead was talking about where things are allocated (heap vs. stack). He noted that Java just puts all objects on the heap (I've heard of some clever trickery in modern versions to put some objects on the stack automatically).
Evan Teran
I think you're answering a different question about pointers vs. references; rather than the OP's question about stack-based or heap-based objects.
saw-lau
Evan and saw-lau. You're right. I was trigger happy :)
Magnus Skog
+21  A: 

Logically they do the same thing -- except for cleanup. It's just that the example code you've written has a memory leak in the pointer case, because that memory is never released.

Coming from a Java background, you may not be completely prepared for how much of C++ revolves around keeping track of what has been allocated and who is responsible for freeing it.

By using stack variables when appropriate, you don't have to worry about freeing that variable; it goes away with the stack frame.

Obviously, if you're super careful, you can always allocate on the heap and free manually, but part of good software engineering is to build things in such a way that they can't break, rather than trusting your super-human programmer-fu to never make a mistake.

Clyde
+3  A: 

Object lifetime. When you want the lifetime of your object to exceed the lifetime of the current scope, you must use the heap.

If, on the other hand, you don't need the variable beyond the current scope, declare it on the stack. It will automatically be destroyed when it goes out of scope. Just be careful about passing its address around.
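
For example (again with the question's Pixel class; the factory function is hypothetical):

// Lifetime must outlive the current scope: allocate on the heap.
// The caller now owns the object and must delete it eventually
// (or hand it to a smart pointer).
Pixel* makePixel() {
    Pixel* p = new Pixel();
    p->x = 2;
    p->y = 5;
    return p;
}

// Wrong: returns the address of a stack object that is destroyed on return.
Pixel* makePixelDangling() {
    Pixel p;
    p.x = 2;
    p.y = 5;
    return &p;  // dangling pointer; using it is undefined behaviour
}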

Matt Brunell
+9  A: 

The code:

Pixel p;
p.x = 2;
p.y = 5;

does no dynamic allocation of memory - there is no searching for free memory, no updating of memory usage, nothing. It is totally free. The compiler reserves space on the stack for the variable at compile time - it works out how much space to reserve and creates a single opcode to move the stack pointer the required amount.

Using new requires all that memory management overhead.

The question then becomes - do you want to use stack space or heap space for your data. Stack (or local) variables like 'p' require no dereferencing whereas using new adds a layer of indirection.

Skizz
+124  A: 

Yes, one is on the stack, the other on the heap. There are two important differences:

  • First, the obvious, and less important one: Heap allocations are slow. Stack allocations are fast. And if it is on the stack, you don't have the extra layer of pointer indirection, which also speeds up access to the class.
  • Second, and much more important, is RAII. Because the stack-allocated version is automatically cleaned up when it goes out of scope, its destructor is automatically called, which allows you to guarantee that any resources allocated by the class get cleaned up. This is essentially how you avoid memory leaks in C++: you avoid them by never calling delete yourself, instead wrapping allocations in stack-allocated objects which call delete internally, typically in their destructor. If you attempt to manually keep track of all allocations and call delete at the right times, I guarantee you that you'll have at least one memory leak per 100 lines of code.

As a small example, consider this code:

class Pixel {
public:
  Pixel(){ x=0; y=0;};
  int x;
  int y;
};

void foo() {
  Pixel* p = new Pixel();
  p->x = 2;
  p->y = 5;

  bar();

  delete p;
}

Pretty innocent code, right? We create a pixel, then we call some unrelated function, and then we delete the pixel. Is there a memory leak?

And the answer is "possibly". What happens if bar throws an exception? delete never gets called, the pixel is never deleted, and we leak memory. Now consider this:

void foo() {
  Pixel p;
  p.x = 2;
  p.y = 5;

  bar();
}

This won't leak memory. Of course in this simple case, everything is on the stack, so it gets cleaned up automatically, but even if the Pixel class had made a dynamic allocation internally, that wouldn't leak either. The Pixel class would simply be given a destructor that deletes it, and this destructor would be called no matter how we leave the foo function. Even if we leave it because bar threw an exception. The following, slightly contrived example shows this:

class Pixel {
public:
  Pixel(){ x=new int(0); y=new int(0);};
  int* x;
  int* y;

  ~Pixel() {
    delete x;
    delete y;
  }
};

void foo() {
  Pixel p;
  *p.x = 2;
  *p.y = 5;

  bar();
}

The Pixel class now internally allocates some heap memory, but its destructor takes care of cleaning it up, so when using the class, we don't have to worry about it. (I should probably mention that the last example here is simplified a lot, in order to show the general principle. If we were to actually use this class, it contains several possible errors too. If the allocation of y fails, x never gets freed, and if the Pixel gets copied, we end up with both instances trying to delete the same data. So take the final example here with a grain of salt. Real-world code is a bit trickier, but it shows the general idea)
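
One way to close those gaps would be to let each member manage its own allocation, so the class no longer needs a hand-written destructor at all. A rough sketch using boost::scoped_ptr:

#include <boost/scoped_ptr.hpp>

void bar();  // may throw, as before

class Pixel {
public:
  Pixel() : x(new int(0)), y(new int(0)) {}
  // No destructor needed: each scoped_ptr deletes its own int. If the
  // allocation of y throws, the already-constructed x is destroyed
  // automatically, so nothing leaks. scoped_ptr is also noncopyable, so
  // the double delete you'd get from copying a Pixel becomes a compile
  // error instead of a runtime bug.
  boost::scoped_ptr<int> x;
  boost::scoped_ptr<int> y;
};

void foo() {
  Pixel p;
  *p.x = 2;
  *p.y = 5;

  bar();
}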

Of course the same technique can be extended to other resources than memory allocations. For example it can be used to guarantee that files or database connections are closed after use, or that synchronization locks for your threading code are released.

jalf
+1. Although, 1leak/100loc is too much. Maybe 1 per 1000 lines of code.
Milan Babuškov
+1 for being the first to mention RAII (that I saw anyways)
rmeador
@Milan: In the face of exceptions I'd say 100 is probably closer than 1000.
mghie
Yeah, you'll probably be able to write the first 500 lines with no leaks. And then you add another 100 lines, which contains 6 different ways to leak the same data, all in the same function. Of course, I haven't measured this, but it sounded good. :)
jalf
I'm sure you meant that to be Pixel p; in the second example rather than Pixel* p
dominic hamon
yeah, just found and fixed that :)
jalf
+1 wow. I understand it. Nice jalf! :)
Zack
+1 very nice and simple!
Secko
Shame for that 200 rep limit, huh? :O
GMan
Instead of all this i thought using auto_ptr would be easy.
Uday
What do you mean "instead"? auto_ptr is an example of RAII in action. You declare your auto_ptr on the stack, and it internally manages the memory allocation you made it responsible for, making sure it is deleted when the auto_ptr goes out of scope. :)
jalf
Good explanation. The best part is, your third example contains a memory leak. If x's allocation succeeds and y fails due to std::bad_alloc, you've now lost x forever, as the destructor will not be run. And even if the language specified that the destructor *would* run, you would have the problem of deleting an uninitialized y.
Tom
yeah, I know, I cheated with the last one, it was just to show an example of a class managing dynamic allocations internally, so the calling code doesn't have to worry about it. It also fails horribly because it doesn't have copy ctor or assignment operator. But it's simple, and it shows the general idea. :)
jalf
How would you get around the potential memory leak in the third example (if y's allocation fails)?
MCS
I would probably let the Pixel class store RAII wrappers for x and y, so that they are responsible for their own memory allocations, rather than letting Pixel do it. So basically just take it one step further. As Tom pointed out, as soon as one object has to keep track of multiple allocations, it becomes pretty tricky to do correctly. So often, the simple solution is to split it up again, until each object only has to track one allocation.
jalf
Wow, 100 upvotes :O
GMan
wow indeed :D
jalf
+5  A: 

A good general rule of thumb is to NEVER use new unless you absolutely have to. Your programs will be easier to maintain and less error-prone if you don't use new, because then you don't have to worry about where to clean things up.

Steve
+1  A: 

Use pointers and dynamically allocated objects ONLY WHEN YOU MUST. Use statically allocated (global or stack) objects wherever possible.

  • Static objects are faster (no new/delete, no indirection to access them)
  • No object lifetime to worry about
  • Fewer keystrokes
  • More readable
  • Much more robust. Every "->" is a potential access to NIL or invalid memory

To clarify, by 'static' in this context, I mean non-dynamically allocated. IOW, anything NOT on the heap. Yes, they can have object lifetime issues too - in terms of singleton destruction order - but sticking them on the heap doesn't usually solve anything.

Roddy
-1 from someone - care to comment??
Roddy
I can't say I like the "static" advice. First, it doesn't solve the problem (since static objects can't be allocated at runtime), and second, they have plenty of problems of their own (thread safety for example). That said, I didn't -1 you.
jalf
You should also note that statics have both start and stop lifetime issues (google for "static initialization order fiasco"). That said, i didn't -1 you either. So don't do anything to me, please! :)
Johannes Schaub - litb
@Roddy - Did you mean "automatic" (stack-allocated) instead of "static"? (And I didn't -1 you either.)
Fred Larson
@jalf- maybe 'static' wasn't the best word. Are you thinkng of the problem of singleton construction locking from multiple threads?
Roddy
I'm thinking of all variables declared with the "static" keyword. If that wasn't what you meant, you should probably avoid that word. :) Like Fred said, objects on the stack have "automatic" storage class. If that's what you meant, your answer makes a lot more sense.
jalf
+3  A: 

The first case is not always stack allocated. If it's part of an object, it'll be allocated wherever the object is. For example:

class Rectangle {
    Pixel top_left;
    Pixel bottom_right;
};

Rectangle r1; // Pixel is allocated on the stack
Rectangle *r2 = new Rectangle(); // Pixel is allocated on the heap

The main advantages of stack variables are:

  • You can use the RAII pattern to manage objects. As soon as the object goes out of scope, its destructor is called. Kind of like the "using" pattern in C#, but automatic.
  • There's no possibility of a null reference.
  • You don't need to worry about manually managing the object's memory.
  • It causes fewer memory allocations. Memory allocations, particularly small ones, are likely to be slower in C++ than Java.

Once the object's been created, there's no performance difference between an object allocated on the heap, and one allocated on the stack (or wherever).

However, you can't use any kind of polymorphism unless you're using a pointer or reference - otherwise the object has a completely static type, which is determined at compile time.
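
For example, with hypothetical Shape/Circle classes:

#include <iostream>

class Shape {
public:
    virtual ~Shape() {}
    virtual void draw() const { std::cout << "Shape\n"; }
};

class Circle : public Shape {
public:
    virtual void draw() const { std::cout << "Circle\n"; }
};

void example() {
    Circle c;

    Shape s = c;     // slicing: s has the static type Shape
    s.draw();        // prints "Shape"

    Shape& r = c;    // a reference (or pointer) preserves the dynamic type
    r.draw();        // prints "Circle"
}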

BlackAura
+7  A: 

Yes, at first that makes sense, coming from a Java or C# background. It doesn't seem like a big deal to have to remember to free the memory you allocated. But then when you get your first memory leak, you'll be scratching your head, because you SWORE you freed everything. Then the second time it happens and the third you'll get even more frustrated. Finally after six months of headaches due to memory issues you'll start to get tired of it and that stack-allocated memory will start to look more and more attractive. How nice and clean -- just put it on the stack and forget about it. Pretty soon you'll be using the stack any time you can get away with it.

But -- there's no substitute for that experience. My advice? Try it your way, for now. You'll see.

eeeeaaii
You forgot to mention its evil twin, double frees. :) Just when you think you've freed all your memory, you start getting errors because you're using memory after it's been freed, or you try to free memory that has already been freed.
jalf
+17  A: 

They are not the same until you add the delete.
Your example is overly trivial, but the destructor may actually contain code that does some real work. This is referred to as RAII.

So add the delete. Make sure it happens even when exceptions are propagating.

Pixel* p = NULL; // Must do this. Otherwise new may throw and then
                 // you would be attempting to delete an invalid pointer.
try
{
    p = new Pixel(); 
    p->x = 2;
    p->y = 5;

    // Do Work
    delete p;
}
catch(...)
{
    delete p;
    throw;
}

If you had picked something more interesting, like a file (which is a resource that needs to be closed), then to do it correctly in Java you would need to do this:

File file;
try
{
    file = new File("Plop");
    // Do work with file.
}
finally
{
    try
    {
        file.close();     // Make sure the file handle is closed.
                          // Otherwise the resource will be leaked until
                          // eventual Garbage collection.
    }
    catch(Exception e) {};// Need the extra try catch to catch and discard
                          // Irrelevant exceptions. 

    // Note it is bad practice to allow exceptions to escape a finally block.
    // If they do and there is already an exception propagating, you lose
    // the original exception, which probably has more relevant information
    // about the problem.
}

The same code in C++

std::fstream  file("Plop");
// Do work with file.

// Destructor automatically closes file and discards irrelevant exceptions.

Though people mention the speed (because of finding/allocating memory on the heap), personally this is not a deciding factor for me (the allocators are very quick and have been optimized for C++'s usage of small objects that are constantly created/destroyed).

The main reason for me is object lifetime. A locally defined object has a very specific and well-defined lifetime, and the destructor is guaranteed to be called at the end (and thus can have specific side effects). A pointer, on the other hand, controls a resource with a dynamic lifespan.

The main difference between C++ and Java is:

The concept of who owns the pointer. It is the responsibility of the owner to delete the object at the appropriate time. This is why you very rarely see raw pointers like that in real programs (as there is no ownership information associated with a raw pointer). Instead pointers are usually wrapped in smart pointers. The smart pointer defines the semantics of who owns the memory and thus who is responsible for cleaning it up.

Examples are:

 std::auto_ptr<Pixel>   p(new Pixel);
 // An auto_ptr has move semantics.
 // When you pass an auto_ptr to a method you are saying here take this. You own it.
 // Delete it when you are finished. If the receiver takes ownership it usually saves
 // it in another auto_ptr and the destructor does the actual dirty work of the delete.
 // If the receiver does not take ownership it is usually deleted.

 std::tr1::shared_ptr<Pixel> p(new Pixel); // aka boost::shared_ptr
 // A shared ptr has shared ownership.
 // This means it can have multiple owners each using the object simultaneously.
 // As each owner finished with it the shared_ptr decrements the ref count and 
 // when it reaches zero the objects is destroyed.

 boost::scoped_ptr<Pixel>  p(new Pixel);
 // Makes it act like a normal stack variable.
 // Ownership is not transferable.

There are others.

Martin York
nice post. +1
jalf
I like comparing the C++ file usage against Java (makes me smile).
Martin York
agreed. And bonus points because it shows RAII being used to manage other types of resources than just memory allocations.
jalf
+1  A: 

Looking at the question from a different angle...

In C++ you can reference objects using pointers (Foo *) and references (Foo &). Wherever possible, I use a reference instead of a pointer. For instance, when passing by reference to a function/method, using references allows the code to (hopefully) make the following assumptions:

  • The object referenced is not owned by function/method, therefore should not delete the object. It's like saying, "Here, use this data but give it back when you're done".
  • NULL pointer references are less probable. It is possible to be passed a NULL reference, but at least it won't be the fault of the function/method. A reference cannot be reassigned to a new pointer address, so your code could not have accidentally reassigned it to NULL or some other invalid pointer address, causing a page fault.
spoulson
+1  A: 

The question is: why would you use pointers for everything? Stack allocated objects are not only safer and faster to create but there is even less typing and the code looks better.

Nemanja Trifunovic
A: 

Something that I haven't seen mentioned is the increased memory usage. Assuming 4-byte integers and pointers,

Pixel p;

will use 8 bytes, and

Pixel* p = new Pixel();

will use 12 bytes, a 50% increase. It doesn't sound like a lot until you allocate enough for a 512x512 image. Then you are talking 3MB instead of 2MB. This is ignoring the overhead of managing the heap with all of these objects on it.
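
A quick back-of-the-envelope check (assuming 4-byte ints and pointers, and ignoring allocator bookkeeping):

#include <cstddef>
#include <iostream>

struct Pixel { int x; int y; };

int main() {
    const std::size_t count = 512 * 512;

    // Values stored directly (e.g. a contiguous array or vector of Pixel):
    std::cout << count * sizeof(Pixel) << " bytes\n";                    // ~2MB with the sizes above

    // One heap object plus one pointer per Pixel:
    std::cout << count * (sizeof(Pixel) + sizeof(Pixel*)) << " bytes\n"; // ~3MB with the sizes above

    return 0;
}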

KeithB
That should read MB, not GB.
mghie
Your right. I fixed it.
KeithB
"Your right." - whose right?
macbirdie
+1  A: 

The issue isn't pointers per se (aside from introducing NULL pointers), but doing memory management by hand.

The funny part, of course, is that every Java tutorial I've seen has mentioned that the garbage collector is such cool hotness because you don't have to remember to call delete, when in practice C++ only requires a delete when you call new (and a delete[] when you call new[]).
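
That pairing, spelled out in a trivial sketch:

void example() {
    int* one  = new int(42);   // single object: release with delete
    int* many = new int[10];   // array: release with delete[]

    delete one;
    delete[] many;             // mixing up delete and delete[] is undefined behaviour
}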

Max Lybbert
A: 

Objects created on the stack are created faster than heap-allocated objects.

Why?

Because allocating memory (with the default memory manager) takes some time (to find a free block, or even to allocate a new one).

Also you don't have memory management problems as the stack object automatically destroys itself when out of scope.

The code is simpler when you don't use pointers. If your design allows you to use stack objects, I recommend that you do it.

I myself wouldn't complicate the problem using smart pointers.

OTOH I have worked a little in the embedded field and creating objects on the stack is not very smart (as the stack allocated for each task/thread is not very big - you must be careful).

So it's a matter of choice and restrictions, there is no response to fit them all.

And, as always don't forget to keep it simple, as much as possible.

Iulian Şerbănoiu
A: 

The first case is best unless more members are added to the Pixel class. As more and more members get added, there is a possibility of a stack overflow.

Uday
This is just wrong. Stack usage (i.e. class size) does not increase due to number of functions.
GMan
I meant member variables, not methods. Sorry if I was not clear.
Uday
A: 

Basically, when you use raw pointers, you do not have RAII.

A: 

Why not use pointers for everything?

They're slower.

Compiler optimizations will not be as effective with pointer access semantics; you can read up about it on any number of web sites, but here's a decent PDF from Intel.

Check pages 13, 14, 17, 28, 32, 36.

Detecting unnecessary memory references in the loop notation:

for (i = j + 1; i <= *n; ++i) {
    X(i) -= temp * AP(k);
}

The notation for the loop boundaries contains the pointer or memory reference. The compiler does not have any means to predict whether the value referenced by pointer n is being changed across loop iterations by some other assignment. This forces the loop to reload the value referenced by n on each iteration. The code generator engine also may refuse to schedule a software-pipelined loop when potential pointer aliasing is found. Since the value referenced by pointer n is not changing within the loop and it is invariant to the loop index, the load of *n has to be carried outside of the loop boundaries for simpler scheduling and pointer disambiguation.
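
In C++ terms, the fix described here amounts to hoisting the load of *n into a local yourself. A sketch, where X, AP, temp, j, k and n stand in for the paper's variables:

void update(double* X, const double* AP, double temp,
            int j, int k, const int* n) {
    const int limit = *n;  // load *n once; its value is now clearly loop-invariant
    for (int i = j + 1; i <= limit; ++i) {
        X[i] -= temp * AP[k];
    }
}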

... a number of variations on this theme....

Complex memory references. Or in other words, analyzing references such as complex pointer computations, strain the ability of compilers to generate efficient code. Places in the code where the compiler or the hardware is performing a complex computation in order to determine where the data resides, should be the focus of attention. Pointer aliasing and code simplification assist the compiler in recognizing memory access patterns, allowing the compiler to overlap memory access with data manipulation. Reducing unnecessary memory references may expose to the compiler the ability to pipeline the software. Many other data location properties, such as aliasing or alignment, can be easily recognized if memory reference computations are kept simple. Use of strength reduction or inductive methods to simplify memory references is crucial to assisting the compiler.

RandomNickName42