I've been using C++ for a short while, and I've been wondering about the new keyword. Simply, should I be using it, or not?

1) With the new keyword...

MyClass* myClass = new MyClass();
myClass->MyField = "Hello world!";

2) Without the new keyword...

MyClass myClass;
myClass.MyField = "Hello world!";

From an implementation perspective, they don't seem that different (but I'm sure they are)... to me, as a new C++ user, the advantage of the 2nd method seems that you don't have to type as much, which is cool... However, my primary language is C#, and of course the 1st method is what I'm used to.

The difficulty seems to be that method 1 is harder to use with the std C++ classes.

Which method should I use?

Update 1:

I recently used the new keyword to allocate heap (or free store) memory for a large array that was going out of scope (i.e. being returned from a function). Before, I was using the stack, which left half of the elements corrupt outside of scope; switching to the heap ensured that the elements stayed intact. Yay!

Update 2:

A friend of mine recently told me there's a simple rule for using the new keyword; every time you type new, type delete.

Foobar *foobar = new Foobar();
delete foobar; // TODO: Move this to the right place.

This helps to prevent memory leaks, as you always have to put the delete somewhere (i.e. when you cut and paste it to either a destructor or otherwise).

A: 

The short answer is yes: the new keyword matters, because an object created with it is stored on the heap rather than on the stack, and that difference is the important one.

RAGNO
+1  A: 

The simple answer is yes - new creates an object on the heap (with the unfortunate side effect that you have to manage its lifetime by explicitly calling delete on it), whereas the second form creates an object on the stack in the current scope, and that object will be destroyed when it goes out of scope.

Timo Geusch
I believe you meant to put: The simple answer is yes
TStamper
Indeed, thank you.
Timo Geusch
+27  A: 

Method 1 (using new)

  • Allocates memory for the object on the free store (This is frequently the same thing as the heap)
  • Requires you to explicitly delete your object later. (If you don't delete it, you could create a memory leak)
  • Memory stays allocated until you delete it. (i.e. you could return an object that you created using new)
  • The example in the question will leak memory unless the pointer is deleted; and it should always be deleted, regardless of which control path is taken, or if exceptions are thrown.

Method 2 (not using new)

  • Allocates memory for the object on the stack (where all local variables go). There is generally less memory available for the stack; if you allocate too many objects, you risk stack overflow.
  • You won't need to delete it later.
  • Memory is no longer allocated when it goes out of scope. (i.e. you shouldn't return a pointer to an object on the stack)

As far as which one to use; you choose the method that works best for you, given the above constraints.

Some easy cases:

  • If you don't want to worry about calling delete, (and the potential to cause memory leaks) you shouldn't use new.
  • If you'd like to return a pointer to your object from a function, you must use new.
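To make the two lifetimes concrete, here is a minimal sketch. The MyClass below is a hypothetical stand-in for the one in the question, with a live counter added purely for illustration; the function names are made up as well.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the question's MyClass.
// The static counter is an illustration aid, not part of the question.
struct MyClass {
    static int live;          // number of MyClass objects currently alive
    std::string MyField;
    MyClass() { ++live; }
    ~MyClass() { --live; }
};
int MyClass::live = 0;

// Method 1: the object outlives this function; the caller must delete it.
MyClass* makeOnFreeStore() {
    MyClass* p = new MyClass();
    p->MyField = "Hello world!";
    return p;
}

// Method 2: the object is destroyed automatically at the closing brace.
int countInsideScope() {
    MyClass local;
    local.MyField = "Hello world!";
    return MyClass::live;     // local is still alive at this point
}
```

After countInsideScope() returns, the counter is back to zero with no delete in sight. After makeOnFreeStore() returns, it stays at one until the caller deletes the pointer.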
Daniel LeCheminant
One nitpick -- I believe that the new operator allocates memory from "free store", while malloc allocates from "heap". These are not guaranteed to be the same thing, although in practice they usually are. See http://www.gotw.ca/gotw/009.htm.
Fred Larson
I think your answer could be clearer on which to use. (99% of the time, the choice is simple. Use method 2, on a wrapper object which calls new/delete in constructor/destructor)
jalf
@jalf: Method 2 is the one that doesn't use new :-/ In any case, there are many times that your code will be much simpler (e.g. handling error cases) using Method 2 (the one without new).
Daniel LeCheminant
Another nitpick... You should make it more obvious that Nick's first example leaks memory, whereas his second doesn't, even in the face of exceptions.
Arafangion
@Fred, Arafangion: Thanks for your insight; I've incorporated your comments into the answer.
Daniel LeCheminant
A: 

The second method creates the instance on the stack, along with such things as something declared int and the list of parameters that are passed into the function.

The first method makes room for a pointer on the stack, which you've set to the location in memory where a new MyClass has been allocated on the heap - or free store.

The first method also requires that you delete what you create with new, whereas in the second method, the class is automatically destructed and freed when it falls out of scope (the next closing brace, usually).

greyfade
A: 

Without the new keyword you're storing the object on the call stack. Storing excessively large variables on the stack will lead to stack overflow.

vartec
+7  A: 

Which method should I use?

This is almost never determined by your typing preferences but by the context. If you need to keep the object alive across a few stack frames, or if it's too heavy for the stack, you allocate it on the free store. Also, since you are allocating the object, you are also responsible for releasing its memory. Look up the delete operator.

To ease the burden of using free-store management people have invented stuff like auto_ptr and unique_ptr. I strongly recommend you take a look at these. They might even be of help to your typing issues ;-)
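As a rough sketch of what that looks like with std::unique_ptr (Widget and the function names are hypothetical, and the live counter is only there to make the cleanup observable):

```cpp
#include <cassert>
#include <memory>

// Hypothetical type; the counter only makes destruction observable.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// The free-store object is handed straight to a unique_ptr, which owns it.
std::unique_ptr<Widget> makeWidget() {
    return std::unique_ptr<Widget>(new Widget());
}

int useWidget() {
    std::unique_ptr<Widget> w = makeWidget();
    return Widget::live;      // the Widget is alive while w owns it
}                             // w's destructor calls delete here
```

No delete appears anywhere in the calling code; the pointer's destructor does it when w goes out of scope.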

dirkgently
+1  A: 

Are you passing myClass out of a function, or expecting it to exist outside that function? As some others said, it is all about scope when you aren't allocating on the heap. When you leave the function, it goes away (eventually). One of the classic mistakes made by beginners is to create a local object of some class in a function and return a pointer or reference to it without allocating it on the heap. I can remember debugging this kind of thing back in my earlier days doing C++.

itsmatt
A: 

If your variable is used only within the context of a single function, you're better off using a stack variable, i.e., Option 2. As others have said, you do not have to manage the lifetime of stack variables - they are constructed and destructed automatically. Also, allocating/deallocating a variable on the heap is slow by comparison. If your function is called often enough, you'll see a tremendous performance improvement if you use stack variables versus heap variables.

That said, there are a couple of obvious instances where stack variables are insufficient.

If the stack variable has a large memory footprint, then you run the risk of overflowing the stack. By default, the stack size of each thread is 1 MB on Windows. It is unlikely that you'll create a stack variable that is 1 MB in size, but you have to keep in mind that stack utilization is cumulative. If your function calls a function which calls another function which calls another function which..., the stack variables in all of these functions take up space on the same stack. Recursive functions can run into this problem quickly, depending on how deep the recursion is. If this is a problem, you can increase the size of the stack (not recommended) or allocate the variable on the heap using the new operator (recommended).

The other, more likely condition is that your variable needs to "live" beyond the scope of your function. In this case, you'd allocate the variable on the heap so that it can be reached outside the scope of any given function.
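For the large-footprint case, a minimal sketch might look like the following. The function name is made up, and the buffer size is chosen by the caller at run time, which is something a plain stack array could not do either.

```cpp
#include <cassert>
#include <cstddef>

// Sums 0..n-1 using a scratch buffer on the free store.
// With n = 1,000,000 the buffer is about 4 MB of ints on typical platforms,
// well past a 1 MB stack.
long long sumWithHeapBuffer(std::size_t n) {
    int* buffer = new int[n];             // size decided at run time
    for (std::size_t i = 0; i < n; ++i) {
        buffer[i] = static_cast<int>(i);
    }
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += buffer[i];
    }
    delete[] buffer;                      // new[] must be paired with delete[]
    return total;
}
```

Note that if anything between new[] and delete[] threw an exception, the buffer would leak. That is one reason a managing class is usually preferable to a raw new[].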

Matt Davis
A: 

I have to ask - what does the tutorial you are using have to say? Frankly, you are unlikely to learn a language as complex as C++ by asking questions like this on SO. If you are not using a tutorial, an excellent one is Accelerated C++ by Koenig & Moo.

anon
Please refer to the FAQ. See the first section (No question is too trivial or too "newbie") and the "Be nice." section.
nbolton
+6  A: 

There is an important difference between the two.

Everything not allocated with new is placed on the stack, exactly like value types in C#. Everything allocated with new is allocated on the heap, and a pointer to it is returned, exactly like reference types in C#.

Anything allocated on the stack has to have a constant size, determined at compile-time (the compiler has to set the stack pointer correctly, or if the object is a member of another class, it has to adjust the size of that other class). That's why arrays in C# are reference types. They have to be, because with reference types, we can decide at runtime how much memory to ask for. And the same applies here. Only arrays with constant size (a size that can be determined at compile-time) can be allocated with automatic storage duration (on the stack). Dynamically sized arrays have to be allocated on the heap, by calling new.

(And that's where any similarity to C# stops)

Now, anything allocated on the stack has "automatic" storage duration (you can actually declare a variable as auto, but this is the default if no other storage type is specified so the keyword isn't really used in practice, but this is where it comes from)

Automatic storage duration means exactly what it sounds like, the duration of the variable is handled automatically. By contrast, anything allocated on the heap has to be manually deleted by you. Here's an example:

void foo() {
  bar b;
  bar* b2 = new bar();
}

This function creates three values worth considering:

On line 1, it declares a variable b of type bar on the stack (automatic duration).

On line 2, it declares a bar pointer b2 on the stack (automatic duration), and calls new, allocating a bar object on the heap. (dynamic duration)

When the function returns, the following will happen: First, b2 goes out of scope (order of destruction is always opposite of order of construction). But b2 is just a pointer, so nothing happens, the memory it occupies is simply freed. And importantly, the memory it points to (the bar instance on the heap) is NOT touched. Only the pointer is freed, because only the pointer had automatic duration. Second, b goes out of scope, so since it has automatic duration, its destructor is called, and the memory is freed.

And the bar instance on the heap? It's probably still there. No one bothered to delete it, so we've leaked memory.
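Here is a small sketch that makes the sequence visible. It differs from the function above in two ways, both assumptions for the sake of illustration: bar records its construction and destruction in a global string, and foo returns the pointer so the caller can delete it rather than leak it.

```cpp
#include <cassert>
#include <string>

std::string trace;   // illustration only: records "+x" on construction, "-x" on destruction

struct bar {
    char id;
    explicit bar(char c) : id(c) { trace += '+'; trace += id; }
    ~bar() { trace += '-'; trace += id; }
};

// Unlike the original foo, this one returns b2 so the caller can clean up.
bar* foo() {
    bar b('a');               // automatic duration
    bar* b2 = new bar('h');   // the pointer is automatic, the object is dynamic
    return b2;
}                             // b is destroyed here; the 'h' object is not
```

When foo returns, the trace holds the construction of both objects but only the destruction of b. The heap object's destructor runs only when someone calls delete on the returned pointer.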

From this example, we can see that anything with automatic duration is guaranteed to have its destructor called when it goes out of scope. That's useful. But anything allocated on the heap lasts as long as we need it to, and can be dynamically sized, as in the case of arrays. That is also useful. We can use that to manage our memory allocations. What if the bar class allocated some memory on the heap in its constructor, and deleted that memory in its destructor? Then we could get the best of both worlds: safe memory allocations that are guaranteed to be freed again, but without the limitations of forcing everything to be on the stack.

And that is pretty much exactly how most C++ code works. Look at the standard library's std::vector for example. That is typically allocated on the stack, but can be dynamically sized and resized. And it does this by internally allocating memory on the heap as necessary. The user of the class never sees this, so there's no chance of leaking memory, or forgetting to clean up what you allocated.
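A stripped-down sketch of the idea, assuming a hypothetical IntBuffer class (std::vector does far more than this; the live counter is only there so the cleanup can be checked):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: allocates in the constructor, frees in the destructor.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n)
        : size_(n), data_(new int[n]()) { ++liveBuffers; }   // zero-initialized
    ~IntBuffer() { delete[] data_; --liveBuffers; }

    // Copying is disabled so two objects can never delete the same memory.
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

    static int liveBuffers;   // illustration aid only
private:
    std::size_t size_;
    int* data_;
};
int IntBuffer::liveBuffers = 0;

// The caller uses a stack object; the heap allocation is an internal detail.
int fillAndRead(std::size_t n) {
    IntBuffer buf(n);         // n must be at least 1 in this sketch
    buf[n - 1] = 42;
    return buf[n - 1];
}                             // buf's destructor releases the heap memory
```

The calling code never types new or delete, yet the storage is dynamically sized and is always released.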

This principle is called RAII (Resource Acquisition is Initialization), and it can be extended to any resource that must be acquired and released. (network sockets, files, database connections, synchronization locks). All of them can be acquired in the constructor, and released in the destructor, so you're guaranteed that all resources you acquire will get freed again.
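The same pattern for a non-memory resource could be sketched like this. Connection and openConnections are hypothetical stand-ins for a real socket or database handle; the point is only that the release happens even when an exception leaves the function early.

```cpp
#include <cassert>
#include <stdexcept>

int openConnections = 0;   // hypothetical stand-in for a real resource count

// Acquires in the constructor, releases in the destructor.
class Connection {
public:
    Connection() { ++openConnections; }
    ~Connection() { --openConnections; }
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;
};

void doWork(bool fail) {
    Connection c;                                  // acquired here
    if (fail) {
        throw std::runtime_error("work failed");   // c is still released
    }
}                                                  // released here on the normal path

// Returns true if doWork threw; either way no connection is left open.
bool workThrows() {
    try {
        doWork(true);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```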

As a general rule, never use new/delete directly from your high level code. Always wrap it in a class that can manage the memory for you, and which will ensure it gets freed again. (Yes, there may be exceptions to this rule. In particular, smart pointers require you to call new directly, and pass the pointer to its constructor, which then takes over and ensures delete is called correctly. But this is still a very important rule of thumb)

jalf
+1 Very detailed; great example
Daniel LeCheminant
"Everything not allocated with new is placed on the stack" Not in the systems I've worked on... usually initialized (and uninitialized) global (static) data are placed in their own segments. For example, .data, .bss, etc... linker segments. Pedantic, I know...
Dan
Of course, you're right. I wasn't really thinking about static data. My bad, of course. :)
jalf
A: 

If you are writing in C++ you are probably writing for performance. Using new and the free store is much slower than using the stack (especially when using threads) so only use it when you need it.

As others have said, you need new when your object needs to live outside the function or object scope, the object is really large or when you don't know the size of an array at compile time.

Also, try to avoid ever using delete. Wrap your new into a smart pointer instead. Let the smart pointer call delete for you.

There are some cases where a smart pointer isn't smart. Never store std::auto_ptr<> inside an STL container. It will delete the pointer too soon because of copy operations inside the container. Another case is when you have a really large STL container of pointers to objects. boost::shared_ptr<> will have a ton of speed overhead as it bumps the reference counts up and down. The better way to go in that case is to put the STL container into another object and give that object a destructor that will call delete on every pointer in the container.
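A rough sketch of that last suggestion, with hypothetical Item and ItemList types (the live counter is only for checking the cleanup):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type; the counter makes destruction observable.
struct Item {
    static int live;
    Item() { ++live; }
    ~Item() { --live; }
};
int Item::live = 0;

// Owns every pointer stored in it and deletes them all in its destructor.
class ItemList {
public:
    ItemList() {}
    ~ItemList() {
        for (std::size_t i = 0; i < items_.size(); ++i) {
            delete items_[i];
        }
    }
    // Copying is disabled so the pointers are never deleted twice.
    ItemList(const ItemList&) = delete;
    ItemList& operator=(const ItemList&) = delete;

    // Takes ownership of item. Sketch only: if push_back threw, item would leak.
    void add(Item* item) { items_.push_back(item); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<Item*> items_;
};

int buildList(int n) {
    ItemList list;
    for (int i = 0; i < n; ++i) {
        list.add(new Item());
    }
    return Item::live;        // all n items are alive here
}                             // list's destructor deletes every item
```

There is no per-element reference counting, so copies inside the vector are plain pointer copies, and the cleanup still happens in exactly one place.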

Zan Lynx