Let's say I know a guy who is new to C++. He does not pass around pointers (rightly so) but he refuses to pass by reference. He uses pass by value always. Reason being that he feels that "passing objects by reference is a sign of a broken design".

The program is a small graphics program and most of the passing in question is mathematical Vector (3-tuple) objects. There are some big controller objects but nothing more complicated than that.

I'm finding it hard to find a killer argument against only using the stack.

I would argue that pass by value is fine for small objects such as vectors but even then there is a lot of unnecessary copying occurring in the code. Passing large objects by value is obviously wasteful and most likely not what you want functionally.

On the pro side, I believe the stack is faster at allocating/deallocating memory and has a constant allocation time.

The only major argument I can think of is that the stack could possibly overflow, but I'm guessing that it is improbable that this will occur? Are there any other arguments against using only the stack/pass by value as opposed to pass by reference?

+3  A: 

I would say that not using pointers in C is a sign of a newbie programmer.

It sounds like your friend is scared of pointers.

Remember, C++ pointers were actually inherited from the C language, and C was developed when computers were much less powerful. Nevertheless, speed and efficiency continue to be vital to this day.

So, why use pointers? They allow the developer to optimize a program to run faster or use less memory than it would otherwise! Referring to the memory location of the data is much more efficient than copying all the data around.

Pointers usually are a concept that is difficult to grasp for those beginning to program, because all the experiments done involve small arrays, maybe a few structs, but basically they consist of working with a couple of megabytes (if you're lucky) when you have 1GB of memory lying around the house. In this scene, a couple of MB are nothing and it usually is too little to have a significant impact on the performance of your program.

So let's exaggerate that a little bit. Think of a char array with 2147483648 elements - 2GB of data - that you need to pass to a function that will write all the data to the disk. Now, what technique do you think is going to be more efficient/faster?

  • Pass by value, which is going to have to re-copy those 2GB of data to another location in memory before the program can write the data to the disk, or
  • Pass by reference, which will just refer to that memory location.

What happens when you just don't have 4GB of RAM? Will you spend $ and buy chips of RAM just because you are afraid of using pointers?

Re-copying the data in memory sounds a bit redundant when you don't have to, and it's a waste of computer resources.
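To make the difference concrete, here is a minimal sketch (the function names are made up for illustration, and a `std::vector<char>` stands in for the char array). Each function reports whether it is looking at the caller's buffer or at a fresh copy of it:

```cpp
#include <vector>

// Hypothetical helper: takes the buffer BY VALUE, so the whole vector
// (including its heap storage) is copied before the body runs.
bool sees_callers_buffer_by_value(std::vector<char> data, const char* callers_buffer) {
    return data.data() == callers_buffer;
}

// Hypothetical helper: takes the buffer BY CONST REFERENCE, so no copy is made
// and the function works directly on the caller's storage.
bool sees_callers_buffer_by_ref(const std::vector<char>& data, const char* callers_buffer) {
    return data.data() == callers_buffer;
}
```

Called as `sees_callers_buffer_by_value(big, big.data())`, the first should report a different buffer (a copy was made), while the by-reference version should report the same one.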

Anyway, be patient with your friend. If he would like to become a serious/professional programmer at some point in his life he will eventually have to take the time to really understand pointers.

Good Luck.

karlphillip
No, on the contrary: “Not using pointers” is a sign of a *really*, *really* **professional** C++ programmer. Most bad C++ code uses way too many pointers. Good C++ hardly uses them at all.
Konrad Rudolph
Where you say "Remember, C++ pointers were actually inherited by the C language" you probably mean to say "from the C language"
Amoss
Pointers often aren't necessary. A good C++ programmer generally avoids pointers. Not always and in every situation, but they're rarely necessary in well written C++.
jalf
@Konrad: You've mixed up "don't use raw pointers for memory management" with "don't use raw pointers at all". Which is total baloney. A raw pointer is applicable anywhere an array index is, because a raw pointer IS an array index.
Ben Voigt
@Ben: No. Raw pointers and array indexes are totally different concepts. Their implementation just happens to be the same **under specific circumstances**. Of course I actually *do* use pointers (but only very occasionally, and in fact only so rarely that it’s easier to say that I don’t use them at all!) but when I do, they’re well hidden inside abstractions. – The same is true for manual memory management but I really was talking about pointers, not manual memory management.
Konrad Rudolph
@Ben: In fact, I’ve just grepped the project I’m currently working on for occurrences of `*` – in the whole project of several thousand lines of code, **I don’t declare a single pointer** (except for `char const*` to handle the arguments passed to `main`). In fact, the library doesn’t even need/use smart pointers at the moment, due to the peculiar memory ownership model that is used. I’ll grant that this is a rather special case and that most projects probably need some other kind of object ownership, which implies using smart pointers.
Konrad Rudolph
You do know that `std::vector::iterator` is usually a typedef for a pointer, right? You may be better off searching for `->` than `*` if you want to identify pointers.
Ben Voigt
+6  A: 

There are all sorts of things that cannot be done without using references - starting with a copy constructor. References (or pointers) are fundamental and whether he likes it or not, he is using references. (One advantage, or maybe disadvantage, of references is that you do not have to alter the code, in general, to pass a (const) reference.) And there is no reason not to use references most of the time.
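As a small illustration of the copy-constructor point (a sketch only; `Vec3` and `sum` are hypothetical names standing in for the 3-tuple type from the question):

```cpp
// Hypothetical 3-tuple type, standing in for the question's Vector.
struct Vec3 {
    double x, y, z;

    Vec3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}

    // The copy constructor has to take its argument by reference.
    // A signature like Vec3(Vec3 other) is ill-formed: passing `other`
    // by value would itself require the copy constructor.
    Vec3(const Vec3& other) : x(other.x), y(other.y), z(other.z) {}
};

// Pass by value: every call runs the copy constructor above,
// so a reference is involved even though none appears in this signature.
double sum(Vec3 v) { return v.x + v.y + v.z; }
```

So even code that "only passes by value" relies on a reference parameter every time an object is copied.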

And yes, passing by value is OK for smallish objects without requirements for dynamic allocation, but it is still silly to hobble oneself by saying "no references" without concrete measurements that the so-called overhead is (a) perceptible and (b) significant. "Premature optimization is the root of all evil" [1].

[1] Various attributions, including C. A. R. Hoare (although apparently he disclaims it).

Jonathan Leffler
+2  A: 

As already mentioned, the big difference between a reference and a pointer is that a pointer can be null. If a class requires data, a reference declaration will make it required. Adding const will make it 'read only' if that is what is desired by the caller.

The pass-by-value 'flaw' mentioned is simply not true. Passing everything by value will completely change the performance of an application. It is not so bad when primitive types (e.g. int, double, etc.) are passed by value, but when a class instance is passed by value, temporary objects are created, which requires constructors and later on destructors to be called on the class and on all of the member variables in the class. This is exacerbated when large class hierarchies are used, because parent class constructors/destructors must be called as well.
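A rough way to see those extra constructor calls is to count them. This is only a sketch with hypothetical `Base`/`Derived` types whose copy constructors bump a counter:

```cpp
// Hypothetical two-level hierarchy that counts copy-constructor calls.
struct Base {
    static int copies;
    Base() = default;
    Base(const Base&) { ++copies; }
};
int Base::copies = 0;

struct Derived : Base {
    static int copies;
    Derived() = default;
    // Copying a Derived also runs the Base copy constructor.
    Derived(const Derived& other) : Base(other) { ++copies; }
};
int Derived::copies = 0;

// Pass by value: one Derived copy plus one Base copy per call.
void by_value(Derived) {}

// Pass by const reference: no copy constructors run at all.
void by_const_ref(const Derived&) {}
```

Calling `by_const_ref(d)` with a named object `d` leaves both counters untouched, while `by_value(d)` increments each of them once.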

Also, just because the vector is passed by value does not mean that it only uses stack memory. Heap memory may be used for each element as it is created in the temporary vector that is passed to the method/function. The vector itself may also have to reallocate on the heap if it reaches its capacity.

If pass by value is being used so that the caller's values are not modified, then just use a const reference.

skimobear
+14  A: 

Subtyping-polymorphism is a case where passing by value wouldn't work because you would slice the derived class to its base class. Maybe to some, using subtyping-polymorphism is bad design?
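A minimal sketch of the slicing problem, using hypothetical `Shape`/`Circle` types:

```cpp
#include <string>

// Hypothetical base class with one virtual function.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// By value: only the Shape part of the argument is copied (slicing),
// so the call below always dispatches to Shape::name.
std::string name_by_value(Shape s) { return s.name(); }

// By const reference: the dynamic type is preserved.
std::string name_by_ref(const Shape& s) { return s.name(); }
```

Given a `Circle c`, `name_by_value(c)` yields "shape" while `name_by_ref(c)` yields "circle".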

stefaanv
I guess polymorphism isn't necessary if you just use templates and implicit interfaces everywhere.
Jon Purdy
@Jon Purdy: Use of templates is also a kind of polymorphism - It's called Parametric Polymorphism. I think what @Stefaanv is referring to as polymorphism in his post is Subtyping Polymorphism, and I feel he should reword his answer accordingly.
one-zero-zero-one
@Jon Purdy, @Stefaanv: See 1.1 and 1.2 in this Wikipedia article: http://en.wikipedia.org/wiki/Type_polymorphism
one-zero-zero-one
(Runtime) polymorphism over numeric types would usually be a bad design.
Potatoswatter
@Potatoswatter why? It makes sense for me to write a function that can add any two numeric values. Floats, ints, complex, fractions, ...
wilhelmtell
@wilhelmtell: Usually templates are better for numerics.
Potatoswatter
@wilhelmtell: Unless I misunderstand your example, that sounds like a great use for a function template (that is, for compile-time polymorphism).
James McNellis
@Cheryl: from wikipedia: Subtype polymorphism, almost universally called just polymorphism in the context of object-oriented programming, so I guess I'm still too much in OO. I'll reword.
stefaanv
@Stefaanv: C++ being a multiparadigm language (and one that supports both major types of polymorphism), the use of correct terminology becomes very important.
one-zero-zero-one
+5  A: 

Your friend's problem is not his idea as much as his religion. Given any function, always consider the pros and cons of passing by value, reference, const reference, pointer or smart pointer. Then decide.

The only sign of broken design I see here is your friend's blind religion.

That said, there are a few signatures that don't bring much to the table. Taking a const by value might be silly, because if you promise not to change the object then you might as well not make your own copy of it. Unless it's a primitive, of course, in which case the compiler can be smart enough to take a reference still. Or, sometimes it's clumsy to take a pointer to a pointer as argument. This adds complexity; instead, you might be able to get away with it by taking a reference to a pointer, and get the same effect.
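For the pointer-to-pointer case, the two styles look roughly like this (a sketch; both function names are invented for the example):

```cpp
// Re-seats the caller's pointer to whichever int is larger.
// Pointer-to-pointer version: the caller must write &p, the callee must write *p.
void reseat_via_pointer(int** p, int* a, int* b) {
    *p = (*a > *b) ? a : b;
}

// Reference-to-pointer version: same effect, no extra level of indirection
// in the syntax on either side of the call.
void reseat_via_reference(int*& p, int* a, int* b) {
    p = (*a > *b) ? a : b;
}
```

The first is called as `reseat_via_pointer(&p, &x, &y)`, the second as `reseat_via_reference(p, &x, &y)`; both leave `p` pointing at the larger value.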

But don't take these guidelines as set in stone; always consider your options because there is no formal proof that eliminates any alternative's usefulness.

  1. If you need to change the argument for your own needs, but don't want to affect the client, then take the argument by value.
  2. If you want to provide a service to the client, and the client is not closely related to the service, then consider taking an argument by reference.
  3. If the client is closely related to the service then consider taking no arguments but write a member function.
  4. If you wish to write a service function for a family of clients that are closely related to the service but very distinct from each other then consider taking a reference argument, and perhaps make the function a friend of the clients that need this friendship.
  5. If you don't need to change the client at all then consider taking a const-reference.
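As one possible reading of guidelines 1, 2 and 5, here is a sketch with three hypothetical functions over a `std::vector<int>`:

```cpp
#include <algorithm>
#include <vector>

// Guideline 1: the function needs its own copy to modify,
// so it takes the argument by value and leaves the caller's vector alone.
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Guideline 2: the function is meant to change the caller's object,
// so it takes a non-const reference.
void append_zero(std::vector<int>& v) {
    v.push_back(0);
}

// Guideline 5: the function only reads, so it takes a const reference.
// Precondition (an assumption of this sketch): v is not empty.
int largest(const std::vector<int>& v) {
    return *std::max_element(v.begin(), v.end());
}
```

None of these choices is mandatory; the point is only that the signature tells the caller what will happen to the argument.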
wilhelmtell
Using top-level const on a parameter is no more silly than using const on local variables -- it expresses your intent to the compiler which lets it generate better warnings and optimizations.
Ben Voigt
I'm not saying it's silly to specify const on a parameter value. I only think that sometimes it's silly to copy a parameter if you never intend to change it. Granted, maybe the mere call for a copy has some necessary effect, and maybe there are other circumstances this makes sense. All I'm saying is think about it.
wilhelmtell
+3  A: 

First thing is, the stack rarely overflows outside this website, except in the recursion case.

As for his reasoning, I think he is wrong because he overgeneralizes, but what he has actually done might be correct... or not?

For example, the Windows Forms library uses a Rectangle struct that has 4 members, Apple's QuartzCore has the CGRect struct, and those structs are always passed by value. I think we can compare that to a Vector with 3 floating-point members.

However, as I have not seen the code, I feel I should not judge what he has done, though I have a feeling he may have done the right thing despite his overgeneralized idea.

tia
+5  A: 

Reason being that he feels that "passing objects by reference is a sign of a broken design".

Although this is wrong in C++ for purely technical reasons, always using pass-by-value is a good enough approximation for beginners – it’s certainly much better than passing everything by pointers (or perhaps even than passing everything by reference). It will make some code inefficient but, hey! As long as this doesn’t bother your friend, don’t be unduly disturbed by this practice. Just remind him that someday he might want to reconsider.

On the other hand, this:

There are some big controller objects but nothing more complicated than that.

is a problem. Your friend is talking about broken design, and then all the code uses is a few 3D vectors and large control structures? That is a broken design. Good code achieves modularity through the use of data structures. It doesn’t seem as though this were the case.

… And once you use such data structures, code without pass-by-reference may indeed become quite inefficient.

Konrad Rudolph
+2  A: 

The answers that I've seen so far have all focused on performance: cases where pass-by-reference is faster than pass-by-value. You may have more success in your argument if you focus on cases that are impossible with pass-by-value.

Small tuples or vectors are a very simple type of data-structure. More complex data-structures share information, and that sharing can't be represented directly as values. You either need to use references/pointers or something that simulates them such as arrays and indices.

Lots of problems boil down to data that forms a Graph, or a Directed-Graph. In both cases you have a mixture of edges and nodes that need to be stored within the data-structure. Now you have the problem that the same data needs to be in multiple places. If you avoid references then firstly the data needs to be duplicated, and then every change needs to be carefully replicated in each of the other copies.
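Here is a sketch of the "arrays and indices" variant mentioned above, with hypothetical `Node`/`Graph` types. Each node is stored once, and edges refer to it by index, so a change to a node is seen through every edge that leads to it:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical graph: nodes live in one array, edges are indices into it.
struct Node {
    int value;
    std::vector<std::size_t> neighbours;  // indices into Graph::nodes
};

struct Graph {
    std::vector<Node> nodes;
};

// Sums the values of node i's neighbours. The graph is taken by const
// reference, so the whole structure is not copied for each query.
int neighbour_sum(const Graph& g, std::size_t i) {
    int total = 0;
    for (std::size_t n : g.nodes[i].neighbours) {
        total += g.nodes[n].value;
    }
    return total;
}
```

If two nodes both list node 2 as a neighbour, updating `g.nodes[2].value` once is enough for both of them to see the new value; with duplicated by-value copies, each copy would need its own update.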

Your friend's argument boils down to saying: tackling any problem complex enough to be represented by a Graph is a bad design...

Amoss
+1  A: 

The only major argument I can think of is that the stack could possibly overflow, but I'm guessing that it is improbable that this will occur? Are there any other arguments against using only the stack/pass by value as opposed to pass by reference?

Well, gosh, where to start...

  1. As you mention, "there is a lot of unnecessary copying occurring in the code". Let's say you've got a loop where you call a function on these objects. Using a pointer instead of duplicating the objects can accelerate execution by one or more orders of magnitude.

  2. You can't pass variable-sized data structures, arrays, etc. around on the stack. You have to dynamically allocate them and pass a pointer or reference to the beginning. If your friend hasn't run into this, then yes, he's "new to C++."

  3. As you mention, the program in question is simple and mostly uses quite small objects like graphics 3-tuples, which if the elements are doubles would be 24 bytes apiece. But in graphics, it's common to deal with 4x4 arrays, which handle both rotation and translation. Those would be 128 bytes apiece, so a program that had to deal with those could be roughly five times slower per function call with pass-by-value due to the increased copying. With pass-by-reference, passing a 3-tuple or a 4x4 array in a 32-bit executable would just involve duplicating a single 4-byte pointer.

  4. On register-rich CPU architectures like ARM, PowerPC, 64-bit x86, 680x0 - but not 32-bit x86 - pointers (and references, which are secretly pointers wearing fancy syntactical clothing) are commonly passed or returned in a register, which is really freaking fast compared to the memory access involved in a stack operation.

  5. You mention the improbability of running out of stack space. And yes, that's so on a small program one might write for a class assignment. But a couple of months ago, I was debugging commercial code that was probably 80 function calls below main(). If they'd used pass-by-value instead of pass-by-reference, the stack would have been ginormous. And lest your friend think this was a "broken design", this was actually a WebKit-based browser implemented on Linux using GTK+, all of which is very state-of-the-art, and the function call depth is normal for professional code.

  6. Some executable architectures limit the size of an individual stack frame, so even though you might not run out of stack space per se, you could exceed that and wind up with perfectly valid C++ code that wouldn't build on such a platform.
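The size arithmetic from point 3 can be checked with a short sketch. `Vec3` and `Mat4` are hypothetical stand-ins for a 3-tuple and a 4x4 array of doubles, and the byte counts in the comments assume 8-byte doubles with no padding, which is typical but not guaranteed:

```cpp
#include <cstddef>

// Hypothetical graphics types built from doubles.
struct Vec3 { double x, y, z; };   // typically 3 * 8 = 24 bytes
struct Mat4 { double m[4][4]; };   // typically 16 * 8 = 128 bytes

// Bytes copied per argument when passing by value ...
constexpr std::size_t vec3_bytes = sizeof(Vec3);
constexpr std::size_t mat4_bytes = sizeof(Mat4);
// ... versus the size of the pointer that a reference usually boils down to.
constexpr std::size_t ref_bytes = sizeof(const Mat4*);

// Same computation, two parameter-passing styles.
double trace_by_value(Mat4 a) {
    return a.m[0][0] + a.m[1][1] + a.m[2][2] + a.m[3][3];
}

double trace_by_ref(const Mat4& a) {
    return a.m[0][0] + a.m[1][1] + a.m[2][2] + a.m[3][3];
}
```

Whether the extra copying is actually measurable depends on the compiler and on inlining, so treat the numbers as an upper bound on what gets copied, not as a benchmark.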

I could go on and on.

If your friend is interested in graphics, he should take a look at some of the common APIs used in graphics: OpenGL and XWindows on Linux, Quartz on Mac OS X, Direct X on Windows. And he should look at the internals of large C/C++ systems like the WebKit or Gecko HTML rendering engines, or any of the Mozilla browsers, or the GTK+ or Qt GUI toolkits. They all pass anything much larger than a single integer or float by reference, and often fill in results by reference rather than as a function return value.

Nobody with any serious real world C/C++ chops - and I mean nobody - passes data structures by value. There's a reason for this: it's just flipping inefficient and problem-prone.

Bob Murphy
"nobody passes data structures by value" -> [Want Speed? Pass by Value.](http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/)
FredOverflow
Konrad Rudolph
excellent summary, thanks
davka
+4  A: 

I think there is a huge misunderstanding in the question itself.

There is no relationship between stack or heap allocated objects on the one hand and pass by value or reference or pointer on the other.

Stack vs Heap allocation

Always prefer the stack when possible: the object's lifetime is then managed for you, which is much easier to deal with.

It might not be possible in a couple of situations though:

  • Virtual construction (think of a Factory)
  • Shared Ownership (though you should always try to avoid it)

And I might miss some, but in this case you should use SBRM (Scope-Bound Resource Management) to leverage the stack lifetime management abilities, for example by using smart pointers.
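A sketch of the factory case with a smart pointer. The `Renderer` hierarchy and `make_renderer` are hypothetical, and `std::unique_ptr` is used here as one possible smart pointer (it postdates this discussion, being part of C++11, with `std::make_unique` arriving in C++14):

```cpp
#include <memory>
#include <string>

// Hypothetical hierarchy whose concrete type is only known at run time.
struct Renderer {
    virtual ~Renderer() = default;
    virtual std::string name() const = 0;
};

struct SoftwareRenderer : Renderer {
    std::string name() const override { return "software"; }
};

struct HardwareRenderer : Renderer {
    std::string name() const override { return "hardware"; }
};

// Factory: the object has to live on the heap, but the returned smart pointer
// is an ordinary stack object, so the renderer is destroyed automatically
// when the pointer goes out of scope.
std::unique_ptr<Renderer> make_renderer(bool use_hardware) {
    if (use_hardware) {
        return std::make_unique<HardwareRenderer>();
    }
    return std::make_unique<SoftwareRenderer>();
}
```

The caller writes something like `auto r = make_renderer(true);` and never calls `delete` itself.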

Pass by: value, reference, pointer

First of all, there is a difference of semantics:

  • value, const reference: the passed object will not be modified by the method
  • reference: the passed object might be modified by the method
  • pointer/const pointer: same as reference (for the behavior), but might be null

Note that some languages (the functional kind like Haskell) do not offer reference/pointer by default. The values are immutable once created. Apart from some work-arounds for dealing with the exterior environment, they are not that restricted by this use and it somehow makes debugging easier.

Your friend should learn that there is absolutely nothing wrong with pass-by-reference or pass-by-pointer: for example, think of swap, which cannot be implemented with pass-by-value.
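The swap case in a few lines (both function names are made up; the standard library's `std::swap` also takes its arguments by reference):

```cpp
// By value: only the local copies are exchanged, the caller sees nothing.
void swap_by_value(int a, int b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// By reference: the caller's variables are exchanged.
void swap_by_ref(int& a, int& b) {
    int tmp = a;
    a = b;
    b = tmp;
}
```

With `int a = 1, b = 2;`, calling `swap_by_value(a, b)` leaves them unchanged, while `swap_by_ref(a, b)` leaves `a == 2` and `b == 1`.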

Finally, Polymorphism does not allow pass-by-value semantics.

Now, let's speak about performance.

It's usually well accepted that built-ins should be passed by value (to avoid an indirection) and big user-defined classes should be passed by reference/pointer (to avoid copying). "Big" in fact generally means that the Copy Constructor is not trivial.

There is however an open question regarding small user-defined classes. Some articles published recently suggest that in some cases pass-by-value might allow better optimization from the compiler, for example, in this case:

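// Placeholder type so the example compiles; bar() stands in for any mutating call.
struct Object { int n = 0; void bar() { ++n; } };

// Takes its argument by value, modifies the copy, and returns it.
Object foo(Object d) { d.bar(); return d; }

int main(int argc, char* argv[])
{
  Object o;
  o = foo(o);  // the result overwrites the object that was passed in
  return 0;
}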

Here a smart compiler is able to determine that o can be modified in place without any copying! (It is necessary that the function definition be visible I think, I don't know if Link-Time Optimization would figure it out)

Therefore, there is only one possibility to the performance issue, like always: measure.

Matthieu M.
Haskell does not have objects, only values.
FredOverflow
@Fred: patched, and I also added the plural form since I had forgotten it. Thanks for watching over me!
Matthieu M.
@Matthieu M. very true there is a blending of different concepts in the question. I could rephrase it as: Consequences of only using pass-by-value in C++
Brian Heylin
+3  A: 

I would argue that pass by value is fine for small objects such as vectors but even then there is a lot of unnecessary copying occurring in the code. Passing large objects by value is obviously wasteful and most likely not what you want functionally.

It's not quite as obvious as you might think. C++ compilers perform copy elision very aggressively, so you can often pass by value without incurring the cost of a copy operation. And in some cases, passing by value might even be faster.
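A sketch of what can and cannot be elided, using a hypothetical `Counted` type that counts its copy and move constructions. The assumption here is a reasonably modern compiler: since C++17 the elision for a temporary passed straight into a by-value parameter is guaranteed, while older standards merely permit it:

```cpp
// Hypothetical type that records how it gets constructed.
struct Counted {
    static int copies;
    static int moves;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Returns a temporary (an rvalue).
Counted make() { return Counted(); }

// Takes its argument by value.
void consume(Counted) {}
```

`consume(make())` should not run the copy constructor at all, because the temporary can be constructed directly in the parameter. `Counted c; consume(c);` does run it once: a named object that the caller still owns has to be copied.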

Before condemning the issue for performance reasons, you should at the very least produce the benchmarks to back it up. And they might be hard to create because the compiler typically eliminates the performance difference.

So the real issue should be one of semantics. How do you want your code to behave? Sometimes, reference semantics are what you want, and then you should pass by reference. If you specifically want/need value semantics then you pass by value.

There is one point in favor of passing by value. It's helpful in achieving a more functional style of code, with fewer side effects and where immutability is the default. That makes a lot of code easier to reason about, and it may make it easier to parallelize the code as well.

But in truth, both have their place. And never using pass-by-reference is definitely a big warning sign.

For the last 6 months or so, I've been experimenting with making pass-by-value the default. If I don't explicitly need reference semantics, then I try to assume that the compiler will perform copy elision for me, so I can pass by value without losing any efficiency.

So far, the compiler hasn't really let me down. I'm sure I'll run into cases where I have to go back and change some calls to passing by reference, but I'll do that when I know that

  • performance is a problem, and
  • the compiler failed to apply copy elision
jalf
@jalf Thanks for the response, I wasn't aware of the copy elision optimization, looks interesting. And yes I have no actual data to back up the copy performance comment, all I have is second hand articles on the subject.
Brian Heylin
@jalf: can you back this up? I’m pretty sure that the compiler (any compiler) will not perform copy elision on arguments except when these are rvalues. The article you linked to implies as much. In theory, compilers could do much more – namely elide copies as long as no non-`const` method is called on the object. But as far as I know, this is *not* done.
Konrad Rudolph
+1, Best answer here. :-)
missingfaktor
A: 

Buy your friend a good C++ book. Passing non-trivial objects by reference is good practice and saves you a lot of unnecessary constructor/destructor calls. This also has nothing to do with allocating on the free store vs. using the stack. You can (or should) pass objects allocated on the program stack by reference without any free store usage. You can also ignore the free store completely, but that throws you back to the old Fortran days, which your friend probably didn't have in mind; otherwise he would pick an ancient F77 compiler for your project, wouldn't he...?

paul_71
I just want to add that C++ supports the oxymoron *const passed-by-reference* parameter type, which can also be used to skip copy constructor and avoid using pointer notation at the same time. Sometimes, you can have your cake and eat it too.
tia
+1  A: 

Wow, there are already 13 answers… I didn't read all in detail but I think this is quite different from the others…

He has a point. The advantage of pass-by-value as a rule is that subroutines cannot subtly modify their arguments. Passing non-const references would indicate that every function has ugly side effects, indicating poor design.

Simply explain to him the difference between vector3 & and vector3 const&, and demonstrate how the latter may be initialized by a constant as in vec_function( vector3(1,2,3) );, but not the former. Pass by const reference is a simple optimization of pass by value.
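In code, the difference might look like this. It is a sketch: `vector3` and `vec_function` follow the names used above, while the constructor, the members and `scale` are assumptions made for the example:

```cpp
// Hypothetical 3-component vector.
struct vector3 {
    double x, y, z;
    vector3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
};

// Const reference: accepts named objects AND temporaries, cannot modify either.
double vec_function(vector3 const& v) {
    return v.x + v.y + v.z;
}

// Non-const reference: announces that the argument will be modified,
// and therefore only accepts named (lvalue) objects.
void scale(vector3& v, double k) {
    v.x *= k;
    v.y *= k;
    v.z *= k;
}

// vec_function( vector3(1,2,3) );   // fine: a temporary binds to const&
// scale( vector3(1,2,3), 2.0 );     // rejected by a conforming compiler
```

So the const-reference signature behaves like pass by value from the caller's point of view, minus the copy.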

Potatoswatter