views:

533

answers:

10

Hi, I am reading book called "C++ coding standard" By Herb Sutter, Andrei Alexandrescu and in chapter 42 of this book is an example:(chapter is short so I'm taking the liberty and pasting part of it)

Consider:

 class Socket {
 public:
   // … constructor that opens handle_, destructor that closes handle_, etc. …
   int GetHandle() const {return handle_;} // avoid this - (1) <-why this is bad code?
                                           // and why there is a comment to avoid such code??
 private:
   int handle_; // perhaps an OS resource handle
 };

Data hiding is a powerful abstraction and modularity device (see Items 11 and 41). But hiding data and then giving away handles to it is self-defeating, just like locking your house and leaving the keys in the lock. This is because:

Clients now have two ways to implement functionality: They can use your class's abstraction (Socket) or directly manipulate the implementation that your class relies on (the socket's C-style handle). In the latter case, the object is unaware of significant changes to the resource it thinks it owns. Now the class cannot reliably enrich or embellish functionality (e.g., proxying, logging, collecting statistics) because clients can bypass the embellished, controlled implementationand any of the invariants it thinks it's adding, which makes correct error handling next to impossible (see Item 70).

The class cannot change the underlying implementation of its abstraction because clients depend on it: If Socket is later upgraded to support a different protocol with a different set of low-level primitives, calling code that fetches the underlying handle_ and manipulates it incorrectly will be silently broken.

The class cannot enforce its invariants because calling code can alter state unbeknownst to the class: For example, someone could close the handle being used by a Socket object without going through a Socket member function, thus rendering the object invalid.

Client code can store the handles that your class returns, and attempt to use them after your class's code has invalidated them.

this is a summary from this book:

Don't volunteer too much: Avoid returning handles to internal data managed by your class, so clients won't uncontrollably modify state that your object thinks it owns.

Basically what I'm asking for is:

  1. Why line marked by me as (1) is listed as an example of bad code (I always thought that returning pointers or reference is a bad idea but returning by value is OK. Here they're saying that returning by value is bad idea too?)

  2. Is it possible that there is '&' missing and what they really mean is to not return internal data by reference or pointers?

Thank you.

+2  A: 

1) They're talking about returning a handle to a socket. In many cases, and int would be okay (like the size of an array or something), but in this case that int can be used to call lower-level C functions that will modify the socket without your class's knowledge. Anything that allows the underlying representation of your class to change without its knowledge is bad design, as the chapter states.

2) I doubt they're missing a ref, for the reasons stated above

Mongoose
+5  A: 

The point isn't that you're returning by value, which is fine, the point is that you're returning a resource handle.

Instead, your class should organize methods that access that resource and provide IO surrounding that resource.

If the resource is a file, for example, your class should have a write() and a read() method that reads and writes to/from the file.

Randolpho
+27  A: 

I think what you're missing is that a handle — even though it's represented by an int in the type system — is a reference to something. This isn't returning some informational value — it's returning the object's internal reference to a system resource. The class should manage this handle itself, and the handle should be influenced by the external world only through the class interface.

Chuck
In the words of My Cousin Vinny, "this is a lucid, intelligent, and well thought-out," explanation. +1
Andres Jaan Tack
A: 

The point made by the example is that the handle to a socket is being returned albeit by value as you point out. Once the caller has the handle he can use it to make system calls on his own without going through the abstraction layer provided by your class.

William Bell
+8  A: 

The problem is not with the low level details (the code is perfectly good C++ that way).

The issue is that you are breaking your abstraction. What if in the future, instead of having an int handle, you need to change that to some sort of pointer. You would not be able to make that change without breaking any client that uses your class.

R Samuel Klatchko
Furthermore, by returning some kind of handle to the caller, the caller could do something unexpected like close the underlying socket. The `Socket` class would be unaware that this had happened, and would then try to access a handle that had already been closed.
Greg Hewgill
Samuel, although that *is* a potential problem with the code, it's not the problem that Sutter and Alexandrescu were using that code to illustrate.
Rob Kennedy
+1  A: 

You are correct when you say returning references and pointers to private members are bad practice, but here it is also bad to return the value because the value has internal significance. A handle can be thought of as an address to an object, an address of an object managed deep down inside the operating system.

Allowing external classes to access this handle is going to be bad news. Imagine it is a file handle. An external class can now close your file (knowing its handle), and your wrapping class will know nothing about it. The internals of your class are now in an invalid state.

DanDan
+4  A: 

Note the declaration of handle_:

  int handle_; // perhaps an OS resource handle

Even though you're returning an int by value from C++'s point of view, from the OS's point of view that handle is a "reference" to some OS resource.

Laurence Gonsalves
A: 

The code should reveal information that is not necessary for who uses the class, and it should make the class interface more independent from the actual implementation.
Requiring a resource handler is usually a way to write less code, and makes the class more tied to the implementation, rather than make it dependent from the abstraction.

kiamlaluno
A: 

Here is background on handles in general:

http://www.anvir.com/handle.htm

Handles are opaque references to resources (i.e. memory location) and only the subsystem that gave you the handle knows how the handle is associated with a physical pointer. It is neither value nor pointer nor reference, it's just an alias for a resource that you use with the API that knows what to with it.

So what the book is trying to say is that when you have a class that manages some resource you are supposedly adding a layer of abstraction. However if you give away a handle to a resource, you really don't abstract away the implementation, since your abstraction can be easily circumvented.

A requirement to have handles and functions that take handles as parameters to perform a certain task is mainly dictated by procedural languages like C, which do not have objects and hence cannot hide a certain resource inside a class and only provide you with methods to operate on that resource.

An example of this could be Microsoft MFC C++ library where CWnd class has an accessor that return the window's HWND (i.e. handle):

http://msdn.microsoft.com/en-us/library/d64ehwhz%28VS.71%29.aspx

Igor Zevaka
A: 

"Shared mutable state."

By passing the handle back, the API creates shared mutable state, which is something that should be avoided whenever possible.

Consider an alternate API, that exposed a Close() method rather than GetHandle(). By never exposing the handle directly, the class then guarantees that it will be the only one to close the handle. The handle becomes private state of the class.

This might seem like it would bloat your API, but there's a real advantage - the class will then know that any modifications made to the state have gone through it. In the existing implementation, it has to check the handle's status every time it does anything, as it doesn't own that state. By hiding the handle, the class gets rid of that error case entirely.

kyoryu