A: 

Interesting question. It's a problem in C++ to exclusively use references I guess - in Java the references are more flexible and can be null. I can't remember if it's legal C++ to force a null reference:

MyType *pObj = nullptr;
return *pObj;

But I consider this dangerous. Again, in Java I'd throw an exception as this is common there, but I rarely see exceptions used so freely in C++. If I were making a public API for a reusable C++ component and had to return a reference, I guess I'd go the exception route. My real preference is to have the API return a pointer; I consider pointers an integral part of C++.

MidnightGun
No, we can't set a null reference in C++. I believe one of the reasons is that we don't have a GC.
Augusto Radtke
That code snippet is definitely not legal! You're dereferencing null, which means "undefined behaviour" (but probably a crash). Java references should really be called pointers. They can be null and you can change which object they point to, neither of which is true for C++ references.
Sam Stokes
@Sam - actually, this null dereference will tend to crash or not crash depending on a) how MyType is defined, b) which compiler it is invoked on c) what the return type of this function is.
Aaron
Didn't anyone notice I said I wasn't sure if it was legal? I know some compilers (used to) allow it, but I advised against it anyway.
MidnightGun
A: 

What I prefer in situations like this is a throwing "get" and, for circumstances where performance matters or failure is common, a "tryGet" function along the lines of "bool tryGet(type key, value **pp)", whose contract is that if true is returned then *pp is a valid pointer to some object; otherwise *pp is null.
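
A minimal sketch of that pair of accessors, assuming a map-backed container. The class name Registry, the int/std::string types and the put() helper are made up for illustration; only the get/tryGet contract comes from the description above.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical container, used only to illustrate the get/tryGet contract.
class Registry {
public:
    void put(int key, const std::string& value) { items_[key] = value; }

    // Non-throwing accessor. Contract: returns true and sets *pp to a valid
    // object, or returns false and sets *pp to null.
    bool tryGet(int key, std::string** pp) {
        std::map<int, std::string>::iterator it = items_.find(key);
        *pp = (it != items_.end()) ? &it->second : nullptr;
        return *pp != nullptr;
    }

    // Throwing accessor: for callers that treat a missing key as an error.
    std::string& get(int key) {
        std::string* p = nullptr;
        if (!tryGet(key, &p)) throw std::out_of_range("Registry::get: key not found");
        return *p;
    }

private:
    std::map<int, std::string> items_;
};

// Usage sketch.
inline void registryExample() {
    Registry r;
    r.put(1, "one");
    std::string* p = nullptr;
    assert(r.tryGet(1, &p) && *p == "one");
    assert(!r.tryGet(2, &p) && p == nullptr);
}
```

Implementing get() on top of tryGet() keeps the lookup logic in one place, so the two accessors cannot drift apart.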

Torbjörn Gyllebring
+16  A: 

The STL deals with this situation by using iterators. For example, the std::map class has a similar function:

iterator find( const key_type& key );

If the key isn't found, it returns 'end()'. You may want to use this iterator approach, or to use some sort of wrapper for your return value.
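
For illustration, a small sketch of the caller's side of that pattern with std::map. The Table typedef and the lookupOr helper are hypothetical names, not part of the STL.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<int, std::string> Table;

// Hypothetical helper: returns the mapped value, or `fallback` if the key is
// absent. A single find() both tests for presence and locates the element.
inline std::string lookupOr(const Table& table, int key, const std::string& fallback) {
    Table::const_iterator it = table.find(key);
    if (it == table.end()) return fallback;  // "not found" is signalled by end()
    return it->second;
}

// Usage sketch.
inline void lookupExample() {
    Table t;
    t[7] = "seven";
    assert(lookupOr(t, 7, "?") == "seven");
    assert(lookupOr(t, 8, "?") == "?");
}
```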

Martin Cote
Yeah, why not implement your container to match the prototypical STL container interface? Then you bring all the power of the STL to work on your list class.
Scott Langham
What would be the cost of creating an iterator against using the exists() function first?
Augusto Radtke
Depends on how fast it is to search your container: if (exists()) get(value) will require 2 searches. In the STL, creating an iterator is almost free; the compiler will often optimise it away entirely.
gbjbaanb
Under the hood, an iterator might just be an index or pointer into the collection and therefore free to create. The reason it's useful here is that it has mylist.end(), which could be more useful than "null". Also, C++ coders expect iterators to be invalidated by mutators; with a pointer it's less clear.
Steve Jessop
+1  A: 

How about returning a shared_ptr as the result? This can be null if the item wasn't found. It works like a pointer, but it will take care of releasing the object for you.
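
A rough sketch of what that could look like, assuming C++11's std::shared_ptr and assuming the container already stores its values in shared pointers. The Cache class and its put() method are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical container whose values are already owned by shared pointers.
class Cache {
public:
    void put(int key, const std::string& value) {
        items_[key] = std::make_shared<std::string>(value);
    }

    // Returns an empty (null) shared_ptr when the key is absent.
    std::shared_ptr<std::string> get(int key) const {
        std::map<int, std::shared_ptr<std::string> >::const_iterator it = items_.find(key);
        if (it == items_.end()) return std::shared_ptr<std::string>();
        return it->second;  // shares ownership with the container
    }

private:
    std::map<int, std::shared_ptr<std::string> > items_;
};

// Usage sketch.
inline void cacheExample() {
    Cache c;
    c.put(1, "a");
    if (std::shared_ptr<std::string> p = c.get(1)) {
        assert(*p == "a");
    }
    assert(!c.get(2));
}
```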

Scott Langham
+3  A: 

Don't use an exception in such a case. C++ has a nontrivial performance overhead for such exceptions, even if no exception is thrown, and it additionally makes reasoning about the code much harder (cf. exception safety).

Best-practice in C++ is one of the two following ways. Both get used in the STL:

  • As Martin pointed out, return an iterator. Actually, your iterator can well be a typedef for a simple pointer, there's nothing speaking against it; in fact, since this is consistent with the STL, you could even argue that this way is superior to returning a reference.
  • Return a std::pair<bool, yourvalue>. This makes it impossible to modify the value, though, since the pair's copy constructor is called, which doesn't work with reference members.
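
A minimal sketch of the second option, assuming a map of int to std::string; the lookup function is a hypothetical name. Note that the value comes back as a copy and must be default-constructible for the "not found" case.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical lookup returning (found, copy of the value).
inline std::pair<bool, std::string> lookup(const std::map<int, std::string>& m, int key) {
    std::map<int, std::string>::const_iterator it = m.find(key);
    if (it == m.end()) return std::make_pair(false, std::string());
    return std::make_pair(true, it->second);
}

// Usage sketch.
inline void pairExample() {
    std::map<int, std::string> m;
    m[1] = "one";
    std::pair<bool, std::string> hit = lookup(m, 1);
    assert(hit.first && hit.second == "one");
    assert(!lookup(m, 2).first);
}
```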

/EDIT:

This answer has spawned quite some controversy, visible from the comments and not so visible from the many downvotes it got. I've found this rather surprising.

This answer was never meant as the ultimate point of reference. The “correct” answer had already been given by Martin: exceptions reflect the behaviour in this case rather poorly. It's semantically more meaningful to use some other signalling mechanism than exceptions.

Fine. I completely endorse this view. No need to mention it once again. Instead, I wanted to give an additional facet to the answers. While minor speed boosts should never be the first rationale for any decision-making, they can provide further arguments and in some (few) cases, they may even be crucial.

Actually, I've mentioned two facets: performance and exception safety. I believe the latter to be rather uncontroversial. While it's extremely hard to give strong exception guarantees (the strongest, of course, being “nothrow”), I believe it's essential: any code that is guaranteed to not throw exceptions makes the whole program easier to reason about. Many C++ experts emphasize this (e.g. Scott Meyers in item 29 of “Effective C++”).

About speed. Martin York has pointed out that this no longer applies in modern compilers. I respectfully disagree. The C++ language makes it necessary for the environment to keep track, at runtime, of code paths that may be unwound in the case of an exception. Now, this overhead isn't really all that big (and it's quite easy to verify this). “nontrivial” in my above text may have been too strong.

However, I find it important to draw the distinction between languages like C++ and many modern, “managed” languages like C#. The latter have no additional overhead as long as no exception is thrown because the information necessary to unwind the stack is kept anyway. By and large, I stand by my choice of words.

Konrad Rudolph
FYI, C# has non-trivial exception performance too, so it's pretty much bad practice everywhere.
gbjbaanb
In modern C++ compilers, if no exceptions are thrown there is practically no overhead to potentially throwing exceptions. So your statement is completely inaccurate for the current state (you would be correct if you were using a compiler from 10 years ago).
Martin York
Yes, exception safety is a concern. But it is an essential part of any modern C++ program and you should always be thinking about it.
Martin York
According to some tests I've just run, there is a 45% overhead to executing a function if a try/catch is added around a call in it to find() in a map<int,int> with 10M entries. That's with no optimization, to try to avoid inlining. With -O3 or with a try/catch directly in the loop there is no cost.
Steve Jessop
Also: entering the try/catch block was almost free (0-5%) compared with find() in an EMPTY map. So in summary entering try/catch is free, but entering a function containing a try/catch is more expensive than entering a function which doesn't, by about 0.4 seconds for 1 million repeats. All on gcc.
Steve Jessop
This is good advice, but not just for perf reasons. Exception handling is for 'exceptions', a semantic quantity. 'Not found' is a meaningful and normal result for a search.
Aaron
That is indeed the mantra, and I prefer it to spurious perf. claims. But exceptions are for when you can't return, and returning non-result values is bad too. I like std::map's "find", but the method name here is "get" - is that like "search" or is it like "give me the thing I expect to be there"?
Steve Jessop
Ultimately for me, the fact that C++ doesn't have checked exceptions is a big deal - keeping track of exceptions is that much less onerous if they only occur in circumstances likely to be application-fatal, because you can often ignore them. I'm more liberal with exceptions in Java than in C++.
Steve Jessop
Nullable types are a good compromise of course, and pointers an icky way to achieve that. And undefined behaviour is always an option if the caller is supposed to know the value is there.
Steve Jessop
+5  A: 

The problem with exists() is that you'll end up searching twice for things that do exist (first check if it's in there, then find it again). This is inefficient, particularly if (as its name of "list" suggests) your container is one where searching is O(n).
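
To make the double search concrete, here is a small sketch that counts linear scans. CountingList, its scans counter and the pointer-returning find() are all invented for the illustration, and get() assumes the value is present.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical list wrapper that counts how many linear searches it performs.
class CountingList {
public:
    CountingList() : scans(0) {}
    void add(int v) { items_.push_back(v); }

    bool exists(int v) const {
        ++scans;
        return std::find(items_.begin(), items_.end(), v) != items_.end();
    }

    // Precondition: v is in the list (otherwise this dereferences end()).
    const int& get(int v) const {
        ++scans;
        return *std::find(items_.begin(), items_.end(), v);
    }

    // Single-pass alternative: null when the value is absent.
    const int* find(int v) const {
        ++scans;
        std::list<int>::const_iterator it = std::find(items_.begin(), items_.end(), v);
        return it != items_.end() ? &*it : nullptr;
    }

    mutable std::size_t scans;  // number of O(n) searches so far

private:
    std::list<int> items_;
};

// Usage sketch: exists() + get() costs two scans, find() costs one.
inline void scanExample() {
    CountingList c;
    c.add(3);
    if (c.exists(3)) { (void)c.get(3); }
    assert(c.scans == 2);
    c.scans = 0;
    const int* p = c.find(3);
    assert(p != nullptr && *p == 3 && c.scans == 1);
}
```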

Sure, you could do some internal caching to avoid the double search, but then your implementation gets messier, your class becomes less general (since you've optimised for a particular case), and it probably won't be exception-safe or thread-safe.

Sam Stokes
+2  A: 

STL Iterators?

The "iterator" idea proposed before me is interesting, but the real point of iterators is navigation through a container, not serving as a simple accessor.

If your accessor is one among many, then iterators are the way to go, because you will be able to use them to move in the container. But if your accessor is a simple getter, able to return either the value or the fact there is no value, then your iterator is perhaps only a glorified pointer...

Which leads us to...

Smart pointers?

The point of smart pointers is to simplify pointer ownership. With a shared pointer, you'll get a resource (memory) which will be shared, at the cost of an overhead (shared pointers need to allocate an integer as a reference counter...).

You have to choose: either your Value is already inside a shared pointer, and then you can return this shared pointer (or a weak pointer), or your Value is inside a raw pointer, and then you can return the raw pointer. You don't want to return a shared pointer if your resource is not already inside a shared pointer: a world of funny things will happen when your shared pointer goes out of scope and deletes your Value without telling you...

:-p

Pointers?

If your interface is clear about its ownership of its resources, and about the fact that the returned value can be NULL, then you could return a simple, raw pointer. If the user of your code is dumb enough to ignore the interface contract of your object, or to play arithmetic or whatever with your pointer, then he/she will be dumb enough to break any other way you'll choose to return the value, so don't bother with the mentally challenged...

Undefined Value

If your Value type really already has some kind of "undefined" value, and the user knows that and is willing to handle it, this is a possible solution, similar to the pointer or iterator solution.

But do not add an "undefined" value to your Value class because of the problem you asked about: you'll end up raising the "references vs. pointers" war to another level of insanity. Code users want the objects you give them to either be OK or not exist. Having to test on every other line of code that this object is still valid is a pain, and will needlessly complicate the user code, through your fault.

Exceptions

Exceptions are usually not as costly as some people would like them to be. But for a simple accessor, the cost could be nontrivial if your accessor is used often.

For example, the STL std::vector has two accessors to its value through an index:

T & std::vector::operator[]( /* index */ )

and:

T & std::vector::at( /* index */ )

The difference is that [] is non-throwing. So, if you access outside the range of the vector, you're on your own, probably risking memory corruption, and a crash sooner or later. So, you should really be sure you have verified the code using it.

On the other hand, at is throwing. This means that if you access outside the range of the vector, then you'll get a clean exception. This method is better if you want to delegate the processing of an error to other code.

Personally, I use [] when I'm accessing the values inside a loop, or something similar. I use at when I feel an exception is the right way to tell the current code (or the calling code) that something went wrong.
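
A short sketch of the difference. The sumAll and atThrows helpers are hypothetical; std::vector::at throwing std::out_of_range on a bad index is standard behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Inside a loop whose bounds are already known to be valid, [] is enough.
inline int sumAll(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Hypothetical helper: reports whether v.at(index) throws std::out_of_range.
inline bool atThrows(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Usage sketch.
inline void vectorExample() {
    std::vector<int> v(3, 2);   // three elements, each equal to 2
    assert(sumAll(v) == 6);
    assert(!atThrows(v, 2));    // last valid index
    assert(atThrows(v, 3));     // one past the end: clean exception
}
```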

So what?

In your case, you must choose:

If you really need lightning-fast access, then the throwing accessor could be a problem. But this means you already used a profiler on your code to determine this is a bottleneck, doesn't it?

;-)

If you know that not having a value can happen often, and/or you want your client to propagate a possible null/invalid/whatever semantic pointer to the value accessed, then return a pointer (if your value is inside a simple pointer) or a weak/shared pointer (if your value is owned by a shared pointer).

But if you believe the client won't propagate this "null" value, or that they should not propagate a NULL pointer (or smart pointer) in their code, then use the reference protected by the exception. Add a "hasValue" method returning a boolean, and add a throw should the user try to get the value even if there is none.
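
On the class side, that last option could be sketched roughly as follows. Slot, its int payload and the choice of std::logic_error are assumptions made for the example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical holder of an optional int, exposing hasValue() plus a
// reference-returning getValue() that throws when there is nothing to return.
class Slot {
public:
    Slot() : value_(0), hasValue_(false) {}

    void set(int v) { value_ = v; hasValue_ = true; }
    bool hasValue() const { return hasValue_; }

    int& getValue() {
        if (!hasValue_) throw std::logic_error("Slot::getValue: no value");
        return value_;
    }

private:
    int value_;
    bool hasValue_;
};

// Usage sketch.
inline void slotExample() {
    Slot s;
    assert(!s.hasValue());
    s.set(42);
    if (s.hasValue()) {
        int& v = s.getValue();
        v += 1;                      // the reference allows in-place modification
    }
    assert(s.getValue() == 43);
}
```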

Last but not least, consider the code that will be used by the user of your object:

// If you want your user to have this kind of code, then choose either
// pointer or smart pointer solution
void doSomething(MyClass & p_oMyClass)
{
   MyValue * pValue = p_oMyClass.getValue() ;

   if(pValue != NULL)
   {
      // Etc.
   }
}

MyValue * doSomethingElseAndReturnValue(MyClass & p_oMyClass)
{
   MyValue * pValue = p_oMyClass.getValue() ;

   if(pValue != NULL)
   {
      // Etc.
   }

   return pValue ;
}

// ==========================================================

// If you want your user to have this kind of code, then choose the
// throwing reference solution
void doSomething(MyClass & p_oMyClass)
{
   if(p_oMyClass.hasValue())
   {
      MyValue & oValue = p_oMyClass.getValue() ;
   }
}

So, if your main problem is choosing between the two user codes above, your problem is not about performance, but "code ergonomics". Thus, the exception solution should not be put aside because of potential performance issues.

:-)

paercebal
+1  A: 
Augusto Radtke
A good idea, but take a look at boost::optional, it does the same but more cleanly. I’ve elaborated nearby how it can solve your problem.
Roman Odaisky
+4  A: 

The correct answer (according to Alexandrescu) is:

Optional and Enforce

First of all, do use the Accessor, but in a safer way without reinventing the wheel:

boost::optional<X> get_X_if_possible();

Then create an enforce helper:

template <class T, class E = std::runtime_error>
T& enforce(boost::optional<T>& opt, E e = E("enforce failed"))
{
    if(!opt)
    {
        throw e;
    }

    return *opt;
}

// and an overload for T const &

This way, depending on what the absence of the value might mean, you either check explicitly:

if(boost::optional<X> maybe_x = get_X_if_possible())
{
    X& x = *maybe_x;

    // use x
}
else
{
    oops("Hey, we got no x again!");
}

or implicitly:

boost::optional<X> maybe_x = get_X_if_possible();
X& x = enforce(maybe_x); // enforce takes an lvalue, and maybe_x must outlive x

// use x

You use the first way when you’re concerned about efficiency, or when you want to handle the failure right where it occurs. The second way is for all other cases.

Roman Odaisky
A: 

@aradtke, you said.

I agree with paercebal, an iterator is to iterate. I don't like the way STL does. But the idea of an accessor seems more appealing. So what we need? A container like class that feels like a boolean for testing but behaves like the original return type. That would be feasible with cast operators. [..] Now, any foreseeable problem?

First, YOU DO NOT WANT OPERATOR bool. See Safe Bool idiom for more info. But about your question...

Here's the problem: users now need to cast explicitly in some cases. Pointer-like proxies (such as iterators, ref-counted pointers, and raw pointers) have a concise 'get' syntax. Providing a conversion operator is not very useful if callers have to invoke it with extra code.

Starting with your reference-like example, the most concise way to write it:

// 'reference' style, check before use
if (Accessor<type> value = list.get(key)) {
   type &v = value;
   v.doSomething();
}
// or
if (Accessor<type> value = list.get(key)) {
   static_cast<type&>(value).doSomething();
}

This is okay, don't get me wrong, but it's more verbose than it has to be. Now consider if we know, for some reason, that list.get will succeed. Then:

// 'reference' style, skip check 
type &v = list.get(key);
v.doSomething();
// or
static_cast<type&>(list.get(key)).doSomething();

Now let's go back to iterator/pointer behavior:

// 'pointer' style, check before use
if (Accessor<type> value = list.get(key)) {
   value->doSomething();
}

// 'pointer' style, skip check 
list.get(key)->doSomething();

Both are pretty good, but pointer/iterator syntax is just a bit shorter. You could give 'reference' style a member function 'get()'... but that's already what operator*() and operator->() are for.

The 'pointer' style Accessor now has operator 'unspecified bool', operator*, and operator->.

And guess what... raw pointer meets these requirements, so for prototyping, list.get() returns T* instead of Accessor. Then when the design of list is stable, you can come back and write the Accessor, a pointer-like Proxy type.
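
For what it's worth, a bare-bones sketch of such a pointer-like Accessor might look like this. It assumes C++11, where explicit operator bool can stand in for the 'unspecified bool' trick; Item and ItemList are invented stand-ins for the real value and list types.

```cpp
#include <cassert>
#include <map>

// Pointer-like proxy: testable in a condition, dereferenced with * and ->.
template <class T>
class Accessor {
public:
    explicit Accessor(T* p = nullptr) : p_(p) {}
    explicit operator bool() const { return p_ != nullptr; }  // assumes C++11
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
private:
    T* p_;  // non-owning; null means "not found"
};

// Hypothetical value and container types for the usage sketch.
struct Item {
    int hits;
    Item() : hits(0) {}
    void doSomething() { ++hits; }
};

class ItemList {
public:
    void add(int key) { items_[key]; }  // default-constructs an Item
    Accessor<Item> get(int key) {
        std::map<int, Item>::iterator it = items_.find(key);
        return Accessor<Item>(it != items_.end() ? &it->second : nullptr);
    }
private:
    std::map<int, Item> items_;
};

// Usage sketch: 'pointer' style, check before use.
inline void accessorExample() {
    ItemList list;
    list.add(1);
    if (Accessor<Item> value = list.get(1)) {
        value->doSomething();
    }
    assert((*list.get(1)).hits == 1);
    assert(!list.get(2));
}
```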

Aaron
+1  A: 

(I realize this is not always the right answer, and my tone is a bit strong, but you should consider this question before deciding on other, more complex alternatives):

So, what's wrong with returning a pointer?

I've seen this one many times in SQL, where people will do their utmost to never deal with NULL columns, like they have some contagious disease or something. Instead, they cleverly come up with a "blank" or "not-there" artificial value like -1, 9999 or even something like '@X-EMPTY-X@'.

My answer: the language already has a construct for "not there"; go ahead, don't be afraid to use it.

Euro Micelli