hello

Suppose we have the following piece of code:

struct S {
    S(int & value): value_(value) {}
    int & value_;
};

S function() {
    int value = 0;
    return S(value);   // implicitly returning reference to local value
}

The compiler does not produce a warning (even with -Wall), so this error can be hard to catch.

What tools are out there to help catch such problems?

+1  A: 

I don't think any static tool can catch that, but if you run Valgrind on some unit tests, or on whatever code is crashing (seg fault), you can easily find where the memory is being referenced and where it was originally allocated.

Dave
While valgrind is _awesome_, everything is allocated on the stack in this case, so valgrind won't catch the problem.
Michael Anderson
+1  A: 

Your code shouldn't even compile. The compilers I know of will either fail to compile the code, or at the very least throw a warning.

If you meant return S(value) instead, then for heaven's sake COPY PASTE THE CODE YOU POST HERE.

Rewriting and introducing typos just means it is impossible for us to actually guess which errors you're asking about, and which ones were accidents we're supposed to ignore.

When you post a question anywhere on the internet, if that question includes code, POST THE EXACT CODE.

Now, assuming this was actually a typo, the code is perfectly legal, and there's no reason why any tool should warn you.

As long as you don't try to dereference the dangling reference, the code is perfectly safe.

It is possible that some analysis tools (Valgrind at run time, or MSVC's static analysis with /analyze, for example) can warn you about this, but there doesn't seem to be much point because you're not doing anything wrong. You're returning an object which happens to contain a dangling reference. You're not directly returning a reference to a local object (which compilers typically do warn about), but a higher-level object with behavior that might make it perfectly safe to use, even though it contains a reference to a local object that has gone out of scope.

jalf
+4  A: 

I don't think it is possible to catch all of these, although some compilers may give warnings in some cases.

It's as well to remember that references are really pointers under the hood, and many of the shoot-self-in-foot scenarios possible with pointers are still possible.

To clarify what I mean about "pointers under the hood", take the following two classes. One uses references, the other pointers.

class Ref
{
  int &ref;
public:
  Ref(int &r) : ref(r) {}
  int get() { return ref; }
};

class Ptr
{
  int *ptr;
public:
  Ptr(int *p) : ptr(p) {}
  int get() { return *ptr; }
};

Now, compare the generated code for the two.

@@Ref@$bctr$qri proc    near  // Ref::Ref(int &ref)
    push      ebp
    mov       ebp,esp
    mov       eax,dword ptr [ebp+8]
    mov       edx,dword ptr [ebp+12]
    mov       dword ptr [eax],edx
    pop       ebp
    ret 

@@Ptr@$bctr$qpi proc    near  // Ptr::Ptr(int *ptr)
    push      ebp
    mov       ebp,esp
    mov       eax,dword ptr [ebp+8]
    mov       edx,dword ptr [ebp+12]
    mov       dword ptr [eax],edx
    pop       ebp
    ret 

@@Ref@get$qv    proc    near // int Ref::get()
    push      ebp
    mov       ebp,esp
    mov       eax,dword ptr [ebp+8]
    mov       eax,dword ptr [eax]
    mov       eax,dword ptr [eax]
    pop       ebp
    ret 

@@Ptr@get$qv    proc    near // int Ptr::get()
    push      ebp
    mov       ebp,esp
    mov       eax,dword ptr [ebp+8]
    mov       eax,dword ptr [eax]
    mov       eax,dword ptr [eax]
    pop       ebp
    ret 

Spot the difference? There isn't any.
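
The same point can be checked at the source level. The following is only a minimal sketch: it reuses the two classes above (with `const` and `explicit` added), and `wrappers_agree` is a hypothetical helper name that is not part of the original answer. It shows that both wrappers observe a later change to the same int, which is the behavior you would expect if each one simply stores the address of that int.

```cpp
#include <cassert>

// Wrapper that holds a reference to an int owned by someone else.
class Ref {
    int &ref;
public:
    explicit Ref(int &r) : ref(r) {}
    int get() const { return ref; }
};

// Wrapper that holds a pointer to an int owned by someone else.
class Ptr {
    int *ptr;
public:
    explicit Ptr(int *p) : ptr(p) {}
    int get() const { return *ptr; }
};

// Hypothetical helper: binds both wrappers to the same int, changes the int
// afterwards, and reports whether both wrappers see the new value.
bool wrappers_agree() {
    int x = 1;
    Ref r(x);
    Ptr p(&x);
    x = 42;  // neither wrapper holds a copy, so both must observe this
    return r.get() == 42 && p.get() == 42;
}
```

On common implementations the two classes also tend to have the same size, although the standard does not require a reference member to occupy storage, so the sketch deliberately does not assert that.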

Roddy
They are absolutely not pointers under the hood, they have no location (they are not indirect), they are an alias.
Steven Jackson
@Steven - It's time you looked under the hood! I've updated the answer.
Roddy
You are right and I have learned something so I thank you, but quoting page 89 of Effective C++ Third Edition, "If you peek under the hood of a C++ compiler, you'll find that references are typically implemented as pointers,". Can someone elaborate?
Steven Jackson
Roddy, would it be possible for you to share how you generated that assembly code? Even some basic pointers as to where I can learn about doing it would help. (That would really help me look under the hood of every C++ line I write.)
bits
@bits. It will depend on your toolchain. Usually the compiler will have a command-line switch to "compile to asm". With gcc, I think it's the "-S" option.
Roddy
Necrolis
+5  A: 

There are runtime-based solutions which instrument the code to check for invalid pointer accesses. I've only used mudflap so far (which has been integrated in GCC since version 4.0). mudflap tries to track each pointer (and reference) in the code and checks, on each access, whether the pointer/reference actually points to a live object of its base type. Here is an example:

#include <stdio.h>
struct S {
    S(int & value): value_(value) {}
    int & value_;
};

S function() {
    int value = 0;
    return S(value);   // implicitly returning reference to local value
}
int main()
{
    S x=function();
    printf("%d\n",x.value_); //<-oh noes! reads through the dangling reference
}

Compile this with mudflap enabled:

g++ -fmudflap s.cc -lmudflap

and running gives:

$ ./a.out
*******
mudflap violation 1 (check/read): time=1279282951.939061 ptr=0x7fff141aeb8c size=4
pc=0x7f53f4047391 location=`s.cc:14:24 (main)'
      /opt/gcc-4.5.0/lib64/libmudflap.so.0(__mf_check+0x41) [0x7f53f4047391]
      ./a.out(main+0x7f) [0x400c06]
      /lib64/libc.so.6(__libc_start_main+0xfd) [0x7f53f358aa7d]
Nearby object 1: checked region begins 332B before and ends 329B before
mudflap object 0x703430: name=`argv[]'
bounds=[0x7fff141aecd8,0x7fff141aece7] size=16 area=static check=0r/0w liveness=0
alloc time=1279282951.939012 pc=0x7f53f4046791
Nearby object 2: checked region begins 348B before and ends 345B before
mudflap object 0x708530: name=`environ[]'
bounds=[0x7fff141aece8,0x7fff141af03f] size=856 area=static check=0r/0w liveness=0
alloc time=1279282951.939049 pc=0x7f53f4046791
Nearby object 3: checked region begins 0B into and ends 3B into
mudflap dead object 0x7089e0: name=`s.cc:8:9 (function) int value'
bounds=[0x7fff141aeb8c,0x7fff141aeb8f] size=4 area=stack check=0r/0w liveness=0
alloc time=1279282951.939053 pc=0x7f53f4046791
dealloc time=1279282951.939059 pc=0x7f53f4046346
number of nearby objects: 3
Segmentation fault

A couple of points to consider:

  1. mudflap can be fine-tuned in what exactly it should check and do. Read http://gcc.gnu.org/wiki/Mudflap_Pointer_Debugging for details.
  2. The default behaviour is to raise a SIGSEGV on a violation, which means you can find the violation in your debugger.
  3. mudflap can be a bitch, in particular when you are interacting with libraries that are not compiled with mudflap support.
  4. It won't bark at the place where the dangling reference is created (return S(value)), only where the reference is dereferenced. If you need the former, then you'll need a static analysis tool.

P.S. One thing to consider is adding a NON-PORTABLE check to the copy constructor of S, which asserts that value_ is not bound to an integer with a shorter life span (for example, if *this is located in an "older" slot of the stack than the integer it is bound to). This is highly machine-specific and possibly tricky to get right, of course, but should be OK as long as it's only for debugging.

Luther Blissett
Thank you. I will give it a try. I did not know about this tool before.
aaa
+1  A: 

There is a guideline I follow after having been beaten by this exact thing:

When a class has a reference member (or a pointer to something that can have a lifetime you don't control), make the object non-copyable.

This way, you reduce the chances of escaping the scope with a dangling reference.
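
As a rough illustration of this guideline, here is a minimal sketch. `GuardedS` is a hypothetical variant of the question's `S` (the name and the helper functions are assumptions, not code from the thread), written for C++11 or later.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical variant of S: it holds a reference to an int it does not own,
// so copying is disabled to make it harder for it to outlive that int.
struct GuardedS {
    explicit GuardedS(int &value) : value_(value) {}
    GuardedS(const GuardedS &) = delete;             // no copies
    GuardedS &operator=(const GuardedS &) = delete;  // no copy assignment
    int &value_;
};

// Deleting the copy constructor also suppresses the implicit move constructor.
static_assert(!std::is_copy_constructible<GuardedS>::value,
              "GuardedS must not be copyable");
static_assert(!std::is_move_constructible<GuardedS>::value,
              "GuardedS must not be movable either");

// Passing by reference still works, so the object remains usable in scope.
int read_through(const GuardedS &s) { return s.value_; }

// Hypothetical helper: uses GuardedS only while the referenced int is alive.
bool usable_in_scope() {
    int local = 7;
    GuardedS s(local);
    local = 8;
    return read_through(s) == 8;
}
```

One caveat: under C++17's guaranteed copy elision, `return GuardedS(value);` still compiles even with the copy constructor deleted, so this lowers the risk rather than removing it.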

Alexandre C.
Thanks. Unfortunately, I have no control over the classes in that library. I was hoping to find a static analysis tool.
aaa
+1  A: 

You have to use a technology based on compile-time instrumentation. While valgrind can check calls such as malloc and free at run time, it cannot track objects that never pass through those calls, such as locals on the stack.

Depending on your architecture, IBM PurifyPlus finds some of these problems. You will need a valid license (or your company's) to use it, or you can try the trial version.

Doomsday
A: 

This is perfectly valid code.

If you call your function and bind the returned temporary to a const reference, the lifetime of the temporary is extended.

const S& s1 = function(); // valid

S& s2 = function(); // invalid

This is explicitly allowed in the C++ standard.

See 12.2.4:

There are two contexts in which temporaries are destroyed at a different point than the end of the full-expression.

and 12.2.5:

The second context is when a reference is bound to a temporary. The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except: [...]
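
To see what the quoted rule does in practice, here is a minimal sketch. `Tracked`, `make_tracked` and the two helper functions are hypothetical names introduced only for illustration. The type counts how many of its instances are alive, so you can observe that a temporary bound to a const reference survives until the reference goes out of scope.

```cpp
#include <cassert>

// Hypothetical helper type: counts how many Tracked objects are alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked &) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Returns a temporary by value, like function() in the question.
Tracked make_tracked() { return Tracked(); }

// Binding the temporary to a const reference keeps it alive until the
// reference goes out of scope, so one object is alive at the return statement.
int alive_while_bound() {
    const Tracked &t = make_tracked();
    (void)t;
    return Tracked::alive;
}

// Without a binding, the temporary dies at the end of the full-expression.
int alive_without_binding() {
    make_tracked();
    return Tracked::alive;
}
```

Note that the extension covers only the object returned by value. It does nothing for a local int that a member reference such as value_ refers to, since that int is destroyed when the called function returns.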

Andreas
S may stay in scope, but the object (an int in this case) referenced by S will go out of scope.
MSN