Hi all..

I was reading up a bit on my C++ and found this article about RTTI (Runtime Type Identification): http://msdn.microsoft.com/en-us/library/70ky2y6k(VS.80).aspx . Well, that's another subject :) However, I stumbled upon an odd statement about the type_info class, namely about its name() method. It says: "The type_info::name member function returns a const char* to a null-terminated string representing the human-readable name of the type. The memory pointed to is cached and should never be directly deallocated."

How can you implement something like this yourself!? I've struggled with this exact problem quite often, as I don't want to allocate a new char array for the caller to delete, so I've stuck to std::string thus far.

So, for the sake of simplicity, let's say I want to make a method that returns "Hello World!", let's call it

const char *getHelloString() const;

Personally, I would make it somehow like this (Pseudo):

const char *getHelloString() const
{
  char *returnVal = new char[13];
  strcpy(returnVal, "HelloWorld!");  // strcpy takes the destination first

  return returnVal;
}

.. But this would mean that the caller should do a delete[] on my return pointer :(

Thx in advance

+16  A: 

How about this:

const char *getHelloString() const
{
    return "HelloWorld!";
}

Returning a literal directly means the space for the string is allocated in static storage by the compiler and will be available throughout the duration of the program.

Greg Rogers
This popped up just as I was about to post an almost identical solution :)
workmad3
Sounds reasonable, but won't that, in practice, be the same as a new call? I mean, you'd have a string that stays allocated for the whole life of my application? What if you call this method tons of times? Or what if the return value isn't known at compile time?
Meeh
The string will be located in static storage. There will be no memory allocation or deallocation done with this string, and also any attempt to delete a string in static storage is undefined behaviour. Calling the method lots of times will only ever return the same pointer to static storage.
workmad3
If the return value isn't known at compile time, your options are more limited. You can't return a static string (obviously). Your best solution then would probably be to return the string inside a reference-counted smart pointer, or to continue to use std::string (which some implementations reference count).
workmad3
Christ, this is so obvious that I'm surprised about how many freaky, overcomplicated answers there are being given...
Vicent Marti
@Tanoku - Hah, agreed, headdesk comes to mind. :)
Jim Buck
This is C++, not C. Unless you have done performance testing, you should just return a std::string by value to reduce the chance of introducing leaks. (Maybe at some point you will want to return "Hello Joe" based on some session. At that point you would need to change your API.)
James Dean
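A minimal sketch of the std::string suggestion from the comments above, for the case where the text is not known at compile time. The userName parameter and the greetingLength helper are hypothetical and only serve the illustration:

```cpp
#include <cstddef>
#include <string>

// Hypothetical example: the greeting depends on a runtime value, so it
// cannot be a string literal. Returning std::string by value leaves the
// memory owned by the caller's own object; nobody has to call delete[].
std::string getHelloString(const std::string& userName)
{
    return "Hello " + userName + "!";
}

// A caller that needs a const char* keeps the std::string alive and
// uses c_str() only while that object exists.
std::size_t greetingLength(const std::string& userName)
{
    const std::string greeting = getHelloString(userName);
    const char* raw = greeting.c_str();   // valid while 'greeting' lives
    return std::char_traits<char>::length(raw);
}
```

The trade-off is one copy (or move) of the string per call in exchange for having no ownership rules to document.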
+1  A: 

I think that since they know that there are a finite number of these, they just keep them around forever. It might be appropriate for you to do that in some instances, but as a general rule, std::string is going to be better.

They can also look up new calls to see if they made that string already and return the same pointer. Again, depending on what you are doing, this may be useful for you too.
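The "look it up and return the same pointer" idea could be sketched roughly as follows. The function name internString is made up for this example, and the sketch makes no claim about how any real runtime does it:

```cpp
#include <set>
#include <string>

// Hypothetical interning helper: each distinct text is stored once in a
// function-local static set. std::set is node-based, so inserting more
// strings never moves existing elements, and earlier pointers stay valid
// until the program exits. Not thread-safe as written.
const char* internString(const std::string& text)
{
    static std::set<std::string> pool;
    // insert() hands back the existing element when the text is already stored.
    return pool.insert(text).first->c_str();
}
```

Memory use is bounded by the number of distinct strings, which fits the "finite number of these" situation described above.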

Lou Franco
That sounds very likely... So I guess there's no smart way to do this, but in this case, it was a smart move by MS to avoid too much overhead :) Ty
Meeh
A: 

It's probably done using a static buffer:

const char* GetHelloString()
{
    static char buffer[256] = { 0 };
    strcpy( buffer, "Hello World!" );
    return buffer;
}

This buffer is like a global variable that is accessible only from this function.

Martin Cote
this will only work as intended when the function is designed to always return the same string... and in that case a string literal is much better.
smerlin
+2  A: 

Well gee, if we are talking about just a function that you always want to return the same value, it's quite simple.

const char * foo() 
{
   static char return_val[] = "HelloWorld!";
   return return_val;
}

The tricky bit is when you start caching the result: then you have to consider threading, what happens when your cache gets invalidated, and possibly storing things in thread-local storage. But if it's just a one-off output that is immediately copied, this should do the trick.
Alternatively, if you don't have a fixed size, you have to either use a static buffer of arbitrary size, which may eventually turn out to be too small, or turn to a managed class such as std::string.

const char * foo() 
{
   static std::string output;
   DoCalculation(output);
   return output.c_str();
}

also the function signature

const char *getHelloString() const;

is only applicable for member functions. At which point you don't need to deal with static function local variables and could just use a member variable.
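A minimal sketch of the member-variable approach, assuming a hypothetical class Greeter that owns the text:

```cpp
#include <string>

// Hypothetical class: the object owns the text, and the returned pointer
// stays valid until the object is destroyed or the member is modified.
class Greeter
{
public:
    explicit Greeter(const std::string& name)
        : hello_("Hello " + name + "!") {}

    // const member function, matching the signature discussed above
    const char* getHelloString() const { return hello_.c_str(); }

private:
    std::string hello_;
};
```

The lifetime rule is then the same one callers already know from std::string::c_str().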

Dan
Hehe, great answer - ty :)
Meeh
Nice greets to the guys who try to use your function in multi-threading contexts after that :-)
rstevens
I did say that Threading gets tricky.
Dan
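One possible way to soften the threading problem raised in these comments is a thread_local buffer (C++11). This is only a sketch: describeId and its parameter are hypothetical stand-ins for foo() and DoCalculation above.

```cpp
#include <string>

// thread_local gives every thread its own buffer, so concurrent callers
// do not overwrite each other's result. The pointer is still invalidated
// by the next call made on the same thread.
const char* describeId(int id)
{
    thread_local std::string output;
    output = "id-" + std::to_string(id);   // stand-in for DoCalculation(output)
    return output.c_str();
}
```

Callers still have to copy the result if they need it beyond the next call.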
A: 

You can't rely on GC; this is C++. That means you must keep the memory available until the program terminates. You simply don't know when it becomes safe to delete[] it. So, if you want to construct and return a const char*, simply new[] it and return it. Accept the unavoidable leak.

MSalters
I think you have confused quite a lot of things here. The question isn't whether it's safe to delete the const char* (it isn't, since the function he calls explicitly says so), but rather how to create such a function himself.
Mats Fredriksson
Thanks - I'll rephrase
MSalters
This is the worst suggestion ever. The function he talks about is one that "owns" the memory it returns, and therefore the user shouldn't free it. If you yourself create a function that returns a new[] and release ownership (i.e. never delete[] it), it's just bad coding. And entirely avoidable!
Mats Fredriksson
Sorry, you're completely mistaken. X cannot "own" memory if it has no idea how long Y needs that memory. If Y needs that memory potentially until atexit(), YOU WILL LEAK IT. *There* *is* *no* *choice* given those requirements.
MSalters
I suspect the question is how to implement something like strerror or asctime where the leaked memory is bounded. The question could be clearer. However, your suggestion is more like strdup where the caller is responsible for clean up or else the leak is unbounded.
Diastrophism
A: 

Why does the return type need to be const? Don't think of the method as a get method, think of it as a create method. I've seen plenty of APIs that require you to delete something a creation operator/method returns. Just make sure you note that in the documentation.

/* create a hello string
 * must be released with delete[] after use
 */
char *createHelloString() const
{
  char *returnVal = new char[13];
  strcpy(returnVal, "HelloWorld!");

  return returnVal;
}
Dashogun
A: 

What I've often done when I need this sort of functionality is to have a char * pointer in the class - initialized to null - and allocate when required.

viz:

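class CacheNameString
{
    private: 
        char *name;
        // copying is disabled so two objects never free the same buffer
        CacheNameString(const CacheNameString&);
        CacheNameString& operator=(const CacheNameString&);
    public:
        CacheNameString():name(NULL)  { }
        ~CacheNameString() { free(name); }  // release the last cached copy

    const char *make_name(const char *v)
    {
        if (name != NULL)
            free(name);     // drop the previously cached copy

        name = strdup(v);   // strdup allocates with malloc

        return name;
    }

};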
Richard Harrison
A: 

Something like this would do:

const char *myfunction() {
    static char *str = NULL; /* this only happens once */
    delete [] str; /* delete previous cached version */
    str = new char[strlen("whatever") + 1]; /* allocate space for the string and its NUL terminator */
    strcpy(str, "whatever");
    return str;
}

EDIT: Something that occurred to me is that a good replacement for this could be returning a boost::shared_ptr instead. That way the caller can hold onto it as long as they want and they don't have to worry about explicitly deleting it. A fair compromise IMO.
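A rough sketch of the smart-pointer idea, using std::shared_ptr (C++11) in place of the Boost class. The name makeGreeting and its parameter are invented for the example:

```cpp
#include <memory>
#include <string>

// Each call returns its own reference-counted string. The text is freed
// automatically when the last shared_ptr pointing to it goes away, so a
// later call never invalidates an earlier result.
std::shared_ptr<const std::string> makeGreeting(const std::string& name)
{
    return std::make_shared<std::string>("Hello " + name + "!");
}
```

A caller that needs a raw pointer can use result->c_str() for as long as it holds the shared_ptr.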

Evan Teran
`const char* ptr1=myfunction();const char* ptr2=myfunction();/*now the use of ptr1 causes undefined behaviour*/`
smerlin
indeed, i would just say "don't do that" and it'll be fine.
Evan Teran
A: 

Be careful when implementing a function that allocates a chunk of memory and then expects the caller to deallocate it, as you do in the OP:

const char *getHelloString() const
{
  char *returnVal = new char[13];
  strcpy(returnVal, "HelloWorld!");

  return returnVal;
}

By doing this you are transferring ownership of the memory to the caller. If you call this code from some other function:

int main()
{
  const char * str = getHelloString();
  delete[] str;  // memory from new[] must be released with delete[]
  return 0;
}

...the semantics of transferring ownership of the memory are not clear, creating a situation where bugs and memory leaks are more likely.

Also, at least under Windows, if the two functions are in 2 different modules you could potentially corrupt the heap. In particular, if main() is in hello.exe, compiled in VC9, and getHelloString() is in utility.dll, compiled in VC6, you'll corrupt the heap when you delete the memory. This is because VC6 and VC9 both use their own heap, and they aren't the same heap, so you are allocating from one heap and deallocating from another.
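One common way around both the ownership question and the cross-module heap problem is to let the caller supply the buffer. The following is only a sketch of that pattern, and the signature is an assumption rather than anything from the original question:

```cpp
#include <cstddef>
#include <cstring>

// The callee never allocates, so no memory crosses a module boundary.
// Returns the number of bytes needed, including the terminating NUL.
// The text is copied only when the supplied buffer is large enough.
std::size_t getHelloString(char* buffer, std::size_t bufferSize)
{
    static const char text[] = "HelloWorld!";
    if (buffer != nullptr && bufferSize >= sizeof(text))
        std::memcpy(buffer, text, sizeof(text));
    return sizeof(text);
}
```

The caller can pass a null buffer first to learn the required size, then allocate and free the storage with its own heap.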

John Dibling
A: 

The advice given that warns about the lifetime of the returned string is sound advice. You should always be careful about recognising your responsibilities when it comes to managing the lifetime of returned pointers. The practice is quite safe, however, provided the variable pointed to will outlast the call to the function that returned it. Consider, for instance, the pointer to const char returned by c_str() as a method of class std::string. This returns a pointer to the memory managed by the string object, which is guaranteed to be valid as long as the string object is not destroyed or made to reallocate its internal memory.

In the case of the std::type_info class, it is a part of the C++ standard, as its namespace implies. The pointer returned from name() actually points to static memory created by the compiler and linker when the class was compiled, and is a part of the run time type identification (RTTI) system. Because it refers to a symbol in code space, you should not attempt to delete it.

Jon Trauntvein
+2  A: 

I like all the answers about how the string could be statically allocated, but that's not necessarily true for all implementations, particularly the one whose documentation the original poster linked to. In this case, it appears that the decorated type name is stored statically in order to save space, and the undecorated type name is computed on demand and cached in a linked list.

If you're curious about how the Visual C++ type_info::name() implementation allocates and caches its memory, it's not hard to find out. First, create a tiny test program:

#include <cstdio>
#include <typeinfo>
#include <vector>    
int main(int argc, char* argv[]) {
    std::vector<int> v;
    const std::type_info& ti = typeid(v);
    const char* n = ti.name();
    printf("%s\n", n);
    return 0;
}

Build it and run it under a debugger (I used WinDbg) and look at the pointer returned by type_info::name(). Does it point to a global structure? If so, WinDbg's ln command will tell the name of the closest symbol:

0:000> ?? n
char * 0x00000000`00857290
 "class std::vector<int,class std::allocator<int> >"
0:000> ln 0x00000000`00857290
0:000>

ln didn't print anything, which indicates that the string wasn't in the range of addresses owned by any specific module. It would be in that range if it was in the data or read-only data segment. Let's see if it was allocated on the heap, by searching all heaps for the address returned by type_info::name():

0:000> !heap -x 0x00000000`00857290
Entry             User              Heap              Segment               Size  PrevSize  Unused    Flags
-------------------------------------------------------------------------------------------------------------
0000000000857280  0000000000857290  0000000000850000  0000000000850000        70        40        3e  busy extra fill

Yes, it was allocated on the heap. Putting a breakpoint at the start of malloc() and restarting the program confirms it.

Looking at the declaration in <typeinfo> gives a clue about where the heap pointers are getting cached:

struct __type_info_node {
    void *memPtr;
    __type_info_node* next;
};

extern __type_info_node __type_info_root_node;
...
_CRTIMP_PURE const char* __CLR_OR_THIS_CALL name(__type_info_node* __ptype_info_node = &__type_info_root_node) const;

If you find the address of __type_info_root_node and walk down the list in the debugger, you quickly find a node containing the same address that was returned by type_info::name(). The list seems to be related to the caching scheme.

The MSDN page linked in the original question seems to fill in the blanks: the name is stored in its decorated form to save space, and this form is accessible via type_info::raw_name(). When you call type_info::name() for the first time on a given type, it undecorates the name, stores it in a heap-allocated buffer, caches the buffer pointer, and returns it.

The linked list may also be used to deallocate the cached strings during program exit (however, I didn't verify whether that is the case). This would ensure that they don't show up as memory leaks when you run a memory debugging tool.
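To make the scheme described above concrete, here is a rough sketch of "compute on first call, cache the heap pointer in a linked list, free the list at exit". None of these names come from the Visual C++ runtime: TypeRecord, CacheNode, undecorate and releaseNameCache are all invented for the illustration, and the real undecoration step is replaced by a trivial stand-in.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

struct CacheNode
{
    char* memPtr;      // heap buffer holding one readable name
    CacheNode* next;
};

static CacheNode* cacheRoot = nullptr;

// Stand-in for the real undecoration step (assumption).
static std::string undecorate(const char* rawName)
{
    return std::string("class ") + rawName;
}

struct TypeRecord
{
    const char* rawName;     // compact form, stored statically
    const char* cachedName;  // filled in on the first call to name()

    const char* name()
    {
        if (cachedName == nullptr) {
            const std::string readable = undecorate(rawName);
            char* buffer = static_cast<char*>(std::malloc(readable.size() + 1));
            if (buffer == nullptr)
                return rawName;  // fall back to the raw form
            std::memcpy(buffer, readable.c_str(), readable.size() + 1);
            cacheRoot = new CacheNode{buffer, cacheRoot};  // remember for cleanup
            cachedName = buffer;
        }
        return cachedName;
    }
};

// Walks the list and frees every cached buffer, e.g. at program exit.
// Pointers handed out earlier are dangling after this call.
void releaseNameCache()
{
    while (cacheRoot != nullptr) {
        CacheNode* node = cacheRoot;
        cacheRoot = node->next;
        std::free(node->memPtr);
        delete node;
    }
}
```

The caller never frees anything: the cache owns every buffer, which matches the "should never be directly deallocated" wording in the documentation. The sketch is not thread-safe.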

bk1e
A: 

I think something like this can only be implemented "cleanly" using objects and the RAII idiom. When the object's destructor is called (obj goes out of scope), we can safely assume that the const char* pointers aren't being used anymore.

example code:

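class ICanReturnConstChars
{
    // needs <vector> and <cstring>; a vector is used because
    // std::stack has no push_back()/back()/pop_back()
    std::vector<char*> cached_strings;
    public:
    // 'text' is an illustrative parameter for the content to hand out
    const char* yeahGiveItToMe(const char* text){
        char* newmem = new char[strlen(text) + 1];
        strcpy(newmem, text); //write something to newmem
        cached_strings.push_back(newmem);
        return newmem;
    }
    ~ICanReturnConstChars(){
        // every pointer handed out stays valid until the object dies
        while(!cached_strings.empty()){
            delete [] cached_strings.back();
            cached_strings.pop_back();
        }
    }
};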

The only other possibility I know of is to return a smart pointer.

smerlin