I am trying to learn and understand name mangling in C++. Here are some questions:

(1) From devx

When a global function is overloaded, the generated mangled name for each overloaded version is unique. Name mangling is also applied to variables. Thus, a local variable and a global variable with the same user-given name still get distinct mangled names.

Are there other cases where name mangling is used, besides overloaded functions and same-name global and local variables?

(2) From Wiki

The need arises where the language allows different entities to be named with the same identifier as long as they occupy a different namespace (where a namespace is typically defined by a module, class, or explicit namespace directive).

I don't quite understand why name mangling would only apply when the identifiers belong to different namespaces, since overloaded functions can be in the same namespace, and same-name global and local variables can also be in the same namespace. How should I understand this?

Do variables with same name but in different scopes also use name mangling?

(3) Does C have name mangling? If it does not, how does it deal with the case where a global and a local variable have the same name? C does not have overloaded functions, right?

Thanks and regards!

+4  A: 

Are there other cases where name mangling is used, besides overloaded functions and same-name global and local variables?

C++ mangles essentially every symbol the linker sees, always; the main exception is anything declared extern "C". It's just easier for the compiler. Typically the mangling encodes something about the parameter list or types, as these are the most common reasons mangling is needed.

C does not mangle. Scoping is used to control access to local and global variables of the same name.

Donnie
Thanks Donnie. Do you think name mangling is only applied to identifiers with the same name but in different namespaces?
Tim
No, as I said, C++ mangles *all* identifiers. Always.
Donnie
+7  A: 

Technically, it's "decorating". It sounds less crude but also mangling sort of implies that CreditInterest might get rearranged into IntCrederestit whereas what actually happens is more like _CreditInterest@4 which is, fair to say, "decorated" more than mangled. That said, I call it mangling too :-) but you'll find more technical info and examples if you search for "C++ name decoration".

Kate Gregory
It actually depends on the compiler. Some of them literally mangle the names into strings that are meaningless except to the compiler. Older versions of VC++ were particularly bad about this. :) But yes, both search terms are valid.
Donnie
I agree; when I started we only ever said mangling, and at some point over the decades decorating became more common, and when I got around to looking at the mangled names, decorating did seem to fit the bill. My guess would be someone changed the way they were doing things and wanted to leave the old name behind as well. Only partly successful though :-)
Kate Gregory
+6  A: 

C does not do name mangling, though some platforms (OS X and 32-bit Windows, for example) prepend an underscore to C symbol names, so printf(3) is actually _printf in the libc object.

In C++ the story is different. Historically, Stroustrup's early implementation ("C with Classes", later Cfront) was a compiler that translated early C++ to C. The rest of the tools, the C compiler and the linker, would then be used to produce object code. This implied that C++ names had to be translated to C names somehow. This is exactly what name mangling does. It provides a unique name for each class member and global/namespace function and variable, so namespace and class names (for resolution) and argument types (for overloading) are somehow included in the final linker names.

This is very easy to see with tools like nm(1) - compile your C++ source and look at the generated symbols. The following is on OSX with GCC:

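#include <stdexcept>
#include <string>

namespace zoom
{
    void boom( const std::string& s )
    {
        throw std::runtime_error( s );
    }
}

int main() {}   // only here so that the file links into a.out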

~$ nm a.out | grep boom
0000000100001873 T __ZN4zoom4boomERKSs
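
To read a symbol like the one above without decoding it by hand, GCC and Clang expose the demangler through abi::__cxa_demangle in <cxxabi.h>. The following is a minimal sketch, assuming an Itanium-ABI toolchain (it will not build with MSVC); the helper name demangle is made up for illustration. On OS X the object format adds one extra leading underscore, which the sketch strips first.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-ABI
// symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const std::string& symbol)
{
    // Mach-O (OS X) prefixes every symbol with one extra underscore,
    // so "__ZN..." is reduced to "_ZN..." before demangling.
    std::string s = symbol;
    if (s.compare(0, 3, "__Z") == 0)
        s.erase(0, 1);

    int status = 0;
    char* out = abi::__cxa_demangle(s.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return symbol;            // not a mangled name we understand

    std::string result(out);
    std::free(out);               // the buffer was allocated with malloc
    return result;
}
```

Passing the nm output above, demangle("__ZN4zoom4boomERKSs"), should come back as something like zoom::boom(std::string const&). On the command line, nm -C or c++filt does the same job.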

In both C and C++ local (automatic) variables produce no symbols, but live in registers or on stack.

Edit:

Local variables have no names in the resulting object file for the simple reason that the linker does not need to know about them. So no name, no mangling. Everything else (that the linker has to look at) is name-mangled in C++.
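
As a rough illustration of that split, here is a small sketch; the names shop, counter, total and with_tax are invented for the example. The symbol names in the comments are what a compiler following the Itanium C++ ABI (GCC, Clang) would be expected to emit; other compilers use other schemes, and OS X adds one more leading underscore.

```cpp
namespace shop
{
    // Namespace-scope variable with external linkage: the linker sees it,
    // so it gets a mangled name (_ZN4shop7counterE under the Itanium ABI).
    int counter = 0;

    // Namespace-scope function: mangled with its scope and parameter type
    // (_ZN4shop5totalEi under the Itanium ABI).
    int total(int price)
    {
        // Automatic variable: lives in a register or on the stack and
        // produces no linker symbol at all, so there is nothing to mangle.
        int with_tax = price + price / 10;
        counter += 1;
        return with_tax;
    }
}
```

Compiling this file with -c and running nm on the object file should list entries for counter and total but none for with_tax.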

Nikolai N Fetissov
This is even more fun if you have nested template instantiations :-)
James McNellis
Yeh, I usually call it *pain* though ... :)
Nikolai N Fetissov
Thanks Nikolai! As you said "In both C and C++ local (automatic) variables produce no symbols, but live in registers or on stack", are the names of local variables mangled or not in C++? If not, what kinds of variables' names are mangled?
Tim
+3  A: 

Mangling is simply how the compiler keeps the linker happy.

In C, you can't have two externally visible functions with the same name, no matter what. So that's what the linker was written to assume: unique names. (You can have static functions with the same name in different compilation units, because their names aren't of interest to the linker.)

In C++, you can have two functions with the same name as long as they have different parameter types. So C++ combines the function name with the types in some way. That way the linker sees them as having different names.
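
To make "combines the function name with the types" concrete, here is a toy encoder for a small subset of one real scheme, the Itanium C++ ABI used by GCC and Clang. It is a sketch only: toy_mangle is a made-up helper, the parameter types must be passed in already encoded (for example "i" for int, "d" for double), and substitutions, templates and the special handling of namespace std are all ignored.

```cpp
#include <string>
#include <vector>

// Toy encoder, for illustration only (not a real mangler).
//   scopes : enclosing namespace/class names, outermost first
//   name   : unqualified function name
//   params : already-encoded parameter types, e.g. "i" (int), "d" (double)
std::string toy_mangle(const std::vector<std::string>& scopes,
                       const std::string& name,
                       const std::string& params)
{
    // Each identifier is written as <length><identifier>, e.g. "4zoom".
    auto piece = [](const std::string& id) {
        return std::to_string(id.size()) + id;
    };

    std::string out = "_Z";               // prefix of every mangled name
    if (scopes.empty()) {
        out += piece(name);
    } else {
        out += "N";                       // start of a nested (scoped) name
        for (const auto& s : scopes)
            out += piece(s);
        out += piece(name);
        out += "E";                       // end of the nested name
    }
    return out + params;                  // parameter types come last
}
```

With this, toy_mangle({}, "foo", "i") gives _Z3fooi and toy_mangle({}, "foo", "d") gives _Z3food, so the two overloads of foo no longer collide at link time.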

Note that it doesn't matter how the name gets mangled, and in fact the scheme varies between compilers and platforms. All that matters is that every function with the same base name is somehow made unique for the linker.

You can see now that adding namespaces and templates to the mix keeps extending the principle.

egrunin