Apparently (at least according to gcc -std=c99) C99 doesn't support function overloading. The reason for not supporting some new feature in C is usually backward compatibility, but in this case I can't think of a single case in which function overloading would break backward compatibility. What is the reasoning behind not including this basic feature?

+17  A: 

When you compile a C source file, symbol names remain intact. If you introduced function overloading, you would have to provide a name mangling technique to prevent name clashes. Consequently, like C++, you would end up with machine-generated symbol names in the compiled binary.

Also, C does not have strict typing: many things are implicitly convertible to one another. Adding the complexity of overload resolution rules to such a weakly typed language would invite confusion.

Mehrdad Afshari
It seems to me that the response to this is simple: provide a standard name mangling technique to keep the type information. Something like `foo(bar) -> foo-bar` and `foo(baz) -> foo-baz` is not hard to implement.
Imagist
I mentioned three points actually: 1. weak typing 2. increased complexity 3. name clashes. You could build a standard name mangling technique, but how would you make it compatible with libraries already compiled under C89? And if you built a library in C99, how could you use it from C89?
Mehrdad Afshari
+10  A: 

To understand why you aren't likely to see overloading in C, it may help to learn how C++ handles it.

After compiling code, but before it is ready to run, the intermediate object code must be linked. This step transforms a rough database of compiled functions and other objects into a ready-to-run binary file. This extra step is important because it is the principal mechanism of modularity available to compiled programs: it allows you to take code from existing libraries and mix it with your own application logic.

At this stage, the object code may have been written in any language, with any combination of features. To make this possible, it's necessary to have some sort of convention so that the linker is able to pick the right object when another object refers to it. If you're coding in assembly language, when you define a label, that label is used exactly, because it is assumed you know what you're doing.

In C, functions become the symbol names for the linker, so when you write

int main(int argc, char **argv) { return 1; }

the compiler provides an archive of object code, which contains an object called main.

This works well, but it means that you cannot have two objects with the same name, because the linker would be unable to decide which name it should use. The linker doesn't know anything about argument types, and very little about code in general.

C++ resolves this by encoding additional information into the symbol name itself, a process called name mangling. The number and types of a function's arguments are encoded into its symbol name, and the function is referred to that way at the point of call. The linker doesn't need to know this is happening, since as far as it can tell, every function call is unambiguous.

The downside of this is that the symbol names no longer look anything like the original function names. In particular, it's nearly impossible to predict what the symbol name of an overloaded function will be so that you can link to it. To link to foreign code, you can use extern "C", which causes those functions to follow the C style of symbol names, but of course you cannot overload such a function.

These differences are related to the design goals of each language. C is oriented toward portability and interoperability. C goes out of its way to do predictable and compatible things. C++ is more strongly oriented toward building rich and powerful systems, and not terribly focused on interacting with other languages.

I think it is unlikely for C to ever pursue any feature that would produce code that is as difficult to interact with as C++.

Edit: Imagist asks:

Would it really be less portable or more difficult to interact with a function if you resolved int main(int argc, char** argv) to something like main-int-int-char** instead of to main (and this were part of the standard)? I don't see a problem here. In fact, it seems to me that this gives you more information (which could be used for optimization and the like)

To answer this, I will turn again to C++ and the way it deals with overloads. C++ uses almost exactly this mechanism, with one caveat: C++ does not standardize how certain parts of the language should be implemented, and it then suggests how implementers should deal with some of the consequences of that omission. In particular, C++ has a rich type system that includes virtual class members. How that feature is implemented is left to compiler writers, and the details of vtable resolution have a strong effect on function signatures. For this reason, C++ deliberately suggests that compiler writers make their name mangling mutually incompatible across compilers, and across versions of the same compiler that implement these key features differently.

This is just a symptom of the deeper issue that while higher-level languages like C and C++ have detailed type systems, the machine code beneath them is totally typeless. Arbitrarily rich type systems are built on top of the untyped binary provided at the machine level. Linkers do not have access to the rich type information available to the higher-level languages; the linker is completely dependent on the compiler to handle all of the type abstractions and produce properly type-free object code.

C++ does this by encoding all of the necessary type information in the mangled object names. C, however, has a significantly different focus, aiming to be a sort of portable assembly language. C therefore prefers a strict one-to-one correspondence between the declared name and the resulting object's symbol name. If C mangled its names, even in a standardized and predictable way, you would have to go to great efforts to match the altered names to the desired symbol names, or else you would have to turn mangling off, as you do in C++. This extra effort would bring almost no benefit, because unlike C++'s, C's type system is fairly small and simple.

At the same time, it is practically standard practice to define several similarly named C functions that vary only in the types of their arguments. For a lengthy example of exactly this, have a look at the OpenGL API.

TokenMacGuy
Would it really be less portable or more difficult to interact with a function if you resolved `int main(int argc, char** argv)` to something like `main-int-int-char**` instead of to `main` (and this were part of the standard)? I don't see a problem here. In fact, it seems to me that this gives you *more* information (which could be used for optimization and the like).
Imagist
@Imagist - It would be useful to a language. It would also instantly render all older C libraries unusable by newer C code, and all newer C libraries unusable by older C code.
Chris Lutz
@Chris Lutz True, but as a language designer you have to be willing to break backward compatibility occasionally. With a language like C it shouldn't happen often, but is once every ten years so much to ask? In many cases we are stuck with design decisions that may have been good, but aren't good now, because of your mentality.
Imagist
In the case of C, breaking code every 10 years is very, very frequent. You could be certain that adoption would be very small.
TokenMacGuy
+5  A: 

Lots of language designers, including me, think that the combination of function overloading with C's implicit promotions can result in code that is heinously difficult to understand. For evidence, look at the body of knowledge accumulated about C++.

In general, C99 was intended to be a modest revision largely compatible with existing practice. Overloading would have been a pretty big departure.

Norman Ramsey
Lots of language designers, including me, think that direct access to memory through pointers can result in code that is heinously difficult to understand. The solution is obviously not to exclude pointers from the language (at least not in the case of C).
Imagist
@Imagist. I agree with you. I would rather write my operating systems in Modula-3, which provides better mechanisms for controlling and understanding pointer code. But pointer programs do have to be written, and after reading Dennis Ritchie's article in HOPL-II, I believe that C occupies that niche so well that it will never be replaced. But people should stop writing applications in it! (And thanks to Java, many have done just that.)
Norman Ramsey
I think C implicit conversions make the code hard enough to understand without overloading! +1
TokenMacGuy