This is a somewhat bizarre question. My objectives are to understand the language design decision and to identify the possibilities of reflection in C++.

  1. Why did the C++ language committee not implement reflection in the language? Is reflection too difficult in a language that does not run on a virtual machine (like Java)?

  2. If one were to implement reflection for C++, what would the challenges be?

I guess the uses of reflection are well-known: editors can be more easily written, program code will be smaller, mocks can be generated for unit tests and so on. But it would be great if you could comment on uses of reflection too.

+19  A: 

Reflection requires some metadata about types to be stored somewhere that can be queried. Since C++ compiles to native machine code and the code is heavily transformed by optimization, the high-level view of the application is largely lost during compilation; consequently, it can't be queried at run time. Java and .NET use a very high-level representation in the binary code for virtual machines, which makes this level of reflection possible. In some C++ implementations, however, there is something called Run Time Type Information (RTTI), which can be considered a stripped-down version of reflection.
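
A minimal sketch of what RTTI does and does not provide. The `Shape`, `Circle` and `Square` classes and the two helper functions are made up for this illustration; only `typeid` and `dynamic_cast` are standard.

```cpp
#include <typeinfo>

// Hypothetical class hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() = default;  // a virtual member makes the type polymorphic, which RTTI relies on
};
struct Circle : Shape {
    double radius = 1.0;
};
struct Square : Shape {
    double side = 2.0;
};

// RTTI can answer "is this object really a Circle?" ...
inline bool is_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

// ... and dynamic_cast can recover the derived type when the answer is yes.
inline double radius_or_zero(const Shape& s) {
    if (const Circle* c = dynamic_cast<const Circle*>(&s)) {
        return c->radius;
    }
    return 0.0;
}

// What RTTI cannot do: enumerate Circle's members, or look up "radius" by name.
// typeid(s).name() gives an implementation-defined string and nothing more.
```

So RTTI tells you which type an object has, but nothing about that type's structure.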

Mehrdad Afshari
RTTI is in the C++ standard.
Daniel Earwicker
But not all C++ implementations are standard. I've seen implementations that don't support RTTI.
Mehrdad Afshari
And most implementations that do support RTTI also support turning it off via compiler options.
Michael Kohne
+6  A: 

Reflection can be and has been implemented in C++ before.

It is not a native C++ feature because it has a heavy cost (memory and speed) that shouldn't be imposed by default by the language; the language is oriented toward "maximum performance by default".

As you shouldn't pay for what you don't need, and as you say yourself that it's needed more in editors than in other applications, it should be implemented only where you need it, and not "forced" on all the code (you don't need reflection on all the data you'll work with in an editor or other similar application).

Klaim
and you don't ship symbols because it would allow your customers/competitors to look at your code... this is often considered a bad thing.
gbjbaanb
You're right, I didn't even think about the code exposure problem :)
Klaim
+3  A: 

The reason C++ doesn't have reflection is that this would require the compilers to add symbol information to the object files, like what members a class type has, information about the members, about the functions and everything. This essentially would render include files useless, as information shipped by declarations would then be read from those object files (modules then). In C++, a type definition can occur multiple times in a program by including the respective headers (provided that all those definitions are the same), so it would have to be decided where to put the information about that type, to name just one complication. The aggressive optimization done by a C++ compiler, which can optimize out dozens of class template instantiations, is another strong point. It's possible, but as C++ is compatible with C, this would become an awkward combination.

Johannes Schaub - litb
I don't understand how the compiler's aggressive optimization is a strong point. Can you elaborate? If the linker can remove duplicate inline-function definitions, what's the problem with duplicate reflection information? Isn't symbol information added to the object files anyway, for debuggers?
Rob Kennedy
The problem is that your reflection information may be invalid. If the compiler eliminates 80% of your class definitions, what is your reflection metadata going to say? In C# and Java, the language guarantees that if you define a class, it stays defined. C++ lets the compiler optimize it away.
jalf
@Rob, the optimizations are another point, not tied to the multiple classes complication. See @jalf's comment (and his answer) for what I meant.
Johannes Schaub - litb
If I instantiate reflect<T>, then don't throw away any of T's information. This doesn't seem like an unsolvable problem.
Joseph Garvin
+7  A: 

If you really want to understand the design decisions surrounding C++, find a copy of The Annotated C++ Reference Manual by Ellis and Stroustrup. It's NOT up to date with the latest standard, but it goes through the original standard and explains how things work and, often, how they got that way.

Michael Kohne
Also Design and Evolution of C++ by Stroustrup
James Hopkin
+167  A: 

There are several problems with reflection in C++.

  • It's a lot of work to add, and the C++ committee is fairly conservative and doesn't spend time on radical new features unless it's sure they'll pay off. (A suggestion for adding a module system similar to .NET assemblies has been made, and while I think there's general consensus that it'd be nice to have, it's not their top priority at the moment, and has been pushed back until well after C++0x. The motivation for this feature is to get rid of the #include system, but it would also enable at least some metadata).
  • You don't pay for what you don't use. That's one of the most basic design philosophies underlying C++. Why should my code carry around metadata if I may never need it? Moreover, the addition of metadata may inhibit the compiler from optimizing. Why should I pay that cost in my code if I may never need that metadata?
  • Which leads us to another big point: C++ makes very few guarantees about the compiled code. The compiler is allowed to do pretty much anything it likes, as long as the resulting functionality is what is expected. For example, your classes aren't required to actually be there. The compiler can optimize them away, inline everything they do, and it frequently does just that, because even simple template code tends to create quite a few template instantiations. The C++ standard library relies on this aggressive optimization. Functors are only performant if the overhead of instantiating and destructing the object can be optimized away. operator[] on a vector is only comparable to raw array indexing in performance because the entire operator can be inlined and thus removed entirely from the compiled code. C# and Java make a lot of guarantees about the output of the compiler. If I define a class in C#, then that class will exist in the resulting assembly. Even if I never use it. Even if all calls to its member functions could be inlined. The class has to be there, so that reflection can find it. Part of this is alleviated by C# compiling to bytecode, which means that the JIT compiler can remove class definitions and inline functions if it likes, even if the initial C# compiler can't. In C++, you only have one compiler, and it has to output efficient code. If you were allowed to inspect the metadata of a C++ executable, you'd expect to see every class it defined, which means that the compiler would have to preserve all the defined classes, even if they're not necessary.
  • And then there are templates. Templates in C++ are nothing like generics in other languages. Every template instantiation creates a new type. std::vector<int> is a completely separate class from std::vector<float>. That adds up to a lot of different types in an entire program. What should our reflection see? The template std::vector? But how can it, since that's a source-code construct, which has no meaning at runtime? It'd have to see the separate classes std::vector<int> and std::vector<float>. And std::vector<int>::iterator and std::vector<float>::iterator, same for const_iterator and so on. And once you step into template metaprogramming, you quickly end up instantiating hundreds of templates, all of which get inlined and removed again by the compiler. They have no meaning, except as part of a compile-time metaprogram. Should all these hundreds of classes be visible to reflection? They'd have to, because otherwise our reflection would be useless, if it doesn't even guarantee that the classes I defined will actually be there. And a side problem is that the template class doesn't exist until it is instantiated. Imagine a program which uses std::vector<int>. Should our reflection system be able to see std::vector<int>::iterator? On one hand, you'd certainly expect so. It's an important class, and it's defined in terms of std::vector<int>, which does exist in the metadata. On the other hand, if the program never actually uses this iterator class, its type will never have been instantiated, and so the compiler won't have generated the class in the first place. And it's too late to create it at runtime, since it requires access to the source code.
  • And finally, reflection isn't quite as vital in C++ as it is in C#. The reason is again, template metaprogramming. It can't solve everything, but for many cases where you'd otherwise resort to reflection, it's possible to write a metaprogram which does the same thing at compile-time. boost::type_traits is a simple example. You want to know about type T? Check its type_traits. In C#, you'd have to fish around after its type using reflection. Reflection would still be useful for some things (the main use I can see, which metaprogramming can't easily replace, is for autogenerated serialization code), but it would carry some significant costs for C++, and it's just not necessary as often as it is in other languages.
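
To make the template and type-traits points in the list above concrete, here is a small sketch. It uses the standard `<type_traits>` header (the C++11 counterpart of boost::type_traits) rather than Boost itself, and `Kind` and `kind_of` are names invented for this illustration, not part of any library.

```cpp
#include <type_traits>
#include <vector>

// Each instantiation of a template is its own, unrelated type.
static_assert(!std::is_same<std::vector<int>, std::vector<float>>::value,
              "vector<int> and vector<float> are distinct classes");
static_assert(!std::is_same<std::vector<int>::iterator,
                            std::vector<float>::iterator>::value,
              "and so are their iterator types");

// Hypothetical classification used only in this sketch.
enum class Kind { Pointer, Integer, FloatingPoint, Class, Other };

// Answers "what sort of type is T?" entirely at compile time,
// where C# or Java code would inspect the type at run time.
template <typename T>
constexpr Kind kind_of() {
    return std::is_pointer<T>::value        ? Kind::Pointer
         : std::is_integral<T>::value       ? Kind::Integer
         : std::is_floating_point<T>::value ? Kind::FloatingPoint
         : std::is_class<T>::value          ? Kind::Class
                                            : Kind::Other;
}

// The answers are constants; no metadata ends up in the executable.
static_assert(kind_of<int>() == Kind::Integer, "int is an integer type");
static_assert(kind_of<std::vector<int>>() == Kind::Class, "vector<int> is a class");
```

Note what this cannot do: nothing here can list the members of a class, which is why serialization remains the hard case.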

Edit: In response to comments:

cdleary: Yes, debug symbols do something similar, in that they store metadata about the types used in the executable. But they also suffer from the problems I described. If you've ever tried debugging a release build, you'll know what I mean. There are large logical gaps where you created a class in the source code, which has gotten inlined away in the final code. If you were to use reflection for anything useful, you'd need it to be more reliable and consistent. As it is, types would be appearing and vanishing almost every time you compile. You change a tiny little detail, and the compiler decides to change which types get inlined and which ones don't, as a response. How do you extract anything useful from that, when you're not even guaranteed that the most relevant types will be represented in your metadata? The type you were looking for may have been there in the last build, but now it's gone. And tomorrow, someone will check in a small innocent change to a small innocent function, which makes the type just big enough that it won't get completely inlined, so it'll be back again. That's still useful for debug symbols, but not much more than that. I'd hate trying to generate serialization code for a class under those terms.

Evan Teran: Of course these issues could be resolved. But that falls back to my point #1. It'd take a lot of work, and the C++ committee has plenty of things they feel are more important. Is the benefit of getting some limited reflection (and it would be limited) in C++ really big enough to justify focusing on that at the expense of other features? Is there really a huge benefit in adding features to the core language which can already (mostly) be done through libraries and preprocessors like Qt's? Perhaps, but the need is a lot less urgent than if such libraries didn't exist. For your specific suggestions though, I believe disallowing it on templates would make it completely useless. You'd be unable to use reflection on the standard library, for example. What kind of reflection wouldn't let you see a std::vector? Templates are a huge part of C++. A feature that doesn't work on templates is basically useless.

But you're right, some form of reflection could be implemented. But it'd be a major change in the language. As it is now, types are exclusively a compile-time construct. They exist for the benefit of the compiler, and nothing else. Once the code has been compiled, there are no classes. If you stretch yourself, you could argue that functions still exist, but really, all there is is a bunch of jump assembler instructions, and a lot of stack push/pop's. There's not much to go on, when adding such metadata.

But like I said, there is a proposal for changes to the compilation model, adding self-contained modules, storing metadata for select types, allowing other modules to reference them without having to mess with #includes. That's a good start, and to be honest, I'm surprised the standard committee didn't just throw the proposal out for being too big a change. So perhaps in 5-10 years? :)

jalf
I think this is a great answer. +1 indeed
Johannes Schaub - litb
Don't most of these issues already have to be solved by debug symbols? Not that it would be performant (because of the inlining and optimization you mentioned), but you could allow for the *possibility* of reflection by doing whatever debug symbols do.
cdleary
Fantastic answer, though I believe all the technical issues could be resolved. You could make it a keyword which enables reflection on a class. You could get compile-time errors if reflection is used on a non-keyworded class. Also, you could disallow it being combined with templates.
Evan Teran
Finally, there could be a rule that classes which enable reflection must not be optimized away to nothing. So it would add an expense, *if* you asked for it.
Evan Teran
These rules I mentioned pretty much cover what Qt's MOC system does.
Evan Teran
Added responses to your comments in the main post. :) But in short, yeah, debug symbols do a bit of the same, on a small scale, and of course the technical issues *could* be resolved.
jalf
"As it is now, types are exclusively a compile-time construct" - Generally this answer is good for getting across the intended spirit of C++ and the reason why RTTI is a last resort, but it IS there in the standard, and it already addresses some of your issues - e.g. interaction with templates.
Daniel Earwicker
+1 This is a fantastic answer - nicely done!
Andrew Hare
+1 Amazing answer!
Eduardo León
+1 - but like earwicker commented, polymorphic types do end up having RTTI - and a limited form of reflection is possible using the native typeid operator - the answer would be even better if it touched upon that :)
Faisal Vali
Perhaps. I don't really consider RTTI reflection though. Reflection gives you information about a type. The best RTTI can do is tell you what type a value has. It doesn't say anything *about* the type. Apart from that, I don't want to edit my post, because it might get community-wikied. ;)
jalf
Another thing about your first point: as far as I know nobody's tried adding reflection to a C++ implementation. There's no good experience with it. The Committee is probably going to be reluctant to take the lead, particularly after `export` and `vector<bool>`.
David Thornley
Another point I'd like to add to your answer (great answer btw) is that even if one compiler decides to optimize in a certain way, there is no guarantee that another compiler would do it in exactly the same way, so you can't write portable code (portable meaning compilable across compilers on different platforms, that is).
Murali VP
C++ should NOT have reflection. That is the wrong paradigm for this language. Like adding a baby nursery to a US Navy Destroyer. Baby nurseries are great, but they have no place on a DD. None.
Mordachai
Oh I just love the _You don't pay for what you don't use._ rule! Thanks for the thoughtful answer Jalf!
legends2k
+1  A: 

Some good links on reflection in C++ I just found:

Working Paper of C++ Standard: Aspects of Reflection in C++

A simple example of reflection using templates

Amit Kumar
+2  A: 

Reflection for languages that have it is about how much of the source code the compiler is willing to leave in your object code to enable reflection, and how much analysis machinery is available to interpret that reflected information. Unless the compiler keeps all the source code around, reflection will be limited in its ability to analyze the available facts about the source code.

The C++ compiler doesn't keep anything around (well, ignoring RTTI), so you don't get reflection in the language. (Java and C# compilers only keep class, method names and return types around, so you get a little bit of reflection data, but you can't inspect expressions or program structure, and that means even in those "reflection-enabled" languages the information you can get is pretty sparse and consequently you really can't do much analysis).

But you can step outside the language and get full reflection capabilities. The answer to another Stack Overflow discussion on reflection in C discusses this.

Ira Baxter
A: 

Reflection in C++, I believe, is crucially important if C++ is to be used as a language for database access, web session handling/HTTP and GUI development. The lack of reflection prevents ORMs (like Hibernate or LINQ), XML and JSON parsers that instantiate classes, data serialization and many other things (where initially typeless data has to be used to create an instance of a class).

A compile-time switch available to a software developer during the build process can be used to eliminate this 'you pay for what you use' concern.

If a firmware developer does not need reflection to read data from a serial port -- then fine, do not use the switch. But as a database developer who wants to keep using C++, I am constantly faced with horrible, difficult-to-maintain code that maps data between data members and database constructs.

Neither Boost serialization nor any other mechanism really solves reflection -- it must be done by the compiler -- and once it is done, C++ will again be taught in schools and used in software that deals with data processing.

To me this is issue #1 (and native threading primitives are issue #2).

Personally, I don't see why reflection *should* be in C++. I can see a stronger case for threading, but not the mass of overhead caused by reflection.
Paul Nathan
Who said C++ *is* to be used as a language for DB access, web session handling or GUI dev? There are plenty of far better languages to use for that kind of stuff. And a compile-time switch won't solve the problem. Usually, the decision to enable or disable reflection will not be on a per-file basis. It could work if it could be enabled on individual types, if the programmer could specify with an attribute or similar, when defining a type, whether or not reflection metadata for it should be generated. But a global switch? You'd be crippling 90% of the language just to make 10% of it simpler.
jalf
A: 

Ira, thank you for the discussion. I have actually followed your work on source code analysis and the semantic meaning of source code since 2000. I thought even at that time that your tools were invaluable for proving program correctness (which is what I wanted to do back then).

If C++ had:

a) for class data members: variable name, variable type, const modifier
b) the same as a) for function arguments (only position instead of name)
c) for class member functions: name, return type, const modifier
d) the list of parent classes (in the same order as defined)
e) for templated data members and parent classes, the expanded template (meaning the actual type would be available to the reflection API and not the 'templated information of how to get there')

That would be enough to create very easy-to-use libraries that sit at the 'crust' of the typeless data processing that is so prevalent in today's web and database applications (all the ORMs, messaging mechanisms, XML/JSON parsers, data serialization, etc.)

For example, the basic information supported by the Q_PROPERTY macro (part of the Qt framework) http://qt.nokia.com/doc/4.5/properties.html expanded to cover class methods and e) - would be extraordinarily beneficial to C++ and to the software community in general.

Certainly the reflection I am asking about would not cover the semantic meaning or more complex issues (like comments, source code line numbers, data flow analysis, etc.) - but neither do I think those need to be part of a language standard.
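
As a rough sketch of the kind of library this would enable, here is the metadata from point a) written out by hand, with a generic serializer on top of it. Everything here is hypothetical (`Person`, the `Members` trait, `to_text`); a reflective compiler would generate the `Members<Person>` part instead of the programmer maintaining it. It assumes a C++17 compiler for `std::apply` and fold expressions.

```cpp
#include <sstream>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical record type standing in for a database row.
struct Person {
    std::string name;
    int age;
};

// Hypothetical opt-in trait: specialized only for types that want metadata.
template <typename T>
struct Members;

// Hand-written metadata: one (name, pointer-to-member) pair per data member.
template <>
struct Members<Person> {
    static auto list() {
        return std::make_tuple(std::make_pair("name", &Person::name),
                               std::make_pair("age", &Person::age));
    }
};

// Generic serializer: walks the metadata and writes "name=value" pairs
// separated by ';'. It knows nothing about Person itself.
template <typename T>
std::string to_text(const T& obj) {
    std::ostringstream out;
    std::apply(
        [&](const auto&... member) {
            bool first = true;
            // The fold over the comma operator visits members left to right.
            ((out << (first ? "" : ";") << member.first << '='
                  << obj.*(member.second),
              first = false),
             ...);
        },
        Members<T>::list());
    return out.str();
}
```

With this in place, `to_text(Person{"Ada", 36})` produces `name=Ada;age=36`. The maintenance burden is exactly the problem described: every new data member must also be added to `Members<Person>` by hand.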

Vlad

A: 

According to Alistair Cockburn, subtyping can't be guaranteed in a reflective environment.

Reflection is more relevant to latent typing systems. In C++, you know what type you've got and you know what you can do with it.

Nilone
A: 

Not every language should try to incorporate every feature of every other language.

C++ is essentially a very, very sophisticated macro assembler. It is NOT a high-level language like C#, Java, Objective-C, Smalltalk, etc.

It is good to have different tools for different jobs. If we only have hammers, all things are going to look like nails, etc. Having script languages is useful for some jobs, and reflective OO-languages (Java, Obj-C, C#) are useful for another class of jobs, and super-efficient bare-bones close-to-the-machine languages are useful for yet another class of jobs (C++, C, Assembler).

C++ does an amazing job of extending Assembler technology to incredible levels of complexity management, and abstractions to make programming larger, more complex tasks vastly more possible for human beings. But it is not a language that should be used by those who are approaching their problem from a strictly high-level perspective (Lisp, Smalltalk, Java, C#). If you need a language with those features to best implement a solution to your problems, then thank those who've created such languages for all of us to use!

But C++ is for those who, for whatever reason(s), need to have a strong correlation between their code and the underlying machine's operation. Whether it's efficiency, or programming device drivers, or interaction with the lower-level OS services, or whatever, C++ is better suited to those tasks.

C#, Java, Objective-C all require a runtime system to support their execution. That runtime has to be delivered to the system in question - preinstalled to support the operation of your software. And that layer has to be maintained for various target systems, customized by SOME OTHER LANGUAGE to make it work on that platform. And that middle layer - that adaptive layer between the host OS and your code - the runtime, is almost always written in a language like C++ where efficiency is #1, and where the exact interaction between software and hardware can be predictably understood and manipulated to maximum gain.

I love Smalltalk, Objective-C, and having a runtime system with reflection / meta-data. Amazing code can be written! But that's simply a higher layer on the stack, a layer that must rest on lower layers, that themselves must ultimately sit upon the OS and the hardware. And we will always need a language that is best suited for building that layer: C++/C/Assembler.

Mordachai