I was looking through the plans for C++0x and came upon std::initializer_list for implementing initializer lists in user classes. This class could not be implemented in C++ without using itself, or else using some "compiler magic". If it could, it wouldn't be needed since whatever technique you used to implement initializer_list could be used to implement initializer lists in your own class.

What other classes require some form of "compiler magic" to work? Which classes are in the Standard Library that could not be implemented by a third-party library?

Edit: Maybe instead of implemented, I should say instantiated. It's more the fact that this class is so directly linked with a language feature (you can't use initializer lists without initializer_list).

A comparison with C# might clear up what I'm wondering about: IEnumerable and IDisposable are actually hard-coded into language features. I had always assumed C++ was free of this, since Stroustrup tried to make everything implementable in libraries. So, are there any other classes / types that are inextricably bound to a language feature?

+5  A: 

The only other one I could think of was the type_info class returned by typeid. As far as I can tell, VC++ implements this by instantiating all the needed type_info classes statically at compile time, and then simply casting a pointer at runtime based on values in the vtable. These are things that could be done using C code, but not in a standard-conforming or portable way.
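To make the typeid/type_info relationship concrete, here is a minimal sketch. The names Base, Derived, dynamic_type_is_derived and typeid_demo are made up for illustration. User code can only obtain a std::type_info by reference from the typeid operator; it cannot construct or copy one itself.

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() {} };   // polymorphic, so typeid inspects the dynamic type
struct Derived : Base {};

// Returns true when the object behind the reference is really a Derived.
bool dynamic_type_is_derived(const Base& b) {
    // typeid is an operator built into the language; it hands back a reference
    // to a std::type_info object that the implementation created for us.
    const std::type_info& info = typeid(b);
    return info == typeid(Derived);
}

// Hypothetical helper exercising the function above.
void typeid_demo() {
    Derived d;
    Base b;
    assert(dynamic_type_is_derived(d));
    assert(!dynamic_type_is_derived(b));
}
```

How the implementation creates and locates those type_info objects (the vtable lookup described above for VC++, for example) is not visible from portable code.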

Eclipse
But isn't it typeid which is magic rather than type_info?
fizzer
+1  A: 

All classes in the standard library, by definition, must be implemented in C++. Some of them hide some obscure language/compiler constructs, but still are just wrappers around that complexity, not language features.

dguaraglia
How would you implement either of those without some compiler help?
Eclipse
If the code relies on compiler constructs that are not part of the standard, it is not C++.
Jonathan
Think std::string as an example. At face value, it looks magic, but you could actually implement it yourself without too much headache.
Daniel Spiewak
That's why initializer_list and type_info are so weird. As far as I can tell, everything else can be implemented by any user.
Eclipse
Untrue, and unsourced. Type_traits in fact will require compiler help. And std::printf()/std::cout somehow need to access the OS, which often is not C++ code either.
MSalters
Even `offsetof` can't be implemented using only C++. It must use compiler builtins on strict compilers.
Johannes Schaub - litb
printf/cout don't need access to the OS. The standard says pretty much nothing about what those streams must be connected to. I'm pretty sure the meaning of "standard output" is implementation-defined. So you could write a compliant implementation of printf which just threw away its output.
jalf
A: 

I think you're pretty safe on this score. C++ mostly serves as a thick layer of abstraction around C. Since C++ is also very nearly a superset of C, the core language primitives are almost always implemented sans-classes (in a C-style). In other words, you're not going to find many situations like Java's Object, a class whose special meaning is hard-coded into the compiler.

Daniel Spiewak
I'm wondering which situations those are; obviously type_info and initializer_list do have special meaning hard-coded into the compiler.
Eclipse
A: 

Again from C++0x, I think that threads would not be implementable as a portable library in the hypothetical language "C++0x, with all the standard libraries except threads".

[Edit: just to clarify, there seems to be some disagreement as to what it would mean to "implement threads". What I understand it to mean in the context of this question is:

1) Implement the C++0x threading specification (whatever that turns out to be). Note C++0x, which is what I and the questioner are both talking about. Not any other threading specification, such as POSIX.

2) without "compiler magic". This means not adding anything to the compiler to help your implementation work, and not relying on any non-standard implementation details (such as a particular stack layout, or a means of switching stacks, or non-portable system calls to set a timed interrupt) to produce a thread library that works only on a particular C++ implementation. In other words: pure, portable C++. You can use signals and setjmp/longjmp, since they are portable, but my impression is that's not enough.

3) Assume a C++0x compiler, except that it's missing all parts of the C++0x threading specification. If all it's missing is some data structure (that stores an exit value and a synchronisation primitive used by join() or equivalent), but the compiler magic to implement threads is present, then obviously that data structure could be added as a third-party portable component. But that's kind of a dull answer, when the question was about which C++0x standard library classes require compiler magic to support them. IMO.]
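As a rough illustration of the portable primitive mentioned in point 2, the sketch below uses setjmp/longjmp to jump out of a function and back to an earlier point. The names resume_point, steps_run, worker and run_once are invented for this example. Note that the jump only returns to a frame that is still live on the same stack; nothing here gives a second "thread" a stack of its own, which is presumably part of why these calls alone look insufficient.

```cpp
#include <cassert>
#include <csetjmp>

static std::jmp_buf resume_point;  // saved execution context
static int steps_run = 0;          // static storage, so its value survives the longjmp

// Does one unit of "work", then jumps back to wherever setjmp was called.
void worker() {
    ++steps_run;
    std::longjmp(resume_point, 1); // never returns normally
}

// Runs worker() once and returns how many steps were executed.
int run_once() {
    steps_run = 0;
    if (setjmp(resume_point) == 0) {
        // First pass: the context has just been saved, so start the worker.
        worker();
    }
    // Second pass: we arrive here via longjmp with a non-zero return value.
    return steps_run;
}

// Hypothetical helper exercising the functions above.
void jump_demo() {
    assert(run_once() == 1);
}
```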

Steve Jessop
You can implement threads using C++; it's not even that hard. Of course it's nicer to let the OS do the work, but just about any language can implement some form of threads without OS support, outside of calls to processor-level interrupts and timers.
tloach
Maybe I'm just being dim, but I don't see how it can be done without support from the runtime, or calls to POSIX or other OS functions. Obviously you can implement a threaded OS in C++, complete with its own C++ compiler, but that's not what the questioner means.
Steve Jessop
You can implement threads in C/C++ without using operating system threads; you end up putting wrappers around the system calls and multiplexing the higher level threads on a single OS thread. See early Unix threading libraries before threads were in the underlying OS (e.g. FreeBSD 4).
janm
"putting wrappers around the system calls" sounds to me like something non-portable. Or it could be that we aren't understanding the same thing by the question: did FreeBSD really implement threads without jigging the compiler at all?
Steve Jessop
E.g., how do you know a hypothetical "C++0x-without-threads" compiler won't reorder instructions in a way that would, in "C++0x-with-threads", invalidly create a data race? Calls with memory barriers (or permit deschedule if no pre-emption) need compiler magic to prevent re-ordering across them, no?
Steve Jessop
No, reordering and races are not a problem. There is only one real thread (ie: the old fashioned process). The additional code is just there to make it look like threads to the application developer.
janm
On portability: Yes, it is non-portable, but not because of the wrapping. And now that I recall the implementation, there was some non-C++ code that manipulated the stack pointer when a "context switch" occurred.
janm
My basic point is that you can implement something that looks, to a programmer, like threading without OS support. All the work can be done in the user process; of course this includes the possibility of changing the runtime.
janm
If you make system calls, change the runtime, and rely on implementation details like stack pointer, you aren't doing what I think the questioner asked, which is to implement X using only "C++ without X". If I ask "how do I do X in C++", I don't expect to hear "oh, you rewrite the compiler"...
Steve Jessop
So while what you say is true (it's possible to implement threads for C++ programmers, on an OS that doesn't support threads, by doing X, Y and Z to the C++ implementation), I don't think that's the same as "implementing the thread classes in a third-party library".
Steve Jessop
By the way, I have in the past worked on a system which implemented a multi-tasking environment on top of a single host process/thread. It had its own kernel, scheduler, etc. We couldn't have written the whole thing in pure C/C++ even if we'd wanted to, and we needed our own compiler for it.
Steve Jessop
... admittedly for reasons other than just to support threading - for instance it implemented processes as well as threads, so obviously needed complete control of the C runtime in order to free resources at process exit. Native-compiled code simply couldn't fully inter-operate.
Steve Jessop
The compiler and the runtime are not the same thing. The questioner didn't mention threads, and threads are not part of the current C++ standard. What thread classes do you mean? If you push it you could come close to replacing the stack manipulations with calls to setjmp and longjmp.
janm
There is no need to "rewrite the compiler" to implement threads. The first commenter (tloach) said that it is easy because it has been done many times before. There are existence proofs. If you ended up rewriting the compiler to support threads, you probably got it wrong.
janm
... Or you were working in a particular environment. If you have to write a "kernel" I have to ask what that means and in what context that was written. For a long time threads (in a Unix context) meant threads to the application rather than concurrent operation at the OS level.
janm
Note that when _exit (or its Win32 equivalent TerminateProcess(), or whatever equivalent on whatever other OS) is called, stacks are not unwound and destructors are not called. There is no need to deal with that stuff on process termination on a modern OS.
janm
I *know* the questioner didn't mention threads, I was suggesting something else in the C++0x standard that fits the bill, as he asked in the question: "what *other* classes can't be implemented".
Steve Jessop
And as for "has been done many times before", no, I don't think it has, because there's never a practical need to implement this stuff in pure portable C++.
Steve Jessop
And you do have to do cleanup on pseudo-process exit, just not stack unwinding. For example you must free malloced blocks (perhaps by releasing a whole heap) and flush file descriptors and streams. I don't think this conversation is really getting anywhere, sorry: we're at complete cross purposes.
Steve Jessop
No. Threads are not in the current C++ standard. User mode threads have been done many times before, regardless of your assertion. Pseudo process exit? On a real process exit, WIN32, Unix and others, memory and handles are released; no need for the application to do anything. Where is this not so?
janm
AAAARGH! Because the system implemented multiple processes on top of a single underlying process, as I said quite clearly. So there was no "real process exit" when one of our processes exited, any more than in a thread implementation there's a real context switch at a pseudo-reschedule.
Steve Jessop
And I'm obviously not talking about the current C++ standard, and neither is the questioner, since we both clearly mention we're talking about draft C++0x. If you insist on talking only about C++03, why are you doing it in this thread?
Steve Jessop
And I have never asserted that threads haven't been implemented. That would be dumb. I have asserted that they haven't been implemented in pure portable C++, which is what I understand the questioner to mean by doing something without any "compiler magic".
Steve Jessop
Having a discussion like this in 300 character blocks is extremely frustrating. Real points get lost in the attempt to be extremely brief and clarity is also a casualty. On current vs. C++0x, yes I forgot the context.
janm
Agreed - I think this is the second discussion I've been in where I really haven't been able to explain myself properly. I guess that's because we aren't supposed to be having the discussion on SO in the first place :-)
Steve Jessop
+1  A: 

Anything that the runtime "hooks into" at defined points is likely not to be implementable as a portable library in the hypothetical language "C++, excluding that thing".

So for instance I think atexit() in <cstdlib> can't be implemented purely as a library, since there is no other way in C++ to ensure the registered functions are called at the right time in the termination sequence, that is, before any global destructor.

Of course, you could argue that C features "don't count" for this question. In which case std::unexpected may be a better example, for exactly the same reason. If it didn't exist, there would be no way to implement it without tinkering with the exception code emitted by the compiler.

[Edit: I just noticed the questioner actually asked what classes can't be implemented, not what parts of the standard library can't be implemented. So actually these examples don't strictly answer the question.]

Steve Jessop
Throwing exceptions is also magic. But the exception classes are as normal as they come.
Max Lybbert
Indeed. But 'throw' is a language feature, not part of the standard library, so there's perhaps more expectation that it would be orthogonal to the rest of the language.
Steve Jessop
I should perhaps be clear that by "exception code" in the above I don't mean the code of the exceptions classes. I mean the code which ensures that the registered unexpected function is called when an unexpected exception is thrown. That's hidden away in the compiler out of reach of regular code.
Steve Jessop
+4  A: 

std::type_info is a simple class, although populating it requires typeid, a compiler construct.

Likewise, exceptions are normal objects, but throwing exceptions requires compiler magic (where are the exceptions allocated?).

The question, to me, is "how close can we get to std::initializer_lists without compiler magic?"

Looking at Wikipedia, std::initializer_list<T> can be initialized by something that looks a lot like an array literal. Let's try giving our std::initializer_list<T> a conversion constructor that takes an array (i.e., a constructor that takes a single argument of type T[]):

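namespace std {
     // Hypothetical stand-in, not the real std::initializer_list.
     template<typename T> class initializer_list {
         const T* internal_array;  // an array parameter decays to a pointer, so store one
         public:
         initializer_list(const T other_array[]) : internal_array(other_array) { }

         // ... other methods needed to actually access internal_array
     };
}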

Likewise, a class that uses a std::initializer_list does so by declaring a constructor that takes a single std::initializer_list argument -- a.k.a. a conversion constructor:

struct my_class {
    ...
    my_class(std::initializer_list<int>) ...
};

So the line:

 my_class m = {1, 2, 3};

Causes the compiler to think: "I need to call a constructor for my_class; my_class has a constructor that takes a std::initializer_list<int>; I have an int[] literal; I can convert an int[] to a std::initializer_list<int>; and I can pass that to the my_class constructor" (please read to the end of the answer before telling me that C++ doesn't allow two implicit user-defined conversions to be chained).

So how close is this? First, I'm missing a few features/restrictions of initializer lists. One thing I don't enforce is that initializer lists can only be constructed with array literals, while my initializer_list would also accept an already-created array:

int arry[] = {1, 2, 3};
my_class m = arry;

Additionally, I didn't bother messing with rvalue references.

Finally, this class only works as the new standard says it should if the compiler implicitly chains two user-defined conversions together. This is specifically prohibited under normal cases, so the example still needs compiler magic. But I would argue that (1) the class itself is a normal class, and (2) the magic involved (enforcing the "array literal" initialization syntax and allowing two user-defined conversions to be implicitly chained) is less than it seems at first glance.
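For comparison, this is roughly what the compiler-supported version looks like with the real std::initializer_list from C++11. It is a minimal sketch: the members sum and count and the helper sum_of_braced_list are invented for illustration, and only my_class and its constructor shape come from the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

struct my_class {
    int sum;
    std::size_t count;

    // The compiler builds the std::initializer_list<int> from the braced list;
    // user code never constructs one from raw elements itself.
    my_class(std::initializer_list<int> values) : sum(0), count(values.size()) {
        for (int v : values) {
            sum += v;
        }
    }
};

// Hypothetical helper: builds a my_class from a braced list and returns the sum.
int sum_of_braced_list() {
    my_class m = {1, 2, 3};
    assert(m.count == 3);
    return m.sum;
}
```

The braced list here goes straight to the initializer_list constructor, so the "two chained user-defined conversions" problem never arises. That step is exactly the part the compiler supplies.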

Max Lybbert
my_class m = {1,2,3} will not work since it requires two user defined casts and the standard only allows one UDC to be called automatically.
Motti
That's why I wrote "The only magic involved is the compiler is allowed to implicitly use two user-defined conversions, which is not generally allowed." right above the line. Later I added the "wait, this doesn't have all the restrictions the standard asks for" part below the line.
Max Lybbert
I'm a little confused here, the UDC is from {1,2,3} to initializer_list correct? This has to be built into the compiler, there's no way it would be able to see the relation otherwise, plus as far as I can tell, the actual steps involved in creating the array must also be built into the compiler, as there's no public method in initializer_list that can accept an array. I may just be misreading what you're saying, but I'm feeling pretty confused at the moment.
Dave
Thanks, edited.
Max Lybbert
+1  A: 

C++ allows compilers to define otherwise undefined behavior. This makes it possible to implement the Standard Library in non-standard C++. For instance, "onebyone" wonders about atexit(). The library writers can assume things about the compiler that make their non-portable C++ work OK for their compiler.

MSalters
This is the thing - I think you can implement atexit() using "C++, without atexit(), but plus a bunch of compiler-specific magic". You can't implement it using "C++, without atexit, full stop". But it's a very simple observation that you can implement *anything* with compiler magic to support you...
Steve Jessop
+1  A: 

MSalters points out printf/cout/stdout in a comment. You could implement any one of them in terms of one of the others (I think), but you can't implement the whole set of them together without OS calls or compiler magic, because:

  1. These are all the ways of accessing the process's standard output stream. You have to stuff the bytes somewhere, and that's implementation-specific in the absence of these things. Unless I've forgotten another way of accessing it, but the point is you can't implement standard output other than through implementation-specific "magic".

  2. They have "magic" behaviour in the runtime, which I think could not be perfectly imitated by a pure library. For example, you couldn't just use static initialization to construct cout, because the order of static initialization between compilation units is not defined, so there would be no guarantee that it would exist in time to be used by other static initializers. stdout is perhaps easier, since it's just fd 1, so any apparatus supporting it can be created by the calls it's passed into when they see it.
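One library-level workaround commonly used for the initialization-order problem in point 2 is a "nifty counter" (also called a Schwarz counter). The sketch below collapses the idiom into a single file, so it shows the mechanism rather than a real cross-translation-unit setup. Logger, LoggerInit, EarlyUser and the other names are hypothetical stand-ins, with Logger playing the role of a stream object such as cout.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a stream-like global such as std::cout.
struct Logger {
    int messages = 0;
    void log() { ++messages; }
};

// These are zero- or constant-initialized, which happens before any
// dynamic initialization of static objects runs.
static int nifty_counter = 0;
static Logger* logger_ptr = nullptr;
alignas(Logger) static unsigned char logger_storage[sizeof(Logger)];

// In a real library this struct would live in the header, and every
// translation unit including the header would get its own static instance.
struct LoggerInit {
    // The first initializer to run constructs the Logger in raw storage.
    LoggerInit() {
        if (nifty_counter++ == 0) logger_ptr = new (logger_storage) Logger();
    }
    // The last initializer to be destroyed tears the Logger down again.
    ~LoggerInit() {
        if (--nifty_counter == 0) logger_ptr->~Logger();
    }
};

Logger& logger() { return *logger_ptr; }

// Simulates the instance that the first including translation unit would get.
static LoggerInit init_from_first_includer;

// A static object whose constructor already uses the logger.
struct EarlyUser {
    EarlyUser() { logger().log(); }
};
static EarlyUser early_user;

// Simulates a second including translation unit; it must not re-construct.
static LoggerInit init_from_second_includer;

// Number of LoggerInit objects currently alive.
int live_initializers() { return nifty_counter; }
```

Because each user of the header gets its initializer object before its own statics, the Logger exists by the time any of them touches it, regardless of the order in which translation units are initialized.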

Steve Jessop
`cout` is initialized by means of a Schwarz counter. Given a working printf, it’s rather straightforward to implement std::cout.
Roman Odaisky
Good point, but for hitch-free portable operation, the counter should be given a reserved name, so it doesn't clash with anything the user might define. Using reserved names is a (very mild) form of compiler magic.
Steve Jessop
Of course by that argument, defining anything in std:: is compiler magic, so it's impossible to implement *any* standard library feature in pure user code ;-). I guess you just draw a line in the sand what "magic" means.
Steve Jessop