I'm a little curious about some of the new features of C++0x, in particular range-based for loops and initializer lists. Both features require a user-defined class in order to function correctly.

I came across this post, and while the top answer was helpful, I don't know if it's entirely correct (I'm probably just completely misunderstanding; see the 3rd comment on the first answer). According to the current specifications for initializer lists, the header <initializer_list> defines one type:

template<class E> class initializer_list {
public:
    initializer_list();

    size_t size() const; // number of elements
    const E* begin() const; // first element
    const E* end() const; // one past the last element
};

You can see this in the specifications, just Ctrl + F *'class initializer_list'*.

In order for = {1,2,3} to be implicitly converted into the initializer_list class, the compiler HAS to have some knowledge of the relationship between {} and initializer_list. There is no constructor that receives anything, so initializer_list, as far as I can tell, is a wrapper that gets bound to whatever the compiler is actually generating.
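A minimal sketch of what this looks like from the user's side, assuming a C++11 compiler. The functions `sum` and `element_count` are hypothetical names used only for illustration; the only standard pieces are `std::initializer_list` and its `begin()`, `end()` and `size()` members. Note that nothing in the code constructs the list explicitly: the compiler binds the braced values to the parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical helper: adds up the elements of a braced list.
int sum(std::initializer_list<int> values) {
    int total = 0;
    // begin()/end() are plain pointers into storage the compiler set up.
    for (const int* it = values.begin(); it != values.end(); ++it) {
        total += *it;
    }
    return total;
}

// Hypothetical helper: reports how many elements were bound to the list.
std::size_t element_count(std::initializer_list<int> values) {
    return values.size();
}

// Usage: the braces are turned into an initializer_list by the compiler;
// there is no user-visible constructor call anywhere.
inline void usage_example() {
    assert(sum({1, 2, 3}) == 6);
    assert(element_count({1, 2, 3, 4}) == 4);
}
```

Removing the `#include <initializer_list>` line is what makes the braced call stop compiling, which is the dependency the question is about.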

It's the same with the for( : ) loop, which also requires a user-defined type to work (though according to the specs, it has been updated to not require any code for arrays and initializer lists; but initializer lists require <initializer_list>, so it's a user-defined code requirement by proxy).

Am I misunderstanding completely how this works here? I'm not wrong in thinking that these new features do in fact rely extremely heavily on user code. It feels as if the features are half-baked: instead of building the entire feature into the compiler, it's being half-done by the compiler and half-done in includes. What's the reason for this?

Edit: I typed 'rely heavily on compiler code' instead of 'rely heavily on user code', which I think completely threw off my question. My confusion isn't about new features being built into the compiler; it's about things built into the compiler that rely on user code.

+1  A: 

It's just syntax sugar. The compiler will expand the given syntactic constructs into equivalent C++ expressions that reference the standard types / symbol names directly.

This isn't the only strong coupling that modern C++ compilers have between their language and the "outside world". For example, extern "C" is a bit of a language hack to accommodate C's linking model. Language-oriented ways of declaring thread-local storage implicitly depend on lots of RTL hackery to work.

Or look at C. How do you access arguments passed via ...? You need to rely on the standard library; but that uses magic that has a very hard dependency on how exactly the C compiler lays out stack frames.
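As a rough illustration of that dependency, here is a minimal sketch using the standard `va_list`, `va_start`, `va_arg` and `va_end` macros from `<cstdarg>`. The function name `sum_variadic` and the convention of passing the count first are assumptions made for the example; the callee has no other way to learn how many arguments follow.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variadic function: 'count' says how many ints follow.
int sum_variadic(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);           // library macro; knows the calling convention
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // fetch the next argument as an int
    }
    va_end(args);
    return total;
}
```

The declaration `int sum_variadic(int count, ...)` compiles with no includes at all; it is only reading the arguments that needs the header.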

UPDATE:

If anything, the approach C++ has taken here is more in the spirit of C++ than the alternative - which would be to add an intrinsic collection or range type, baked into the language. Instead, it's being done via a vendor-defined range type. I really don't see it as much different to variadic arguments, which are similarly useless without the vendor-defined accessor macros.

Barry Kelly
... and extern "C" are language features though. You don't need to write the class that expands ... into arguments; you just use ... to indicate what the compiler should do. {} on the other hand requires a class named initializer_list, and if that doesn't exist it won't work. You could compile function( int a,... ) with no includes whatsoever.
Dave
Dave, thread_local won't work without guts in the RTL either, nor will exception-dispatching support. I think you've just been living a sheltered life :) - the compiler is doing more work with the cooperation of the RTL than you've been aware of, and you've just taken it for granted. Now here comes another feature (`{}`) which needs RTL support (`std::initializer_list`) and you come off all surprised!
Barry Kelly
No, of course, but the thread_local storage identifier is just that: an identifier that specifies what to do. I have no problem with the compiler doing background work. Take your example of ...: the ... language feature doesn't NEED va_list to work. va_list is built around a feature of the compiler; hell, you could write that yourself. std::initializer_list on the other hand requires code to exist to work, whereas you could compile function( int, ... ) with no includes. And of course, how does {} need RTL support?
Dave
What would the alternative be, Dave? It would need to have a type. What would that type be? It would either be intrinsic to C++ (but of course it would still have a hidden RTL implementation, nothing comes of nothing), or it would be defined in a standard header.
Barry Kelly
Off the top of my head, an operator function: void operator{}( int array[] ); Would make more sense to me with some_class = { 1,2,3,4 } than void function( initializer_list<int> )
Dave
The issue with what I just suggested in fact is the size doesn't get passed. Though, similar to ... functions, instead of building ... into a class as is being done here, passing the length as the first element (just like variadic functions) would solve it. Also, I still do not see where RTL requirements come in here.
Dave
"You can lead a horse to water, but you can't make it drink"...
Barry Kelly
Perhaps, except in this case I believe you've linked me to undrinkable water. The feature requires no RTL support, already requires baked compiler support for binding to a user-required class so no reason not to include the {} operator.
Dave
+2  A: 

I'm not wrong in thinking that these new features do in fact rely extremely heavily on compiler code

They do rely heavily on the compiler. Whether you need to include a header or not, the fact is that in both cases the syntax would be a parsing error with today's compilers. The for (:) does not fit into today's standard, where the only allowed construct is for(;;).

It feels as if the features are half-baked, and instead of building the entire feature into the compiler, it's being half-done by the compiler and half done in includes. What's the reason for this?

The support must be implemented in the compiler, but you are required to include a system header for it to work. This can serve a couple of purposes. In the case of initializer lists, it brings the type (the interface to the compiler support) into scope so that you have a way of using it (think of how va_list and va_arg work in C). In the case of the range-based for (which is just syntactic sugar), you need to bring Range into scope so that the compiler can perform its magic. Note that the standard defines for ( for-range-declaration : expression ) statement as equivalent to ([6.5.4]/1 in the draft):

{ 
   auto && __range = ( expression ); 
   for ( auto __begin = std::Range<_RangeT>::begin(__range), 
         __end = std::Range<_RangeT>::end(__range); 
         __begin != __end; 
         ++__begin ) { 
      for-range-declaration = *__begin; 
      statement 
   } 
}

If you only want to use it on arrays and STL containers, that could be implemented without the Range concept (not in the C++0x sense). But if you want to extend the syntax to user-defined classes (your own containers), the compiler can easily depend upon the existing Range template (possibly with your own specialization). The mechanism of depending upon a template being defined is equivalent to requiring a static interface on the container.
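To make the "static interface" idea concrete, here is a minimal sketch with a user-defined container. One loud assumption: the draft quoted above routes the lookup through `std::Range`, whereas the C++11 standard as finally published looks for member `begin()`/`end()` (or free functions found by argument-dependent lookup), and the sketch uses that published form so that it compiles today. `SmallBuffer`, `total` and `total_expanded` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity container, used only for illustration.
class SmallBuffer {
public:
    SmallBuffer() : size_(0) {}
    void push(int value) {
        assert(size_ < kCapacity);
        data_[size_++] = value;
    }
    // The static interface the range-based for relies on.
    const int* begin() const { return data_; }
    const int* end() const { return data_ + size_; }
private:
    static const std::size_t kCapacity = 8;
    int data_[kCapacity];
    std::size_t size_;
};

// Range-based for over the user-defined type.
int total(const SmallBuffer& buf) {
    int sum = 0;
    for (int v : buf) { sum += v; }
    return sum;
}

// Roughly what the compiler expands the loop above into.
int total_expanded(const SmallBuffer& buf) {
    int sum = 0;
    auto&& range = buf;
    for (auto it = range.begin(), last = range.end(); it != last; ++it) {
        int v = *it;
        sum += v;
    }
    return sum;
}
```

No common base class is involved; the loop compiles for any type that offers suitable `begin()` and `end()`.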

Most other languages have gone in the direction of requiring a regular interface (say Container,...) and using runtime polymorphism on that. If that were to be done in C++, the whole STL would have to go through a major refactoring, as STL containers do not share a common base or interface and are not prepared to be used polymorphically.

If anything, the current standard will not be underbaked by the time it goes out.

David Rodríguez - dribeas
I can understand the requirement for the for( : ) construct; without compiler support, iterating over different containers would be impossible. But the for( : ) doesn't demand a type to work, only if you intend to extend it, which to me feels similar to the placement new overloading mechanism. I still do not see the reason for the initializer_list type, however. Turning = { 1,2,3,4,5 } into an array literal passed to a void operator{}( int arr[], 5 ); seems cleaner, and requires no includes. If the user wants to then put it in a container, they can do it in that function.
Dave
Just as you mentioned va_args, the functions and macros are built around a compiler feature. You will not get a compiler error for not including va_args; you could write your own. Similarly, C++0x [](){} lambdas do not require std::function; you could yet again write your own. {} on the other hand demands the existence of a specific class name, and fails to compile if it doesn't exist. That shouldn't be forced. I guess I'm just `nerd raging`, but a lot of what I'm seeing in the C++0x standard seems more and more alien by the minute.
Dave
It is a little more complicated than that. The compiler and the STL implementors don't need to be the same. Microsoft uses Dinkumware's STL implementation, but you can use STLport or any other. STL classes are not 100% defined and nailed down, and an STL implementor can decide to add new template parameters as long as they have a default value and thus can be used in user code that depends on and uses only the ones in the standard. This means that the compiler cannot easily match the STL containers by itself.
David Rodríguez - dribeas
It is the STL implementor who provides the extra information that the compiler may require, and at that point it also allows you to provide that extra 'out of band' information on containers. The good old STL already uses traits for the algorithms, and that is a common idiom: providing extra information in external templates. The current standard is just following the trends of the former standard. Why would you like those features removed from where they are, and if the Range is present, why do you not want the implementor to use it for the STL?
David Rodríguez - dribeas
It is not the STL using things the compiler provides that I find strange; traits and the variadic ellipsis, for example, are things the STL builds around. What I find odd is that, unlike the STL relying on compiler features, the compiler is now relying on STL features. There are no compiler features I know of at the moment that actually rely on the existence of code, at least not code that the user must provide manually. The for(:) loop is an exception I can understand, but unlike {} it's not forcing the existence of code.
Dave
@Dave va_list? type_info? I believe these are both 'compiler features' that require user code (and includes).
KitsuneYMG
@kts type_info is exactly the same kind of implementation as std::initializer_list. va_list is not, however; it's all user-defined code that doesn't need to exist for ... to compile, it just wraps it. (...) doesn't require user code to function correctly; type_info and std::initializer_list do. With type_info, though, there really is no other option; as with variadic template parameter packs, there's no solution other than to build it in. Barry was right, I've been a little sheltered from the guts of the compiler, but I still see no reason {} should REQUIRE code to exist to compile.
Dave
There are different options there, taken by different languages. In Java, for example, `void f( Type... args )` maps directly to `void f( Type args[] )` and the call is just syntactic sugar: the compiler will instantiate an array for you at the place of call. In C++, that solution would require a deep change in the core language. As it is, arrays are never passed by value, but rather decay into pointers to the first element (unless you pass by reference, and at that point, an array of N elements is a different type than an array of N+1 elements).
David Rodríguez - dribeas
That means that if they had chosen that approach, then with reference semantics the code would have to be templated on the size of the array, and with value semantics the compiler would have to provide you with both a pointer and a size (or two pointers). That could be done by forcing `void f( Type... )` to translate into `void f( int, Type* )`, or by adding a new type that encapsulates the size and pointer. The latter is what they actually did, adding a second pointer that is offset by N, and allowing the implementor to provide non-pointer implementations if they so wish.
David Rodríguez - dribeas
As with the rest of the standard library, only interfaces are exposed, which allows the implementor to add new functionality such as bounds checking. (Iterators into STL containers in some debug implementations keep track of the state of the container and will flag illegal uses of the iterator in debug builds, for example dereferencing a stale iterator into a vector after an element was added to it, which helps detect hard-to-debug problems.)
David Rodríguez - dribeas
The second approach you mention, `void f( int, Type* )`, is exactly what I was suggesting with `T operator{}( T array[], int );`. No need for a new type, no need for templating, no need for unnecessary user-defined code (and it is no less baked into the compiler than searching for an existing class). While I understand what you're saying about Type... args being syntactic sugar that would need deep compiler changes, I don't see how turning { 1, 2, 3, 4, 5 } into int *, 5 is any more complex (or requires more changes) than searching for a class and binding an array around its accessors. (Updated question)
Dave
So you would rather break the single argument initializer list into two arguments than define a new type in a header? How will you differentiate that signature from another method that takes an integer and a pointer?
David Rodríguez - dribeas
Ah yeah, I see your point. I guess I hate the idea of the compiler requiring code so much that I couldn't see the benefits. `operator {}` might've solved the signature issue, but initializer_lists can be used in other places I didn't think about. I can't see any way around it other than using initializer_list, but it still doesn't appeal to me.
Dave
On second thoughts, passing {1,2,3} to a function is really nothing more than creating a way to pass array literals. You could argue that, being an **initializer**_list, it should be a separate type for initializing things, but that's what the initializer lists currently implemented in class constructors are for, and seeing as these can be passed to ANY function, it's not so much an initializer_list anymore as it is an array literal. If that's the case, why should you differentiate between `foo({1,2,3})` and `T x[] = {1,2,3}; foo(x);`? Why should you bind it to a class that just wraps arrays?
Dave
There are differences. In both C and (inherited from there) C++ you cannot pass arrays by value; they decay into a pointer. You can pass arrays by reference, but arrays of different sizes are actually different types, and that would force you to template all the functions for which you want to use initializer lists. This is really going nowhere; the implications of any change to the standard escape both my knowledge and yours. Just take it or leave it, but take it for granted that each and every change to the standard has gone through a lot of discussion and scrutiny.
David Rodríguez - dribeas
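The difference described in the last comment can be sketched as follows, assuming a C++11 compiler. All three function names are hypothetical. Passing an array by reference makes the size part of the parameter type, so the function has to be a template; a decayed pointer loses the size, so it has to be passed separately; `std::initializer_list` carries both in a single non-template parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Array by reference: N is part of the type, so one instantiation per size.
template <std::size_t N>
std::size_t count_array(const int (&)[N]) { return N; }

// Decayed array: the size is gone and must travel as a second argument.
std::size_t count_decayed(const int* /*first*/, std::size_t n) { return n; }

// initializer_list: one ordinary function handles every list length.
std::size_t count_list(std::initializer_list<int> values) {
    return values.size();
}

inline void array_vs_list_example() {
    int three[3] = {1, 2, 3};
    int five[5] = {1, 2, 3, 4, 5};
    assert(count_array(three) == 3);        // instantiates count_array<3>
    assert(count_array(five) == 5);         // instantiates count_array<5>
    assert(count_decayed(three, 3) == 3);   // caller supplies the size
    assert(count_list({1, 2, 3}) == 3);     // same function for both calls
    assert(count_list({1, 2, 3, 4, 5}) == 5);
}
```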