tags:

views:

1546

answers:

14

Hello StackOverflow.

I am a Java developer trying to learn C++. I have many times read over the web (including StackOverflow) that STL is the best collections library that you can get in *any* language. (Sorry, I do not have any citations atm)

However after studying some STL, I am really failing to see what makes STL so special. Would you please shed some light on what sets STL apart from the collection libraries of other languages and make it the _best_ collection library?

Thanks in advance,

Zygrob.

+3  A: 

Well, that is somewhat of a bold claim... perhaps in C++0x when it finally gets a hash map (in the form of std::unordered_map), it can make that claim, but in its current state, well, I don't buy that.

I can tell you, though, some cool things about it, namely that it uses templates rather than inheritance to achieve its level of flexibility and generality. This has both advantages and disadvantages; a disadvantage is that lots of code gets duplicated by the compiler, and any sort of dynamic runtime typing is very hard to achieve; however, a key advantage is that it is incredibly quick. Because each template specialization is really its own separate class generated by the compiler, it can be highly optimized for that class. Additionally, many of the STL algorithms that operate on STL containers have general definitions, but have specializations for special cases that result in incredibly good performance.

Michael Aaron Safyan
Re: code duplication, some linkers can condense identical functions into a single copy (e.g. MS's /Gy option) -- so any code that is not inlined will not be duplicated. And if you specify options to optimise for size (not speed), functions will only be inlined when doing so actually produces smaller code. Also not sure what you mean by "dynamic runtime typing" being hard to achieve -- just use a container of pointers (or smart pointers to make cleanup easy).
j_random_hacker
+22  A: 

It's not so much that it's "great" or "the best collections library that you can get in *any* language", but it does have a different philosophy to many other languages.

In particular, the standard C++ library uses a generic programming paradigm, rather than an object-oriented paradigm that is common in languages like Java and C#. That is, you have a "generic" definition of what an iterator should be, and then you can implement the function for_each or sort or max_element that takes any class that implements the iterator pattern, without actually having to inherit from some base "Iterator" interface or whatever.

Dean Harding
Actually the "static duck typing" approach is applied to containers by the C# compiler also. `foreach` can enumerate over any object that has a `GetEnumerator` method, and collection initializers work on any `IEnumerable` object that *also* has an `Add` method - so it just finds method names and signatures, regardless of the type they are defined in, just like C++ templates.
Daniel Earwicker
Ha! I see an `interator` -- I wish I had $1 for every time I've typed that. Muscle memory working against us :(
spong
@spong: Edited it for you :)
missingfaktor
@Daniel: thats the point, in C++ you can write your own iterators for classes which dont have one on their own...and those iterators dont have to be inherited by a special class/interface like the returned object of GetEnumerator
smerlin
@spong: Damn, when's SO getting Intellisense to fix my errors automatically :)
Dean Harding
@Daniel Earwicker: 'int' and 'char *' don't implement GetEnumerator in C# yet STL can handle these. The Duck Typing is about the behaviour of the type, rather than pre-requisites.
JBRWilkinson
@smerlin - yes, in C++ templates allow static duck typing to be used by any library facility, in C# only the compiler is able to do it so its limited to a few examples. However, they are valid examples of the same thing.
Daniel Earwicker
@JBRWilkinson - In the same way that an iterator must allow `*` for dereferencing, a C# collection must allow `GetEnumerator`. In neither case must the object be convertible to any particular type. That's what makes both cases examples of static duck typing. The value returned by `GetEnumerator` must be of some type `IEnumerator<T>` or `IEnumerator`, and from that point on duck typing is not used, but it is used for the first step of looking for the `GetEnumerator` method. This is duck typing (as opposed to requiring the collection to support `IEnumerable<T>`.)
Daniel Earwicker
+11  A: 

STL's containers are nice, but they're not much different than you'll find in other programming languages. What makes the STL containers useful is that they mesh beautifully with algorithms. The flexibility provided by the standard algorithms is unmatched in other programming languages.

Without the algorithms, the containers are just that. Containers. Nothing special in particular.

Now if you're talking about container libraries for C++ only, it is unlikely you will find libraries as well used and tested as those provided by STL if nothing else because they are standard.

Billy ONeal
Not to mention that on any decent compiler, the STL is super-optimised.
Duracell
@Duracell: Any other library could be similarly optimized -- and in fact could even be optimized further if one is willing to give up some of the guarantees that the STL makes about it's collections.
Billy ONeal
@Billy: Yes and no. Most other libraries rely heavily on indirection, interfaces, inheritance and such runtime-enforced features. The STL can be trivially optimized, as long as you have an inlining compiler, because all the abstraction is evaluated at compile-time.
jalf
+5  A: 

This is not a direct answer, but as you're coming from Java I'd like to point this out. By comparison to Java equivalents, STL is really fast.

I did find this page, showing some performance comparisons. Generally Java people are very touchy when it comes to performance conversations, and will claim that all kinds of advances are occurring all the time. However similar advances are also occurring in C/C++ compilers.

Matt Joiner
That's a great point - can you link to any data on this?
JBRWilkinson
+16  A: 

What I love about the STL is how robust it is. It is easy to extend it. Some complain that it's small, missing many common algorithms or iterators. But this is precisely when you see how easy it is to add in the missing components you need. Not only that, but small is beautiful: you have about 60 algorithms, a handful of containers and a handful of iterators; but the functionality is in the order of the product of these. The interfaces of the containers remain small and simple.

Because it's fashion to write small, simple, modular algorithms it gets easier to spot bugs and holes in your components. Yet, at the same time, as simple as the algorithms and iterators are, they're extraordinarily robust: your algorithms will work with many existing and yet-to-be-written iterators and your iterators work with many existing and yet-to-be-written algorithms.

I also love how simple the STL is. You have containers, you have iterators and you have algorithms. That's it (I'm lying here, but this is what it takes to get comfortable with the library). You can mix different algorithms with different iterators with different containers. True, some of these have constraints that forbid them from working with others, but in general there's a lot to play with.

Niklaus Wirth said that a program is algorithms plus data-structures. That's exactly what the STL is about. If Ruby and Python are string superheros, then C++ and the STL are an algorithms-and-containers superhero.

wilhelmtell
+1 for the well written and informative answer
Robb
Actually, I think you'll find it was Wirth, not Knuth - it's the title of one of his (wirth's) books.
anon
@Neil thanks, I stand corrected. http://en.wikipedia.org/wiki/Algorithms_%2B_Data_Structures_%3D_Programs
wilhelmtell
+2  A: 

Unique because it

  • focuses on basic algorithms instead of providing ready-to-use solutions to specific application problems.
  • uses unique C++ features to implement those algorithms.

As for being best... There is a reason why the same approach wasn't (and probably won't) ever followed by any other language, including direct descendants like D.

ima
_"There is a reason why the same approach wasn't (and probably won't) ever followed by any other language, including direct descendants like D."_ -- Would you please elaborate those reasons?
missingfaktor
Yes, please....
Matt Joiner
Because the designers of D know better than to emulate OO concepts with templates? (It's easier to know what the hell is going wrong if you sort functions requires a `RandomAccessIterator` instead of `T`)
KitsuneYMG
@kts: "D was designed with template metaprogramming in mind, so naturally it is better at it than C++ is."
Billy ONeal
This site is badly suited for discussion, so I won't bring out at a long list of practical and conceptual problems of stl, easy to find at multiple places (including archives of comp.lang.c++). It's not really relevant, I was pointing that no other language chose STL approach to standard library to demonstrate that STL is not generally considered to be the best library - whatever are the reasons.
ima
@kts: Which other languages comes with the __means__ to create something like the STL? Languages in the C family which came after C++ (Java, C#, for example) usually initially restricted themselves to OO - which is but one paradigm C++ supports -, and thus couldn't come up with something like this initially. Interestingly both Java and C# invented generics later (but by then they already had other container/algorithm libraries), thus proving false the claim that OO is all you'd ever need.
sbi
+2  A: 

The standard C++ library's approach to collections via iterators has come in for some constructive criticism recently. Andrei Alexandrescu, a notable C++ expert, has recently begun working on a new version of a language called D, and describes his experiences designing collections support for it in this article.

Personally I find it frustrating that this kind of excellent work is being put into yet another programming language that overlaps hugely with existing languages, and I've told him so! :) I'd like someone of his expertise to turn their hand to producing a collections library for the so-called "modern languages" that are already in widespread use, Java and C#, that has all the capabilities he thinks are required to be world-class: the notion of a forward-iterable range is already ubiquitous, but what about reverse iteration exposed in an efficient way? What about mutable collections? What about integrating all this smoothly with Linq? etc.

Anyway, the point is: don't believe anyone who tells you that the standard C++ way is the holy grail, the best it could possibly be. It's just one way among many, and has at least one obvious drawback: the fact that in all the standard algorithms, a collection is specified by two separate iterators (begin and end) and hence is clumsy to compose operations on.

Daniel Earwicker
that "clumsy" way enables you to make algorithms working on c++ collections much more generic, so they work on partitions of collections aswell, and not only whole collections.
smerlin
Yep, it is a clumsy but powerful notation. I think Alexandrescu has gone a bit overboard by replacing iterators with ranges entirely, but he's clearly on to something. In most (but not all) cases, iterators do give us a syntax where it's far too hard to compose operations.
jalf
@smerlin - consider a class that implements the range concept by delegating to another range from which it slices out a sub-range. e.g. given a range `r`, the expression `slice(r, 3, 4)` would return an object that appeared to contain the following elements: `r[3], r[4], r[5], r[6]`. So `slice(r, 3, 4)[2] = "hi";` would be equivalent to `r[5] = "hi";` Hence ranges support partitions of collections very elegantly - more so than iterators.
Daniel Earwicker
@Daniel: i doubt that this could be implemented in a language not using reference-types and GC. You would have to wrap every element of any collection in reference-counted smart-pointers which adds overhead, no matter if you use `slice` or not.
smerlin
@smerlin: Presumably, they would work the same way iterators work now. The range is valid only if the container it refers to is valid.
Dennis Zickefoose
@smerlin - exactly as Dennis says. A range is really just two iterators. For example: http://www.boost.org/doc/libs/1_42_0/libs/range/doc/utility_class.html#sub_range
Daniel Earwicker
+12  A: 

The STL works beautifully with built-in types. A std::array<int, 5> is exactly that -- an array of 5 ints, which consumes 20 bytes on a 32 bit platform.

java.util.Arrays.asList(1, 2, 3, 4, 5), on the other hand, returns a reference to an object containing a reference to an array containing references to Integer objects containing ints. Yes, that's 3 levels of indirection, and I don't dare predict how many bytes that consumes ;)

FredOverflow
"I don't dare predict how many bytes that consumes" - you can't predict how many bytes it consumes, because for example you don't know whether the JVM is going to use a small-integer optimisation when boxing `int` The last level of indirection might create 5 `Integer` objects, or might use 5 static Integer objects which are always there.
Steve Jessop
@Steve "you can't predict how many bytes it consumes" -> I guess that's exactly why I didn't dare ;-)
FredOverflow
+13  A: 

What is so great about the STL ?

The STL is great in that it was conceived very early and yet succeeded in using C++ generic programming paradigm quite efficiently.

It separated efficiently the data structures: vector, map, ... and the algorithms to operate on them copy, transform, ... taking advantage of templates to do so.

It neatly decoupled concerns and provided generic containers with hooks of customization (Comparator and Allocator template parameters).

The result is very elegant (DRY principle) and very efficient thanks to compiler optimizations so that hand-generated algorithms for a given container are unlikely to do better.

It also means that it is easily extensible: you can create your own container with the interface you wish, as long as it exposes STL-compliant iterators you'll be able to use the STL algorithms with it!

And thanks to the use of traits, you can even apply the algorithms on C-array through plain pointers! Talk about backward compatibility!

However, it could (perhaps) have been better...

What is not so great about the STL ?

It really pisses me off that one always have to use the iterators, I'd really stand for being able to write: std::foreach(myVector, [](int x) { return x+1;}); because face it, most of the times you want to iterate over the whole of the container...

But what's worse is that because of that:

set<int> mySet = /**/;

set<int>::const_iterator it = std::find(mySet.begin(), mySet.end(), 1005); // [1]
set<int>::const_iterator it = mySet.find(1005); // [2]

[1] and [2] are carried out completely differently, resulting in [1] having O(n) complexity while [2] has O(log n) complexity! Here the problem is that the iterators abstract too much.

I don't mean that iterators are not worthy, I just mean that providing an interface exclusively in terms of iterators was a poor choice.

I much prefer myself the idea of views over containers, for example check out what has been done with Boost.MPL. With a view you manipulate your container with a (lazy) layer of transformation. It makes for very efficient structures that allows you to filter out some elements, transform others etc...

Combining views and concept checking ideas would, I think, produce a much better interface for STL algorithms (and solve this find, lower_bound, upper_bound, equal_range issue).

It would also avoid common mistakes of using ill-defined ranges of iterators and the undefined behavior that result of it...

Matthieu M.
At least in C++0x you will have for-each loop access for the iterators!
Xavier Ho
Most of your gripes are problems with the C++ language rather than with STL itself. +1 though -- I agree with the gripes.
Billy ONeal
Both views and concepts checking are available within C++ (though you might have to work to get them). Views are akin to container adapters (with delegated storage) and Boost.Concept is alive and kicking. The issue I have is with iterators, not C++ :)
Matthieu M.
The D programming language standard library (phobos) mimics the C++ standard library with one exception, iterators. D uses ranges, it sounds like the exact same things as your Views concept. I agree views/ranges would have been a better way to go about it.
caspin
Ranges are a simple case of views that filter out everything not pertaining to the current range. You can make filter-views with arbitrary complex predicates and also transform-view which returns value computed on the fly from the original range of values. Also, I don't want to discard predicates altogether: they are useful as a result of `find` algorithm for example, and thus you need some way to plug them back into the other algorithms (`Boost.Range` is one possibility). I think that having the two (iterators and views) could have been great.
Matthieu M.
wrt the issue with stl:find there is nothing stopping an implementation by providing a specialised stl:find for say std::set<T>::iterator and have it called via a tag dispatch mechanism. the standard does not get in the way of such optimisations, you're grips are related to your STL vendor, though your initial opinion about just passing the container its a valid and excellent suggestion.
Hippicoder
But how would you implement `find` using a specialization for `std::set<T>::iterator` ? Even if you know the elements are stored in order, you're still stuck in O(N) world because you don't have a RandomAccessIterator. After that you can imagine tricks and all, but you won't get O(log distance(first,last)) easily...
Matthieu M.
+6  A: 

Keep in mind that STL is actually quite old now, so other, newer libraries may have specific advantages. Given the age, its' popularity is a testament to how good the original design was.

There are four main reasons why I'd say that STL is (still) awesome:

Speed STL uses C++ templates, which means that the compiler generates code that is specifically tailored to your use of the library. For example, map will automagically generate a new class to implement a map collection of 'key' type to 'value' type. There is no runtime overhead where the library tries to work out how to efficiently store 'key' and 'value' - this is done at compile time. Due to the elegant design some operations on some types will compile down to single assembly instructions (e.g. increment integer-based iterator).

Efficiency The collections classes have a notion of 'allocators', which you can either provide yourself or use the library-provided ones which allocate only enough storage to store your data. There is no padding nor wastage. Where a built-in type can be stored more efficiently, there are specializations to handle these cases optimally, e.g. vector of bool is handled as a bitfield.

Exensibility You can use the Containers (collection classes), Algorithms and Functions provided in the STL on any type that is suitable. If your type can be compared, you can put it into a container. If it goes into a container, it can be sorted, searched, compared. If you provide a function like 'bool Predicate(MyType)', it can be filtered, etc.

Elegance Other libraries/frameworks have to implement the Sort()/Find()/Reverse() methods on each type of collection. STL implements these as separate algorithms that take iterators of whatever collection you are using and operate blindly on that collection. The algorithms don't care whether you're using a Vector, List, Deque, Stack, Bag, Map - they just work.

JBRWilkinson
Speed: Templates kill compile speed and virtual pointer lookups don't take _that_ long. This isn't 1970. Efficiency: First off vector<bool> is an abomination that needs to die. Allocators aren't about efficiency, they are about how a new `T` gets created and deleted. Extensibility: This isn't unique to the STL. Hell, I write algorithms in java that take `Iterator<T`> and/or `Predicate<? super T>`.
KitsuneYMG
Elegance: Again, this isn't unique to the STL. Any generic programming language does it the same way. OO languages (like java) provide a List type that specifies a sort function be present. It does not say that the function can't call a generic sort algorithm. In fact java List's DO NOT have a `sort` member function, they are sorted by `Collections.sort`, a static method.
KitsuneYMG
@kts: Most C++ developers don't much care about compile speed; rather; we care about runtime speed. Also, templates are only slow if you use a crappy compiler -- any reasonably recent compiler has no problem with them. Will it be slower than C at build time? Probably. That doesn't mean though that the feature should be removed.
Billy ONeal
@kts: virtual pointer lookups don't take long, no, but they prevent inlining. So it's not that vtable lookups are slow, they just prevent optimizations that could otherwise make the code *extremely* fast (identical performance to hardcoded, handrolled code)
jalf
Regarding Elegance: `std::sort` only works for a `vector` and `deque` (and other things with random access iterators). In particular code breaks if you change a `vector` for a `list`, or the other way round. See the accepted answer for a very justified criticism of the limitations of defining the whole interface in terms of iterators. The abstraction is taken to the level where you *cannot* easily change the container type without breaking code (as advertized in some tutorials).
visitor
Actually, `std::vector<booL>` would be a better candidate for a question "What's __not__ so great about the STL?"
sbi
@visitor: Good point, well made.
JBRWilkinson
A: 

The primary thing is, you can use templates to make using containers switch-in/switch-out, without having to resort to the horrendous mess that is Java's interfaces.

DeadMG
No you can't. Read `Effective STL` sometime. Each container is currently just different enough that you can't, say, swap a list for a vector. In Java you can swap a LinkedList for an ArrayList and the interface (List) is unchanged.
KitsuneYMG
Depends on to what level you're trying to use them. I was just using vector<T1> and vector<T2>.
DeadMG
+2  A: 

STL gives you the pieces.

Languages and their environments are built from smaller component pieces, sometimes via programming language constructs, sometimes via cut-and-paste. Some languages give you a sealed box - Java's collections, for instance. You can do what they allow, but woe betide you if you want to do something exotic with them.

The STL gives you the pieces that the designers used to build its more advanced functionality. Directly exposing the iterators, algorithms, etc. give you an abstract but highly flexible way of recombining core data structures and manipulations in whatever way is suitable for solving your problem. While Java's design probably hits the 90-95% mark for what you need from data structures, the STL's flexibility raises it to maybe 99%, with the iterator abstraction meaning you're not completely on your own for the remaining 1%.

When you combine that with its speed and other extensibility and customizabiltiy features (allocators, traits, etc.), you have a quite excellent package. I don't know that I'd call it the best data structures package, but certainly a very good one.

Warning: percentages totally made up.

Michael E
How is this different than any other collection library?
Billy ONeal
@Billy The composite algorithms and iterators let you build things up out of smaller pieces. Take binary search: Java gives you a binarySearch that finds a matching item. What if there are multiple matching items? C++ gives two methods, lower_bound and upper_bound, which find the lower and upper bounding iterators of a contiguous range of matching items. If you need the edge case of multiple items, Java leaves you almost hanging.
Michael E
+3  A: 

Obviously C++, C#, and Java can enter as many pissing contests as you want them to. The clue as to why the STL is at least somewhat great is that Java was initially designed and implemented without type-safe containers. Then Sun decided/realised people actually need them in a typed language, and added generics in 1.5.

You can compare the pros and cons of each, but as to which of the three languages has the "greatest" implementation of generic containers - that is solely a pissing contest. Greatest for what? In whose opinion? Each of them has the best libraries that the creators managed to come up with, subject to other constraints imposed by the languages. C++'s idea of generics doesn't work in Java, and type erasure would be sub-standard in typical C++ usage.

Steve Jessop
A: 

If you fail to see what usage the STL has, I recommend buying a book, "The C++ Programming Language" by Bjarne Stroustrup. It pretty much explains everything there is about C++ because he's the dude who created it.

Daniel
I am pretty sure The Dude must be weeping secretly for what the committee has done to his language.
Zygrob