I'm not a C++ programmer and have difficulty understanding the explanations given on websites. I don't understand containers or iterators and don't have plans to learn C++ in the near future. So in layman's terms: What is the STL and what can it do for me? How does it compare to something like the Python Standard library or glibc?
Implementations of the Standard Template Library are worthless to you if you're not a C++ programmer. However, the ideas transcend any programming language.
It's a library of data structures (e.g., maps, lists, vectors, etc.). It includes iterators to abstract traversing those data structures, and algorithms for operating on them. Other languages use these ideas as well, so if you learn them for one language they'll be familiar in others.
It's a library to make certain things easier, like string manipulation, vectors, linked lists, etc. STL would save you from having to write some of the lower level code and let you focus more on your application code.
The STL is the Standard Template Library. It is a subset of the C++ standard library.
The STL provides generic implementations of useful algorithms and containers.
The containers provide any easy method of storing data in the program and then finding, sorting and performing other computations on that data.
STL is the Standard Template Library. It is a library of functions and classes that you can use. It has many of the basic algorithms and data structures of basic computer science. If you are planning to stay in C++, you should plan on learning this or another library of resources.
The code is used by many many people around the world, so you can be comfortable that the code is well tested and well known.
To understand the STL, you're going to have to understand some aspects of C++ at least. I'll try my best to explain it. The structure is deceptively simple. Where the library shines is in how use of it can simplify many complex tasks. I'm going to stick to some very simple examples though, both because anything else will likely confuse someone who doesn't know C++, and because I don't want to write a novel. ;)
First, some history. The STL (Standard Template Library) was developed separately, and then submitted to the C++ standard committee for consideration, giving them the option of adopting it into the language. But it was not developed as part of the C++ standard, and for this reason, it is designed in a style that is very different from the rest of the C++ standard library. If I remember my ancient history, it also took the standard committee a good while to understand the STL and grow used to it. When they initially saw it, they were not too keen on it, but after a while, realized just how powerful and well-designed it was. So it was adopted into the language. This all happened back in the late 1990's, as the language was approaching ISO standardization.
At its core, the STL provides the most fundamental functionality you expect of a standard library: The ability to store sequences of data, and the ability to process these sequences.
Every other language has a Collections/Containers part of its standard library, containing implementations of dynamic arrays (known as arraylists in Java, List in C#, and vectors in C++), linked lists, dictionaries and other common data structures.
They also typically provide some mechanims for traversing these structures. (Enumerators or iterators, for example)
The STL provides the same functionality in C++, but does it in a unusually elegant way, and with some interesting abstractions.
The STL is cleanly split into three separate components:
- The containers (As described above, every language has these. Arrays, ArrayList, Dictionary, Set, LinkedList and so on. Any data structure that can store a collection of objects is a container in C++)
- The algorithms (Every language also has these in some form. Algorithms are functions for processing sequences of elements.) For now, assume that a sequence is a container. That's a bit of a simplification, but we'll get to that. Algorithms serve a wide range of purposes, from a
for_each()
function that allows you to apply a function to every element in a sequence, or the relatedtransform()
which applies a function to every element, and stores the result into a separate sequence (very much like the map operation in functional languages), or accumulate (similar to fold in functional languages), but also sorting or searching functions, and functions that allow you to copy entire sequences. - And finally, the glue that binds containers and algorithms together: Iterators. As I said above, sequences (which are what the algorithms work on) are not quite the same as containers. The elements in a container certainly constitute a sequence, but the first five elements in a container are also a sequence. Or every other element in a container is a sequence. Data read directly from a file stream can be treated as a sequence as well. Even data that is generated on the fly (say, the fibonacci sequence) can be treated as a sequence of values. Sequences do not have to map to a container, or even to data that exist in memory, although that's the most common use.
Note that these is no overlap between these three areas. A container stores (and owns) data, and produces iterators. Iterators allow you to inspect, modify and traverse the data. And algorithms operate on iterator ranges
Conceptually speaking, an iterator has two functions. It points to some data, and it can be moved around in the sequence (depending on the iterator type, different move operations may be available. Almost all iterators can move to the next element. Some can also move to the previous, and some can jump arbitrary distances backward and forward). If you're familiar with C, this is going to sound a lot like pointers, and that's no coincidence. Iterators are modeled as a generalization of pointers, and in fact, pointers are also valid iterators. All the STL algorithms work on pointers as well as "real" iterators.
What this means is that any sequence of data can be represented by a pair of iterators: The first iterator points to the first element in the sequence, and the second points one past the end of the sequence.
That allows a fairly simple syntax for traversing sequences in a loop:
std::vector<int> container;
for (iter it = container.begin(); it != container.end(); ++it)
{
// perform some operations on the iterator (it) or the element it points to (*it)
++(*it); // increment the value the iterator points to
}
Or we can apply an algorithm to the sequence:
std::sort(container.begin(), container.end());
Note that the sort function does not know or care that it is working on a vector. It is passed two iterators, and these could be of any type. They can be plain pointers into an array, linked list iterators or any other valid iterator type.
We can generalize the sorting function a bit by supplying our own comparer function (any function that takes two values and returns true if the first is strictly less than the other)
// sort in descending order, by passing in a custom comparer which uses greater than instead of less than
bool greater(int lhs, int rhs) { return lhs > rhs; }
std::sort(container.begin(), container.end(), greater);
Of course, we could sort only the first five elements of the vector as well:
std::sort(container.begin(), container.begin()+5);
The begin() and end() functions are just convenience functions to get iterators from a container. We don't have to use them directly.
Another nice trick is that streams too can be generalized into iterators. So let's read all the integers from a file, and copy them to an array (arrays are plain C types, of course, so they're not proper containers and don't have iterators. But pointers work fine)
int arr[1024];
std::ifstream file("something.txt");
std::copy(std::istream_iterator(file) // create an iterator pointing to the current position in the file stream
, std::istream_iterator() // and our "end" iterator. When we reach the end of the stream, testing the two iterators for equality will yield true, and so the operation will halt
, arr);
The unique thing about the STL is how flexible and extensible it is. It interoperates cleanly with C code (pointers are legal iterators), it can be simply and easily extended (you can write your own iterator types if you like. Most of the algorithms take custom predicates of comparers, like the one I showed above, and you can define your own containers. That is, each of the three pillars of the STL can be overridden or extended, so STL could be said to be more of a design strategy than anything. You can write STL code even though you're using your own containers, iterators and algorithms. And because each of these three pillars are cleanly separated from the others, they can be swapped out much more easily than in most other languages where these three responsibilities are mixed up and shared by the same classes. An algorithm does not know which, if any, container the sequence it's operating on are stored in. It only knows that the iterators it has been passed can be dereferenced to get access to the data itself. A container does not have to support all the standard algorithms. It simply has to be able to produce a pair of iterators, and then all the functionality comes for free.
Compare this to, say, Java, where every collection class has to implement its own search, its own sort, its own everything. In C++, we only need one implementation of find(). It takes two iterators and the value to look for, and it traverses the sequence looking for the value. And so, it works on any container type, even the ones I define myself.
Another remarkable feature of the STL is that there is literally zero performance loss in using it. C++ templates are all substituted at compile-time, yielding code that can be optimized just as aggressively as if you had hand-coded everything in C. The above sort function would lose some performance because I pass a function pointer to it as my custom comparer, which typically can not be inlined, but that can be fixed if we define it as such:
struct greater {
bool operator()(int lhs, int rhs) { return lhs > rhs; }
};
std::sort(container.begin(), container.end(), greater());
Now we no longer pass a function pointer, but an object. And member functions (such as operator()) can be inlined. So this sort function is going to be just as efficient as anything you could hand-code in C.
And again, it doesn't even have to add any complexity to the sort function. In fact, sort has precisely two overloads. One that takes a comparer function, and one that doesn't.
The sort function doesn't need to know whether it is passed a function pointer or an object. As long as the syntax "X(a, b)" is valid, where X is the value it was passed as comparer, and a, b the elements to compare, the same implementation of the sort function will work. And because my greater
object overloaded the operator(), that syntax is valid both for this object and for the function pointer we passed earlier.
STL code gets a lot of functionality for free by exploiting tricks like this. The same implementation of a function works with very different argument types because of the way C++ templates work.