If one implements an array class the way it is commonly implemented, its performance is slower compared to its STL equivalent like a vector. So what makes STL containers/algorithms fast?
The standard library uses good algorithms, like in the case of an array (std::vector) it usually doubles the storage every time you run out of space, while a naïve implementation might increment storage by one element every time. Because increasing the amount of storage is very slow (all existing data needs to be copied from the old allocation to the new allocation), this can make a huge difference.
Similarly, all the other algorithms are implemented in rather optimal ways. The standard library usually doesn't use any kind of loop unrolling or other such optimizations on source code level. It is just regular good and simple code (with horrible variable names and a lot of templates) that the compiler then optimizes.
What I said is not specified by the C++ standard, but is what the common practice in existing implementations seems to be.
The algorithms in the STL have been studied for years by all levels of mathematicians and computer scientists, and they typically use the absolute most efficient algorithms, which your implementation may not be using. The common implementation is one which is probably not fastest, but is easiest to understand; the STL is probably using a more optimized implementation.
Most people's array classes increase the size by a constant increment rather than a constant factor (as the standard library requires). This means inserting an element takes roughly linear time rather than the amortized constant time for the standard library.
STL algorithms like for_each
take function objects that can be easily inlined. C, on the other hand, uses function pointers which are much more difficult for the compiler to optimize.
This makes a big difference in some algorithms like sorting in which the comparer function must be called many times.
Wikipedia has some more information in case you are interested.
EDIT:
As for the STL's vector class, I don't think it's necessarily faster that what you could find in, say, glibc.
the code is written in a compiler-friendly way, e.g. inlining, etc.
of course, the algorithms they use are state-of-the-art.
In addition to good, general algorithms (as others have noted), the STL is also quite efficient because of the heavy use of templates.
Template metaprograms have the nice feature that the compiler aggressively optimizes them. Some compilers are very good about this and reduce templates to the smallest, most efficient, required code for a given task. And in general, that means that functions are inlined whenever possible, and code to interact with a particular type is reduced to its simplest form. Most implementations of the STL (and most of Boost), of course, takes advantage of this to the fullest.