Does using STL increase footprint significantly? Could you guys share your experience regarding this matter? What are the best practices to build a small footprint library?
There's no one answer since STL is a set of templates. Templates, by their very nature, are only compiled in when used. So you can include all of STL and if none of it is actually used, the footprint added by STL will be zero. If you have a very small app that manages to use a lot of different templates with different specializations, the footprint can be large.
On an embedded project with a 64kb limit I once did, I couldn't even link the standard C libraries. So it depends on what you need to do
Using the STL will increase your binary size for two basic reasons:
Inlining
By default, the compiler treats template code as inline. Therefore, if you use std::list<int>
in several different compilation units, and the compiler inlines that code, they'll each have their own local inline definitions of std::list<int>
functions (which probably isn't a big deal, since the compiler will only inline very small definitions by default).
Note that (as Martin points out elsewhere) multiply-defined symbols are stripped out by all modern C++ linkers on the big platforms, as described in the GCC documentation. Thus, if the compiler leaves template code out-of-line, the linker will remove duplicates.
Specialization
Because of of the very nature of C++ templates, std::list<int>
is potentially arbitrarily different than std::list<double>
. In fact, the standard mandates that std::vector<bool>
is defined as a bit-vector, so most of the operations there are completely different than the default std::vector<T>
.
Only the library maintainer can deal with this. One solution is to take the core functionality and "un-templatize" it. Turn it into a C-style data structure with void*
everywhere. Then, the template interface that downstream developers see is a thin wrapper. The reduces the amount of duplicated code because the template specializations all share a common basis.
I assume you mean runtime memory footprint, and thusly STL containers.
STL containers are efficient for what they are...general purpose containers. If you are deciding between writing your own doubly-linked list or using std::list, please...use STL. If you are considering writing very domain-specific, bit-packed containers for each of your specific needs, use STL first and then choose your battles once all your code is working correctly.
Some good practices:
- If your library is going to expose these containers via the API, you may have to choose between putting your STL code in library headers or not using STL. The problem is that my compiler doesn't have to implement the STL the same way yours did.
- Read up on how and when STL containers allocate memory. When you can visualize how a deque grows and shrinks compared to a vector, you will be better prepared to decide which to use.
- If you need to micro-manage every byte, consider writing custom allocators. This is rarely necessary outside of embedded systems.
The thing about the STL because it is all templates it only adds to the size when you actually use. If you don't use a method then that method is not instantiated.
But there will always be a cost for the things you use.
But the real question you have to ask. Is the size going to be bigger or smaller than yourt own implementation? If you don't use the STL what do you use? You can hand write your own but that has its own cost. It is not zero space, it will not be well tested and you will not be following the established best practices.
So in reality no it does not bloat your code.
Because to add the equivalent functionality by some other method will add just as much code it just will not be as well tested.
Other post states that template code must be in-lined or have multiple definitions.
This is absolutely WRONG.
The methods are marked as in-line. But it is up to the compiler if the method is actually in-lined or not. The compiler inlining is sophisticated enough to only in-line if this will help in the optimization stratergy being used.
If not in-lined then a copy of the method will be generated in every compilation unit that uses the method. But a REQUIREMENT for a C++ linker is that it must remove ALL but ONE copy of these methods when the application is linked into an executable. If the compiler did not remove the extra copies it would have to generate a multiple definition linker error.
This is easy to show:
The following illustrates that:
- The method _M_insert_aux() is not in-lined.
- It is placed in both compilation units.
- That only one copy of the method is in the final executable.
a.cpp
#include <vector>
void a(std::vector<int>& l)
{
l.push_back(1);
l.at(0) = 2;
}
b.cpp
#include <vector>
void b(std::vector<int>& l)
{
l.push_back(1);
l.at(0) = 2;
}
main.cpp
#include <vector>
void a(std::vector<int>&);
void b(std::vector<int>&);
int main()
{
std::vector<int> x;
a(x);
b(x);
}
Checking
>g++ -c a.cpp
>g++ -c b.cpp
>nm a.o
<removed other stuff>
000000a0 S __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
<removed other stuff>
>nm b.o
<removed other stuff>
000000a0 S __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
<removed other stuff>
>c++filt __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
std::vector<int, std::allocator<int> >::_M_insert_aux(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, int const&)
>g++ a.o b.o main.cpp
nm a.out | grep __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
00001700 T __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
Ignoring the STL discussion for the moment, there is one non-obvious important practice for building a low-space-footprint static library. Partition your library into as many disparate compilation units as you can. For example, if you look at libpthread.a
, you'll see that every single function has its own compilation unit. Many linkers will throw out dead code based on whole compilation units, but not any more fine-grained than that. If I only use a few functions from the pthreads library, my linker will bring in only those definitions, and nothing else. If, on the other hand, the entire library were compiled into a single object file, my linker would have to bring in the entire library as a single "unit".
This depends on the toolchain you're using, and it only applies if you're building static libraries. But I have seen it make a very measurable difference for large libraries built sub-optimally.
While it does not address STL templates, the documentation for GCC has short section on minimizing code bloat when using templates templates.
The link is http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Template-Instantiation.html#Template-Instantiation