views:

156

answers:

5

If I'd like to know how a function written in like standard C++ library work (not just the MSDN description). I mean how does it allocate, manage, deallocate memory and return you the result. where or what do you need to know to understand that?

+2  A: 

You can look at the library headers. A lot of functionality is actually implemented there because the library is highly templatized (and templates generally need to be implemented in headers). The location of the headers depends on the compiler, but you should be able to find them quite easily (e.g. search for a file named algorithm).

You may also ask the compiler to preprocess your code to see all the related headers (this will produce extremely long output). With GCC you can do this by g++ -E yoursource.cc.

If what you are looking for isn't implemented in headers, you need the library sources, which are generally not installed by default and which are not even available for commercial compilers such as MSVC. Look for glibc (C library) and libstdc++ (C++ library), which are the ones used by GCC and some other compilers.

In any case, notice that the standard library implementations tend to be rather cryptic due to a lot of underscores being used in variable names and such (to avoid name collisions with user's macros), and often they are also infested with #ifdefs and other preprocessor cruft.

Tronic
+2  A: 

You need to know the techniques used to write C++ libraries. Getting Bjarne Stroustrup's book is a good start. Also, SGI has very detailed documentation on the STL at a suitably high level of abstraction.

If you are going to be investigating the windows based stuff you might want to study the systems part of the windows library.

To complement windows: understanding the Posix specification is also important.

Hassan Syed
A: 

I haven't this book, but according to its description, http://www.amazon.com/C-Standard-Template-Library/dp/0134376331 includes

-Practical techniques for using and implementing the component

Isn't this what you want?

AProgrammer
A: 

First a few basic data-structure principles, then a note and some links about allocators...

The STL containers use a number of different data structures. The map, set, multimap and multiset are normally implemented as binary trees with red-black balancing rules, for example, and deque is possibly (more impression than knowledge) a circular queue in an array, exploiting an array-doubling or similar growth pattern.

None of the data structures are actually defined by the standard - but the specified performance characteristics limit the choices significantly.

Normally, your contained data is contained directly in the data structure nodes, which are held (by default) in heap allocated memory. You can override the source of memory for nodes by providing an allocator template parameter when you specify the container - more on that later. If you need the container nodes to reference (not contain) your items, specify a pointer or smart-pointer type as the contained type.

For example, in an std::set, the nodes will be binary tree nodes with space in them for an int and the two child pointers, and the metadata that the library needs (e.g. the red/black flag). The binary tree node will not move around your applications address-space, so you can store pointers to your data item elsewhere if you want, but that isn't true for all containers - e.g. an insert in a vector moves all items above the insert point up by one, and may have to reallocate the whole vector, moving all items.

The container class instance is normally very small - a few pointers is typical. For example, the std::set etc usually have a root pointer, a pointer to the lowest-key node and a pointer to the highest-key node, and probably a bit more metadata.

One issue the STL faces is creating and destroying instances in multi-item nodes without creating/destroying the node. This happens in std::vector and std::deque, for instance. I don't know, strictly, how the STL does it - but the obvious approach requires placement new and explicit destructor calls.

Placement new allows you to create an object in an already-allocated piece of memory. It basically calls the constructor for you. It can take parameters, so it can call a copy constructor or other constructor, not just the default constructor.

To destruct, you literally call the destructor explicitly, via a (correctly typed) pointer.

((mytype*) (void*) x)->~mytype ();

This works if you haven't declared an explicit constructor, and even for built-in types like "int" that don't need destructing.

Likewise, to assign from one constructed instance to another, you make an explicit call to operator=.

Basically, the containers are able to create, copy and destroy data within an existing node fairly easily, and where needed, metadata tracks which items are currently constructed in the node - e.g. size() indicates which items are currently constructed in an std::vector - there may be additional non-constructed items, depending on the current capacity().

EDIT - It's possible that the STL can optimise by using (directly, or in effect) std::swap rather than operator= to move data around. This would be good where the data items are (for example) other STL containers, and thus own lots of referenced data - swapping could avoid lots of copying. I don't know if the standard requires this, or allows but doesn't mandate it. There is a well-known mechanism for doing this kind of thing, though, using a "traits" template. The default "traits" can provide an assignment-using method whereas specific overrides may support special-case types by using a swapping method. The abstraction would be a move where you don't care what is left in the source (original data, data from target, whatever) as long as it's valid and destructible.

In binary tree nodes, of course, there should be no need for this as there is only one item per node and it's always constructed.

The remaining problem is how to reserve correctly-aligned and correctly-sized space within a node struct to hold an unknown type (specified as a template parameter) without getting unwanted constructor/destructor calls when you create/destroy the node. This will get easier in C++0x, since a union will be able to hold non-POD types, giving a convenient uninitialised-space type. Until then, there's a range of tricks that more-or-less work with different degrees of portability, and no doubt a good STL implementation is a good example to learn from.

Personally, my containers use a space-for-type template class. It uses compiler-specific allocation checks to determine the alignment at compile-time and some template trickery to choose from an array-of-chars, array-of-shorts, array-of-longs etc of the correct size. The non-portable alignment-checking tricks are selected using "#if defined" etc, and the template will fail (at compile time) when someone throws a 128-bit alignment requirement at it because I haven't allowed for that yet.

How to actually allocate the nodes? Well, most (all?) STL containers take an "Allocator" parameter, which is defaulted to "allocator". That standard implementation gets memory from and releases it to the heap. Implement the right interface and it can be replaced with a custom allocator.

Doing that is something I don't like to do, and certainly not without Stroustrups "The C++ Programming Language" on my desk. There's a lot of requirements to meet in your allocator class, and at least in the past (things may have improved), compiler error messages were not helpful.

Google says you could look here, though...

Steve314
A: 

Operating system functions to allocate/free memory are not really relevant to the C++ standard library.

The standard library containers will (by default) use new and delete for memory, and that uses a compiler-specific runtime which almost certainly manages its own heap data structure. This approach is generally more appropriate for typical applications use, where the platform-specific operating system heap is usually more appropriate for allocating large blocks.

The application heap will allocate/free memory from the operating system heap, but "how?" and "when?" are platform-specific and compiler-specific details.

For the Win32 memory management APIs, look here...

http://msdn.microsoft.com/en-us/library/ms810603.aspx

I'm sure you can find win64 equivalents if needed.

Steve314