Careless use of templates can cause bloat. One way to avoid that bloat is to have a thin typesafe template that wraps non-typesafe non-template code. To do this, the wrapper needs to provide some way for the non-template code to access things it knows nothing about.
For example, in a data structure, the wrapper defines the node structs. The unsafe code needs to read and write to the nodes, but must do so indirectly, through some kind of interface which is specified by the wrapper.
One way to implement this interface is to fill in a struct (defined by the unsafe code) with details such as function-pointers and constants determined by the wrapper. And one relevant kind of constant is the offset (within some structure) of a particular field. The unsafe code can use that offset (and some pointer arithmetic) to access that field directly.
This is getting problematic, though - as optimisers get more aggressive, this can lead to pointer alias issues. This is particularly the case if nodes can escape the library. For example, nodes may be extracted from a binary tree and relinked to form a linked list. Another example, annoyingly, happens when unit testing.
I currently have a container library written along these lines, and it causes none of these problems at present - but it will shortly. The reason it avoids these problems is because all the unit testing is applied to containers (not the underlying code), and because the nodes never escape the containers. That is, nodes are always accessed the same pointer-arithmetic way, so the pointer alias optimization issue never arises.
Unfortunately, I'm shortly going to need to allow nodes to be extracted from the containers, and I'm probably going to need unit tests on the underlying unsafe code as well.
Rather than deal with this specific library, I have a much simpler extract from an old binary tree library here that suffers the same problems. In VC++9 it just works. Using MinGW GCC 4.4.0, a debug build works, but a release build fails. The problem is a mixture of inlining and failure of the optimizer to spot pointer aliasses.
Just to be clear - I don't want to here "WTF - GOTO!!!" or whatever. The issue is resolving the optimization/pointer problem. Though if you can find a way to write Tree_To_List
that is properly structured and doesn't use hidden/disguised gotos to achieve it, I'm interested.
Also, a layer of template-based abstraction is missing (the template c_Bin_Tree_Tool doesn't do the whole job - c_Tool finishes the wrapping, but in a per-use way rather than in re-usable form. That's just a side-effect of extracting the relevant code.
What this code does is create an unbalanced binary tree by inserting nodes one-by-one, then balance that tree. The balancing works by converting the tree to a list (which in a way it already is), then converting the list back to a tree. The tree is dumped to stdio both before and after the balance.
bintree.h
...
inline void* Ptr_Add (void* p1, std::ptrdiff_t p2) { return (void*) (((std::ptrdiff_t) p1) + p2); }
struct c_Bin_Tree_Closure
{
typedef int (*c_Node_Comp) (void* p_Node1, void* p_Node2);
c_Node_Comp m_Node_Comp;
std::ptrdiff_t m_Left, m_Right;
};
class c_BT_Policy_Closure
{
private:
const c_Bin_Tree_Closure* m_Closure;
protected:
void** Left_Of (void* p) { return ((void**) Ptr_Add (p, m_Closure->m_Left )); }
void** Right_Of (void* p) { return ((void**) Ptr_Add (p, m_Closure->m_Right)); }
int Compare_Node (void* p_Node1, void* p_Node2) const
{
return m_Closure->m_Node_Comp (p_Node1, p_Node2);
}
public:
c_BT_Policy_Closure ()
{
m_Closure = 0;
}
void Set_Closure (const c_Bin_Tree_Closure& p_Closure)
{
m_Closure = &p_Closure;
}
};
template<class tc_Policy>
class c_Bin_Tree_Tool : public tc_Policy
{
public:
c_Bin_Tree_Tool ()
{
}
void *List_To_Tree (size_t p_Size, void* &p_List);
void Tree_To_List (void* p_Root, void* &p_First, void* &p_Last, size_t &p_Size);
void Balance (void* &p);
void Insert (void* &p_Tree, void* p_Node);
};
template<class tc_Policy>
void* c_Bin_Tree_Tool<tc_Policy>::List_To_Tree (size_t p_Size, void* &p_List)
{
if (p_Size == 0) return 0;
size_t l_Size = p_Size / 2;
void* l_Ptr = List_To_Tree (l_Size, p_List);
void* l_This = p_List;
p_List = *tc_Policy::Right_Of (l_This);
*tc_Policy::Left_Of (l_This) = l_Ptr;
l_Size = p_Size - (l_Size + 1);
*tc_Policy::Right_Of (l_This) = List_To_Tree (l_Size, p_List);
return l_This;
}
template<class tc_Policy>
void c_Bin_Tree_Tool<tc_Policy>::Tree_To_List (void* p_Root, void* &p_First, void* &p_Last, size_t &p_Size)
{
// Use left links as prev links and right links as next links
void* l_Start = 0; // first-item-in-list pointer
void* l_Prev = 0; // previous node in list
void** l_Prev_Ptr = &l_Start; // points to the next (ie right) pointer for the next node.
void* l_Pos = p_Root;
void* l_Next;
void* l_Parent = 0;
size_t l_Count = 0;
p_Last = 0;
TOP_OF_LOOP:
l_Next = *tc_Policy::Left_Of (l_Pos);
if (l_Next != 0)
{
*tc_Policy::Left_Of (l_Pos) = l_Parent; // So we can find our way back up the tree
l_Parent = l_Pos;
l_Pos = l_Next;
goto TOP_OF_LOOP;
}
AFTER_STEP_PARENT:
l_Next = *tc_Policy::Right_Of (l_Pos);
*tc_Policy::Left_Of (l_Pos) = l_Prev;
p_Last = l_Pos;
l_Prev = l_Pos;
*l_Prev_Ptr = l_Pos;
l_Prev_Ptr = tc_Policy::Right_Of (l_Pos);
l_Count++;
if (l_Next != 0)
{
l_Pos = l_Next;
goto TOP_OF_LOOP;
}
if (l_Parent != 0)
{
l_Pos = l_Parent;
l_Parent = *tc_Policy::Left_Of (l_Pos);
goto AFTER_STEP_PARENT;
}
*l_Prev_Ptr = 0;
p_First = l_Start;
p_Size = l_Count;
}
template<class tc_Policy>
void c_Bin_Tree_Tool<tc_Policy>::Balance (void* &p)
{
void *l_First, *l_Last;
size_t l_Count;
Tree_To_List (p, l_First, l_Last, l_Count);
p = List_To_Tree (l_Count, l_First);
}
template<class tc_Policy>
void c_Bin_Tree_Tool<tc_Policy>::Insert (void* &p_Tree, void* p_Node)
{
void** l_Tree = &p_Tree;
while (*l_Tree != 0)
{
int l_Compare = tc_Policy::Compare_Node (*l_Tree, p_Node);
l_Tree = ((l_Compare < 0) ? tc_Policy::Right_Of (*l_Tree) : tc_Policy::Left_Of (*l_Tree));
}
*l_Tree = p_Node;
*tc_Policy::Right_Of (p_Node) = 0;
*tc_Policy::Left_Of (p_Node) = 0;
};
test_bintree.cpp
...
#include <iostream>
#include "bintree.h"
struct c_Node
{
c_Node *m_Left, *m_Right;
int m_Data;
};
c_Node g_Node0001 = { 0, 0, 1 }; c_Node g_Node0002 = { 0, 0, 2 };
c_Node g_Node0003 = { 0, 0, 3 }; c_Node g_Node0004 = { 0, 0, 4 };
c_Node g_Node0005 = { 0, 0, 5 }; c_Node g_Node0006 = { 0, 0, 6 };
c_Node g_Node0007 = { 0, 0, 7 }; c_Node g_Node0008 = { 0, 0, 8 };
c_Node g_Node0009 = { 0, 0, 9 }; c_Node g_Node0010 = { 0, 0, 10 };
int Node_Compare (void* p1, void* p2)
{
return (((c_Node*) p1)->m_Data - ((c_Node*) p2)->m_Data);
}
c_Bin_Tree_Closure g_Closure =
{
(c_Bin_Tree_Closure::c_Node_Comp) Node_Compare,
offsetof (c_Node, m_Left ), offsetof (c_Node, m_Right)
};
class c_Tool : public c_Bin_Tree_Tool<c_BT_Policy_Closure>
{
protected:
typedef c_Bin_Tree_Tool<c_BT_Policy_Closure> c_Base;
public:
c_Tool () { Set_Closure (g_Closure); }
void Insert (c_Node* &p_Tree, c_Node* p_Node) { c_Base::Insert ((void*&) p_Tree, p_Node); }
void Balance (c_Node* &p) { c_Base::Balance ((void*&) p); }
};
void BT_Dump (size_t p_Depth, c_Node* p_Node)
{
if (p_Node != 0)
{
BT_Dump (p_Depth + 1, p_Node->m_Left);
for (size_t i = 0; i < p_Depth; i++) std::cout << " .";
std::cout << " " << p_Node->m_Data << std::endl;
BT_Dump (p_Depth + 1, p_Node->m_Right);
}
}
int main (int argc, char* argv[])
{
c_Tool l_Tool;
c_Node *l_Root = 0;
l_Tool.Insert (l_Root, &g_Node0001);
l_Tool.Insert (l_Root, &g_Node0002);
l_Tool.Insert (l_Root, &g_Node0003);
l_Tool.Insert (l_Root, &g_Node0004);
l_Tool.Insert (l_Root, &g_Node0005);
l_Tool.Insert (l_Root, &g_Node0006);
l_Tool.Insert (l_Root, &g_Node0007);
l_Tool.Insert (l_Root, &g_Node0008);
l_Tool.Insert (l_Root, &g_Node0009);
l_Tool.Insert (l_Root, &g_Node0010);
BT_Dump (0, l_Root); std::cout << "----------" << std::endl;
l_Tool.Balance (l_Root);
BT_Dump (0, l_Root);
return 0;
}
The expected results are...
1
. 2
. . 3
. . . 4
. . . . 5
. . . . . 6
. . . . . . 7
. . . . . . . 8
. . . . . . . . 9
. . . . . . . . . 10
----------
. . . 1
. . 2
. 3
. . . 4
. . 5
6
. . . 7
. . 8
. 9
. . 10
What actually happens (MinGW GCC 4.4.0, optimized release build)...
1
. 2
. . 3
. . . 4
. . . . 5
. . . . . 6
. . . . . . 7
. . . . . . . 8
. . . . . . . . 9
. . . . . . . . . 10
----------
1
As far as I can tell, the Balance operation runs correctly, but the BT_Dump function can't see all the changes to the m_Left
and m_Right
fields.
EDIT That's wrong - otherwise why am I seeing node 1 as the new root. I guess that's what happens when you rely too much on memory of an investigation that was done months ago.
EDIT Actually, node 1 as root is an issue, but since it was the old root - well, best to just ignore this what-the-problem-is stuff and work out your own theories ;-)
There are a number of issues, in terms of being standards-undefined, in the code. I think the biggest problem is that the links in the node struct are c_Node*, but since the unsafe code knows nothing about c_Node it accesses them (via pointer arithmetic) as void*.
One fix would be for the unsafe code to do all reads and writes through getter and setter function pointers, avoiding all the pointer arithmetic and ensuring that all access to c_Node instances is done through c_Node* pointers. Even better - the interface becomes a class with getter/setter methods etc. In the complete binary tree library, I have alternate policy classes that do this, though to be honest my real fix when the problem arose was to throw all the code in a "junk" folder on the basis that I rarely use it, and should probably be using boost intrusive lists anyway.
However, this still leaves the other much more complex and heavily used container library, which is not going away. I think I'm just going to have to do the very painful refactoring to get rid of the offsetof and pointer arithmetic stuff, but...
What exactly are the C++ rules - when precisely is the compiler allowed to fail to spot a possible pointer alias? And could the binary tree code above be rewritten so that it still uses pointer arithmetic, still allows the nodes to be accessed/modified both inside and outside the library, and yet is safe from this optimization issue?