I get the concept behind a trie. But I get a little confused when it comes to implementation.
The most obvious way I could think to structure a Trie type would be to have a Trie maintain an internal Dictionary<char, Trie>. I have in fact written one this way, and it works, but... this seems like overkill. My impression is that a trie should be lightweight, and having a separate Dictionary<char, Trie> for every node does not seem very lightweight to me.
Is there a more appropriate way to implement this structure that I'm missing?
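For concreteness, here is a minimal sketch of the dictionary-per-node approach described above. It is not the actual code: the names DictionaryTrie, _children, _isWord and Contains are placeholders chosen for illustration, and it assumes the trie only needs to store and look up whole words.

```csharp
using System.Collections.Generic;

// Hypothetical baseline: every node owns its own Dictionary<char, DictionaryTrie>.
class DictionaryTrie
{
    private readonly Dictionary<char, DictionaryTrie> _children =
        new Dictionary<char, DictionaryTrie>();
    private bool _isWord; // true when a stored word ends at this node

    public void Add(string word)
    {
        DictionaryTrie node = this;
        foreach (char c in word)
        {
            DictionaryTrie child;
            if (!node._children.TryGetValue(c, out child))
            {
                child = new DictionaryTrie();
                node._children.Add(c, child);
            }
            node = child;
        }
        node._isWord = true;
    }

    public bool Contains(string word)
    {
        DictionaryTrie node = this;
        foreach (char c in word)
        {
            // On a miss, node becomes null and we return immediately.
            if (!node._children.TryGetValue(c, out node))
                return false;
        }
        return node._isWord;
    }
}
```

The weight concern is visible here: even a leaf node with no children allocates a full dictionary in its field initializer.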
UPDATE: OK! Based on the very helpful input from Jon and leppie, this is what I've come up with so far:
(1) I have the Trie type, which has a private _nodes member of type Trie.INodeCollection.
(2) The Trie.INodeCollection interface has the following members:
interface INodeCollection
{
    bool TryGetNode(char key, out Trie node);
    INodeCollection Add(char key, Trie node);
    IEnumerable<Trie> GetNodes();
}
(3) There are three implementations of this interface:
class SingleNode : INodeCollection
{
    internal readonly char _key;
    internal readonly Trie _trie;

    public SingleNode(char key, Trie trie)
    { /*...*/ }

    // Add returns a SmallNodeCollection.
}

class SmallNodeCollection : INodeCollection
{
    const int MaximumSize = 8; // ?

    internal readonly List<KeyValuePair<char, Trie>> _nodes;

    public SmallNodeCollection(SingleNode node, char key, Trie trie)
    { /*...*/ }

    // Add adds to the list and returns the current instance until MaximumSize,
    // after which point it returns a LargeNodeCollection.
}

class LargeNodeCollection : INodeCollection
{
    private readonly Dictionary<char, Trie> _nodes;

    public LargeNodeCollection(SmallNodeCollection nodes, char key, Trie trie)
    { /*...*/ }

    // Add adds to the dictionary and returns the current instance.
}
(4) When a Trie is first constructed, its _nodes member is null. The first call to Add creates a SingleNode, and subsequent calls to Add go from there, according to the steps described above.
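To make steps (1) through (4) concrete, here is one way the pieces could fit together. This is only a sketch under assumptions: the constructor bodies, the _isWord flag, and the Trie.Add(string), Contains and GetOrAddChild members are filled in for illustration, the types are top-level rather than nested in Trie to keep it short, and the threshold of 8 is the untuned guess from above.

```csharp
using System.Collections.Generic;

interface INodeCollection
{
    bool TryGetNode(char key, out Trie node);
    INodeCollection Add(char key, Trie node); // returns the collection to keep using
    IEnumerable<Trie> GetNodes();
}

sealed class SingleNode : INodeCollection
{
    internal readonly char _key;
    internal readonly Trie _trie;
    public SingleNode(char key, Trie trie) { _key = key; _trie = trie; }
    public bool TryGetNode(char key, out Trie node)
    {
        node = key == _key ? _trie : null;
        return node != null;
    }
    // A second child promotes this node to a small list.
    public INodeCollection Add(char key, Trie node) => new SmallNodeCollection(this, key, node);
    public IEnumerable<Trie> GetNodes() { yield return _trie; }
}

sealed class SmallNodeCollection : INodeCollection
{
    const int MaximumSize = 8; // assumed threshold, not measured
    internal readonly List<KeyValuePair<char, Trie>> _nodes = new List<KeyValuePair<char, Trie>>(2);
    public SmallNodeCollection(SingleNode node, char key, Trie trie)
    {
        _nodes.Add(new KeyValuePair<char, Trie>(node._key, node._trie));
        _nodes.Add(new KeyValuePair<char, Trie>(key, trie));
    }
    public bool TryGetNode(char key, out Trie node)
    {
        foreach (var pair in _nodes) // linear scan over at most MaximumSize entries
            if (pair.Key == key) { node = pair.Value; return true; }
        node = null;
        return false;
    }
    public INodeCollection Add(char key, Trie node)
    {
        if (_nodes.Count >= MaximumSize) return new LargeNodeCollection(this, key, node);
        _nodes.Add(new KeyValuePair<char, Trie>(key, node));
        return this;
    }
    public IEnumerable<Trie> GetNodes()
    {
        foreach (var pair in _nodes) yield return pair.Value;
    }
}

sealed class LargeNodeCollection : INodeCollection
{
    private readonly Dictionary<char, Trie> _nodes = new Dictionary<char, Trie>();
    public LargeNodeCollection(SmallNodeCollection nodes, char key, Trie trie)
    {
        foreach (var pair in nodes._nodes) _nodes.Add(pair.Key, pair.Value);
        _nodes.Add(key, trie);
    }
    public bool TryGetNode(char key, out Trie node) => _nodes.TryGetValue(key, out node);
    public INodeCollection Add(char key, Trie node) { _nodes.Add(key, node); return this; }
    public IEnumerable<Trie> GetNodes() => _nodes.Values;
}

class Trie
{
    private INodeCollection _nodes; // null until the first child is added
    private bool _isWord;           // hypothetical end-of-word marker

    public void Add(string word)
    {
        Trie current = this;
        foreach (char c in word) current = current.GetOrAddChild(c);
        current._isWord = true;
    }
    public bool Contains(string word)
    {
        Trie current = this;
        foreach (char c in word)
            if (current._nodes == null || !current._nodes.TryGetNode(c, out current)) return false;
        return current._isWord;
    }
    private Trie GetOrAddChild(char key)
    {
        Trie child;
        if (_nodes != null && _nodes.TryGetNode(key, out child)) return child;
        child = new Trie();
        // Store whatever Add returns, since it may be a promoted collection.
        _nodes = _nodes == null ? new SingleNode(key, child) : _nodes.Add(key, child);
        return child;
    }
}
```

The one rule callers have to follow is to always reassign _nodes from the return value of Add. Keeping that reassignment inside a single private helper, as in GetOrAddChild here, is what stops the promotion logic from leaking into the rest of the Trie.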
Does this make sense? This feels like an improvement in the sense that it somewhat reduces the "bulkiness" of a Trie (nodes are no longer full-blown Dictionary<char, Trie> objects until they have a sufficient number of children). However, it has also become significantly more complex. Is it too convoluted? Have I taken a complicated route to achieve something that should've been straightforward?