I don't know if this is the place to ask about algorithms. But let's see if I get any answers ... :)
If anything is unclear I'm very happy to clarify things.
I just implemented a Trie in python. However, one bit seemed to be more complicated than it ought to (as someone who loves simplicity). Perhaps someone has had a similar problem?
My aim was to minimize the number of nodes by storing the largest common prefix of a sub-trie in its root. For example, if we had the words stackoverflow, stackbase and stackbased, then the tree would look something like this:
[s]tack
[o]verflow ______/ \_______ [b]ase
\___ [d]
Note that one can still think of the edges having one character (the first one of the child node).
Find-query is simple to implement. Insertion is not hard, but somewhat more complex than I want.. :(
My idea was to insert the keys one after the other (starting from an empty trie), by first searching for the to-be-inserted key k (Find(k)), and then rearranging/splitting the nodes locally at the place where the find-procedure stops. There turn out to be 4 cases: (Let k be the key we want to insert, and k' be the key of the node, where the search ended)
- k is identical to k'
- k is a "proper" prefix of k'
- k' is a "proper" prefix of k
- k and k' share some common prefix, but none of the cases (1), (2) or (3) occur.
It seems that each of the cases are unique and thus imply different modifications of the Trie. BUT: is it really that complex? Am I missing something? Is there a better approach?
Thanks :)