views:

98

answers:

1

Looking around for python implementations of tries just so that I can understand what they are and how they work, I came across Justin Peel's patricia trie and found it very instructive: it's straightforward enough for one as new as I am to play around with it and learn from it.

However there's something I think I'm not understanding:

using Justin's class patricia() thus:

>>> p = patricia()
>>> words = ['foo','bar','baz']
>>> for x in words:
...     p.addWord(x)

I get a trie as a dictionary looking like this:

>>> p._d
{'b': ['a', {'r': ['', {}], 'z': ['', {}]}], 'f': ['oo', {}]}

addWord() and isWord() work as expected, but isPrefix() shows the following behavior which puzzles me:

>>> p.isPrefix('b')
True
>>> p.isPrefix('f')
True
>>> p.isPrefix('e')
False

good, as expected; and then

>>> p.isPrefix('ba')
True

also good, but then:

>>> p.isPrefix('bal')
True

and furthermore:

>>> p.isPrefix('ballance')
True
>>> p.isPrefix('ballancing act')
True

Something here I'm not understanding?

+2  A: 

I believe the bug is in the following snippet of the code you're looking at:

       if w.startswith(node[0][:wlen-i],i):
            if wlen - i > len(node[0]):
                i += len(node[0])
                d = node[1]
            return True

it should actually be:

       if w.startswith(node[0][:wlen-i],i):
            if wlen - i > len(node[0]):
                i += len(node[0])
                d = node[1]
            else:
                return True
Alex Martelli
You're quite right, Alex. This looks like precisely the bug. I did warn that it wasn't tested very well. It was just something I scratched up and haven't really used since then. I fixed the code in my original posted answer.
Justin Peel
@Justin, great, thanks for confirming this!
Alex Martelli
Alex, Thanks very much for finding that, now it seems to work as expected. And thanks Justin for building this. Having access to the root as a dictionary I found very instructive: I think in outlines, indented text (this is the result of training in the humanities, not experience with python; however, this was exactly what drew me to python in the first place), so being able to prettyprint the trie really helped illustrate what was going on. I hope you'll post any further work you do on this: benchmark testing or whatever. I'd also love to hear any ideas on potential uses for Patricia tries.
jjon
@jjon, you're welcome! BTW, a nice article on Radix Trees in general (with some mention of applications) is at http://en.wikipedia.org/wiki/Radix_tree
Alex Martelli