I am trying to construct a general tree. Are there any built in data structures in Python to implement a tree?
views:
616answers:
4How can I implement a tree in Python? Are there any built in data structures in Python like in Java?
There aren't trees built in, but you can easily construct one by subclassing a Node type from List and writing the traversal methods. If you do this, I've found bisect useful.
There are also many implementations on PyPi that you can browse.
If I remember correctly, the Python standard lib doesn't include tree data structures for the same reason that the .NET base class library doesn't: locality of memory is reduced, resulting in more cache misses. On modern processors it's usually faster to just bring a large chunk of memory into the cache, and "pointer rich" data structures negate the benefit.
Python doesn't have the quite the extensive range of "built-in" data structures as Java does. However, because Python is dynamic, a general tree is easy to create. For example, a binary tree might be:
class Tree(object):
def __init__(self):
self.left = None
self.right = None
self.data = None
You can use it like this:
root = Tree()
root.data = "root"
root.left = Tree()
root.left.data = "left"
root.right = Tree()
root.right.data = "right"
What operations do you need? There is often a good solution in Python using a dict or a list with the bisect module.
There are many, many tree implementations on PyPI, and many tree types are nearly trivial to implement yourself in pure Python. However, this is rarely necessary.
I've implemented trees using nested dicts. It is quite easy to do, and it has worked for me with pretty large data sets. I've posted a sample below, and you can see more at Google code
def addBallotToTree(self, tree, ballotIndex, ballot=""):
"""Add one ballot to the tree.
The root of the tree is a dictionary that has as keys the indicies of all
continuing and winning candidates. For each candidate, the value is also
a dictionary, and the keys of that dictionary include "n" and "bi".
tree[c]["n"] is the number of ballots that rank candidate c first.
tree[c]["bi"] is a list of ballot indices where the ballots rank c first.
If candidate c is a winning candidate, then that portion of the tree is
expanded to indicate the breakdown of the subsequently ranked candidates.
In this situation, additional keys are added to the tree[c] dictionary
corresponding to subsequently ranked candidates.
tree[c]["n"] is the number of ballots that rank candidate c first.
tree[c]["bi"] is a list of ballot indices where the ballots rank c first.
tree[c][d]["n"] is the number of ballots that rank c first and d second.
tree[c][d]["bi"] is a list of the corresponding ballot indices.
Where the second ranked candidates is also a winner, then the tree is
expanded to the next level.
Losing candidates are ignored and treated as if they do not appear on the
ballots. For example, tree[c][d]["n"] is the total number of ballots
where candidate c is the first non-losing candidate, c is a winner, and
d is the next non-losing candidate. This will include the following
ballots, where x represents a losing candidate:
[c d]
[x c d]
[c x d]
[x c x x d]
During the count, the tree is dynamically updated as candidates change
their status. The parameter "tree" to this method may be the root of the
tree or may be a sub-tree.
"""
if ballot == "":
# Add the complete ballot to the tree
weight, ballot = self.b.getWeightedBallot(ballotIndex)
else:
# When ballot is not "", we are adding a truncated ballot to the tree,
# because a higher-ranked candidate is a winner.
weight = self.b.getWeight(ballotIndex)
# Get the top choice among candidates still in the running
# Note that we can't use Ballots.getTopChoiceFromWeightedBallot since
# we are looking for the top choice over a truncated ballot.
for c in ballot:
if c in self.continuing | self.winners:
break # c is the top choice so stop
else:
c = None # no candidates left on this ballot
if c is None:
# This will happen if the ballot contains only winning and losing
# candidates. The ballot index will not need to be transferred
# again so it can be thrown away.
return
# Create space if necessary.
if not tree.has_key(c):
tree[c] = {}
tree[c]["n"] = 0
tree[c]["bi"] = []
tree[c]["n"] += weight
if c in self.winners:
# Because candidate is a winner, a portion of the ballot goes to
# the next candidate. Pass on a truncated ballot so that the same
# candidate doesn't get counted twice.
i = ballot.index(c)
ballot2 = ballot[i+1:]
self.addBallotToTree(tree[c], ballotIndex, ballot2)
else:
# Candidate is in continuing so we stop here.
tree[c]["bi"].append(ballotIndex)