views:

616

answers:

4

I am trying to construct a general tree. Are there any built in data structures in Python to implement a tree?

+4  A: 

There aren't trees built in, but you can easily construct one by subclassing a Node type from List and writing the traversal methods. If you do this, I've found bisect useful.

There are also many implementations on PyPi that you can browse.

If I remember correctly, the Python standard lib doesn't include tree data structures for the same reason that the .NET base class library doesn't: locality of memory is reduced, resulting in more cache misses. On modern processors it's usually faster to just bring a large chunk of memory into the cache, and "pointer rich" data structures negate the benefit.

fatcat1111
FYI: The interwebs are plastered with hatred against Boost. Apparently it's supposed to be a HUGE pain to deal with, especially since support for it has been discontinued. So I would recommend staying away from that
inspectorG4dget
Thanks. I haven't had any trouble personally, but I don't want to mislead so I've removed that reference.
fatcat1111
+5  A: 

Python doesn't have the quite the extensive range of "built-in" data structures as Java does. However, because Python is dynamic, a general tree is easy to create. For example, a binary tree might be:

class Tree(object):
    def __init__(self):
        self.left = None
        self.right = None
        self.data = None

You can use it like this:

root = Tree()
root.data = "root"
root.left = Tree()
root.left.data = "left"
root.right = Tree()
root.right.data = "right"
Greg Hewgill
This doesn't really explain much about making a useful tree implementation.
Mike Graham
A: 

What operations do you need? There is often a good solution in Python using a dict or a list with the bisect module.

There are many, many tree implementations on PyPI, and many tree types are nearly trivial to implement yourself in pure Python. However, this is rarely necessary.

Mike Graham
A: 

I've implemented trees using nested dicts. It is quite easy to do, and it has worked for me with pretty large data sets. I've posted a sample below, and you can see more at Google code

  def addBallotToTree(self, tree, ballotIndex, ballot=""):
    """Add one ballot to the tree.

    The root of the tree is a dictionary that has as keys the indicies of all 
    continuing and winning candidates.  For each candidate, the value is also
    a dictionary, and the keys of that dictionary include "n" and "bi".
    tree[c]["n"] is the number of ballots that rank candidate c first.
    tree[c]["bi"] is a list of ballot indices where the ballots rank c first.

    If candidate c is a winning candidate, then that portion of the tree is
    expanded to indicate the breakdown of the subsequently ranked candidates.
    In this situation, additional keys are added to the tree[c] dictionary
    corresponding to subsequently ranked candidates.
    tree[c]["n"] is the number of ballots that rank candidate c first.
    tree[c]["bi"] is a list of ballot indices where the ballots rank c first.
    tree[c][d]["n"] is the number of ballots that rank c first and d second.
    tree[c][d]["bi"] is a list of the corresponding ballot indices.

    Where the second ranked candidates is also a winner, then the tree is 
    expanded to the next level.  

    Losing candidates are ignored and treated as if they do not appear on the 
    ballots.  For example, tree[c][d]["n"] is the total number of ballots
    where candidate c is the first non-losing candidate, c is a winner, and
    d is the next non-losing candidate.  This will include the following
    ballots, where x represents a losing candidate:
    [c d]
    [x c d]
    [c x d]
    [x c x x d]

    During the count, the tree is dynamically updated as candidates change
    their status.  The parameter "tree" to this method may be the root of the
    tree or may be a sub-tree.
    """

    if ballot == "":
      # Add the complete ballot to the tree
      weight, ballot = self.b.getWeightedBallot(ballotIndex)
    else:
      # When ballot is not "", we are adding a truncated ballot to the tree,
      # because a higher-ranked candidate is a winner.
      weight = self.b.getWeight(ballotIndex)

    # Get the top choice among candidates still in the running
    # Note that we can't use Ballots.getTopChoiceFromWeightedBallot since
    # we are looking for the top choice over a truncated ballot.
    for c in ballot:
      if c in self.continuing | self.winners:
        break # c is the top choice so stop
    else:
      c = None # no candidates left on this ballot

    if c is None:
      # This will happen if the ballot contains only winning and losing
      # candidates.  The ballot index will not need to be transferred
      # again so it can be thrown away.
      return

    # Create space if necessary.
    if not tree.has_key(c):
      tree[c] = {}
      tree[c]["n"] = 0
      tree[c]["bi"] = []

    tree[c]["n"] += weight

    if c in self.winners:
      # Because candidate is a winner, a portion of the ballot goes to
      # the next candidate.  Pass on a truncated ballot so that the same
      # candidate doesn't get counted twice.
      i = ballot.index(c)
      ballot2 = ballot[i+1:]
      self.addBallotToTree(tree[c], ballotIndex, ballot2)
    else:
      # Candidate is in continuing so we stop here.
      tree[c]["bi"].append(ballotIndex)
Jeff