views:

109

answers:

3
+1  A: 

Keep track of the current "depth" you're parsing at.

  • If the depth of the next line is more than the current depth, recursively call the parser with the new depth, then add the result from that call to the current list.
  • If the depth of the next line is equal to the current depth, add it to the current list.
  • If the depth of the next line is less than the current depth, return the current list.
Amber
+5  A: 

In the view of search algorithm, the bullet you give is actually a sequence generated by Depth-First-Search. So my strategy is just to rebuild the tree structure with the dfs-sequence.

Following is the python code:

from collections import deque
def dfsBullet(bullet,depth):
    """
       parse the subtree with depth and the startnode of bullet[0]
    """
    li = []
    if depth != 0:
            item = bullet.popleft()
            li.append(item.split(' ',1)[1])
    while (len(bullet) != 0):
            item = bullet[0]
            #apply same algo to the child node
            if len(item.split(' ',1)[0]) > depth:
                    sublist = dfsBullet(bullet, len(item.split(' ')[0]))
            #we have traverse all childnode, so go back 
            else:
                    return li
            #add child tree to the list
            li.append(sublist)
    return li
xiao
+3  A: 

I can't parse your desired result -- it seems to have more open parentheses than corresponding closed ones and I don't understand the logic behind it.

To make a tree structure explicit, what about, e.g.:

data = '''* list item 1
* list item 2
** list item 3
** list item 4
*** list item 5
* list item 6'''.splitlines()

class Node(object):
  def __init__(self, payload):
    self.payload = payload
    self.children = []
  def show(self, indent):
    print ' '*indent, self.payload
    for c in self.children:
      c.show(indent+2)

def makenest(linelist):
  rootnode = Node(None)
  stack = [(rootnode, 0)]
  for line in linelist:
    for i, c in enumerate(line):
      if c != '*': break
    stars, payload = line[:i], line[i:].strip()
    curlev = len(stars)
    curnod = Node(payload)
    while True:
      parent, level = stack[-1]
      if curlev > level: break
      del stack[-1]
    # a child node of the current top-of-stack
    parent.children.append(curnod)
    stack.append((curnod, curlev))
  rootnode.show(0)

makenest(data)

The show method of course exists just for the purpose of verifying that the part about parsing the strings and creating the tree has worked correctly. If you can specify more precisely exactly how it is that you want to transform your tree into nested tuples and lists, I'm sure it will be easy to add to class Node the appropriate (and probably recursive) method -- so, could you please give this missing specification...?

Edit: since the OP has clarified now, it does, as predicted, become easy to satisfy the spec. Just add to class Node the following method:

  def emit(self):
    if self.children:
      return (self.payload,
              [c.emit() for c in self.children])
    else:
      return (self.payload,)

and change the last three lines of the code (last one of makenest, a blank one, and the module-level call to makenest) to:

  return [c.emit() for c in rootnode.children]

print(makenest(data))

(The parentheses after print are redundant but innocuous in Python 2, required in Python 3, so I put them there just in case;-).

With these tiny changes, my code runs as requested, now emitting

[('list item 1',), ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])]), ('list item 6',)]
Alex Martelli
Thanks for your answer, Alex. As requested, I've edited my question with a more correct example of what I'm trying to accomplish. Let me know if that clears things up.
ddbeck
Thanks for this response. This is the method I ended up going with and it's working out great. Thank you!
ddbeck