I'd like to take a string such as this:

[One, Two[A, B[i, ii, iii, iv], C], Three]

And convert it into a hierarchy of lists, so that if I execute code such as the following:

Console.Write(myList[1][1][2]);

The output will be:

iii

I'm hoping that this is a common enough requirement that there's some simple parsing code written in C# for this.

Let me know if my question could be phrased more clearly.

A: 

Are you after arrays or lists?

This would be extremely difficult to do with plain strings, as you have to deal with spaces, commas inside an element, and so on.

If you have control as to what is in this list, I suggest that you look into XML or binary serialization, which have libraries to help you do this.
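
As a rough illustration of the XML route, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The element names `list` and `item`, the `name` attribute, and the helper `to_nested` are made up for this example; they are not part of any standard format.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: <list> holds children, <item> holds one string.
XML_TEXT = """
<list>
  <item>One</item>
  <list name="Two">
    <item>A</item>
    <list name="B">
      <item>i</item><item>ii</item><item>iii</item><item>iv</item>
    </list>
    <item>C</item>
  </list>
  <item>Three</item>
</list>
"""

def to_nested(element):
    """Turn <list> elements into Python lists and <item> elements into strings."""
    return [
        to_nested(child) if child.tag == 'list' else (child.text or '').strip()
        for child in element
    ]

nested = to_nested(ET.fromstring(XML_TEXT.strip()))
print(nested[1][1][2])  # prints iii
```

Because the XML parser handles escaping, an element containing a comma or leading spaces needs no special treatment.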

Spence
+4  A: 

XML or JSON are excellent methods to store things like this.

As Spence said--this is a hard problem--I don't recommend rolling your own.

Scroll down to the bottom of that JSON link for implementations in most languages.
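
For a sense of what that looks like, here is a small Python sketch using the standard json module. It assumes you can store the data as nested JSON arrays instead of the bracket format from the question; the variable names are just for illustration.

```python
import json

# The hierarchy from the question, written as nested JSON arrays.
JSON_TEXT = '["One", ["A", ["i", "ii", "iii", "iv"], "C"], "Three"]'

my_list = json.loads(JSON_TEXT)
print(my_list[1][1][2])  # prints iii
```

Since every element is a quoted string, commas and spaces inside an element are no longer ambiguous.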

Michael Haren
JSON sounds like a better idea the more I think about it. I've found a nice-looking JSON parser for .NET. Hopefully it does the job for what I need. http://james.newtonking.com/archive/2008/08/25/json-net-3-0-released.aspx
jonathanconway
A: 

It's not a practical answer, but if you are able to use the .NET 4.0 beta, you could look into Oslo (and subsequent tooling), which Microsoft is developing for textual DSLs and which seems to be exactly what you need.

casperOne
+2  A: 

I'd have to go with a regular expression. Substring matches and sub-expressions can give you the recursion to get the sub-sub-... levels in.

Use something like /^\[(.+)\]$/ (preg syntax) to capture the contents of a single level. Repeat on each captured level until nothing more matches, then split the contents of each level on ','.

The result should come out like:

  • [One, Two[A, B[i, ii, iii, iv], C], Three]
    • One
    • Two
    • [A, B[i, ii, iii, iv], C]
      • A
      • B
      • [i, ii, iii, iv]
        • i
        • ii
        • iii
        • iv
      • C
    • Three

Finally trim off the left/right spaces to get your polished result.
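
The steps above could be sketched in Python roughly as follows. This is one possible reading, not a tested port of the preg approach: the regex peels one bracket level, but the split on ',' is swapped for a depth-aware split, since a plain split would also cut through the commas of nested levels. The function names `split_top_level` and `parse_level` are made up for this example.

```python
import re

def split_top_level(text):
    """Split on commas that are not nested inside brackets, trimming spaces."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == '[':
            depth += 1
        elif ch == ']':
            depth -= 1
        elif ch == ',' and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return [p.strip() for p in parts]

def parse_level(text):
    """Peel one bracket level with a regex, then recurse into nested items."""
    match = re.fullmatch(r'\[(.*)\]', text.strip(), re.DOTALL)
    if match is None:
        raise ValueError('expected a bracketed list: %r' % text)
    result = []
    for item in split_top_level(match.group(1)):
        label, bracket, rest = item.partition('[')
        if bracket:
            # 'Two[A, ...]' becomes the label 'Two' followed by its sub-list.
            if label.strip():
                result.append(label.strip())
            result.append(parse_level('[' + rest))
        else:
            result.append(item)
    return result

tree = parse_level('[One, Two[A, B[i, ii, iii, iv], C], Three]')
print(tree[2][2][2])  # prints iii
```

Note that this keeps the labels Two and B as siblings of their sub-lists, as in the outline above, so iii ends up at [2][2][2] instead of [1][1][2].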

Xedecimal
A: 

My vote is also for XML or JSON or another format if you have the ability to control the format. But lacking that, here's a Python implementation of the parser because I was bored.

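class ExprParser(object):
    """Parse a string like '[One, Two[A, B[i, ii, iii, iv], C], Three]'.

    Text directly before a '[' (such as 'Two' or 'B') is discarded, so that
    sample yields ['One', ['A', ['i', 'ii', 'iii', 'iv'], 'C'], 'Three'].
    """

    def __init__(self):
        # Per-instance state; class-level lists would be shared by all parsers.
        self.current = []
        self.list_stack = []

    def parse(self, text):
        # Split on every comma, then let the brackets in each atom drive nesting.
        for atom in [s.strip() for s in text.split(',')]:
            self.parse_atom(atom)
        return self.current

    def do_pushes(self, atom):
        """ Strip off the '[' and push new lists """
        i = 0
        while i < len(atom) and atom[i] == '[':
            self.push()
            i += 1
        return atom[i:]

    def do_pops(self, atom):
        """ Pop the lists """
        i = 0
        while i < len(atom) and atom[i] == ']':
            self.pop()
            i += 1

    def parse_atom(self, atom):
        push_start = atom.find('[')

        # Anything before the first '[' is dropped here.
        rest = self.do_pushes(atom[push_start:]) if push_start >= 0 else atom

        pop_start = rest.find(']')

        val = rest[:pop_start] if pop_start >= 0 else rest

        self.add(val)

        if pop_start >= 0:
            self.do_pops(rest[pop_start:])

    def push(self):
        # Start a new list and remember it on the stack.
        self.current = []
        self.list_stack.append(self.current)

    def pop(self):
        # Close the innermost list and attach it to its parent, if there is one.
        done = self.list_stack.pop()
        self.current = self.list_stack[-1] if self.list_stack else done
        if self.current is not done:
            self.add(done)

    def add(self, val):
        self.current.append(val)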

Use like:

parser = ExprParser()
parser.parse('[One, Two[A, B[i, ii, iii, iv], C], Three]')

No error handling though for malformed input.
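
One cheap guard, sketched here under the assumption that unbalanced brackets are the malformed input you care about most, is to check the brackets before parsing. The helper name `check_brackets` is made up for this example.

```python
def check_brackets(text):
    """Return text unchanged if its brackets balance, else raise ValueError."""
    depth = 0
    for position, ch in enumerate(text):
        if ch == '[':
            depth += 1
        elif ch == ']':
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ']' at position %d" % position)
    if depth != 0:
        raise ValueError("%d unclosed '['" % depth)
    return text

print(check_brackets('[One, Two[A, B], Three]'))  # prints the input unchanged

try:
    check_brackets('[One, Two[A, B, Three]')
except ValueError as error:
    print(error)  # prints 1 unclosed '['
```

It does not catch everything (empty elements or stray text after a ']' still get through), but it rejects the inputs most likely to leave the list stack in a bad state.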