If I have a series of strings that have this base format:

"[id value]" // id and value are space-delimited; id will never contain spaces

They can then be nested like this:

[a]
[a [b value]]
[a [b [c value]]]

So every item can have 0 or 1 value entries.

What is the best approach to go about parsing this format? Do I just use stuff like string.Split() or string.IndexOf() or are there better methods?

+2  A: 

A little recursion plus Split would work. The main point is to use recursion; it makes this much easier. Your input syntax looks kind of like LISP :)

Parsing [a]: split it, there is no second part, so you are done.
Parsing [a [b value]]: there is a second part, so parse that part the same way, starting again from the beginning.
...

You get the idea.
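
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the `Node` type and the `BracketParser.Parse` name are made up for this example, and it assumes the input is well formed apart from the basic bracket check.

```csharp
using System;

// Hypothetical node type for illustration: an id plus an optional value,
// where the value is either plain text or a nested item.
public sealed class Node
{
    public string Id;
    public string Text;   // set when the value is plain text
    public Node Child;    // set when the value is a nested [id ...] item
}

public static class BracketParser
{
    public static Node Parse(string input)
    {
        string s = input.Trim();
        if (s.Length < 2 || s[0] != '[' || s[s.Length - 1] != ']')
            throw new FormatException("Expected [id value]: " + input);

        // Strip the outer brackets, then split on the first space only,
        // so a plain value may itself contain spaces.
        string inner = s.Substring(1, s.Length - 2).Trim();
        string[] parts = inner.Split(new[] { ' ' }, 2);

        var node = new Node { Id = parts[0] };
        if (parts.Length == 2)
        {
            string rest = parts[1].Trim();
            if (rest.StartsWith("["))
                node.Child = Parse(rest);   // nested item: recurse
            else
                node.Text = rest;           // plain value: done
        }
        return node;
    }
}
```

Calling `BracketParser.Parse("[a [b value]]")` would give a node with id `a` whose child has id `b` and text `value`. An unbalanced string ends in a `FormatException` rather than a silently wrong result.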

dutt
A: 

A simple split should work. Every id is preceded by exactly one opening bracket '['.
So when you split the string on '[', a string with n brackets gives you n non-empty pieces, one per id, and the last piece also contains the value (if there is one).
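
A rough sketch of that idea, in case it helps. The `SplitParser.ParseIds` name is invented for this example, and it assumes the nesting is a straight chain and that values never contain brackets; it does not check that the brackets are balanced.

```csharp
using System;
using System.Collections.Generic;

public static class SplitParser
{
    // Returns the ids from outermost to innermost; 'value' receives the
    // innermost plain-text value, or null when there is none.
    public static List<string> ParseIds(string input, out string value)
    {
        value = null;
        var ids = new List<string>();

        // Drop the closing brackets, then cut at every opening bracket.
        string[] pieces = input.Trim().TrimEnd(']')
            .Split(new[] { '[' }, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < pieces.Length; i++)
        {
            string piece = pieces[i].Trim();
            if (i < pieces.Length - 1)
            {
                ids.Add(piece);           // outer pieces hold only an id
            }
            else
            {
                // The last piece is "id" or "id value".
                string[] pair = piece.Split(new[] { ' ' }, 2);
                ids.Add(pair[0]);
                if (pair.Length == 2) value = pair[1].Trim();
            }
        }
        return ids;
    }
}
```

For `"[a [b value]]"` this yields the ids `a` and `b` and the value `value`; for `"[a]"` it yields the single id `a` and leaves the value null.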

Myra
+1  A: 

Regex is always a nice solution.

    using System.Text.RegularExpressions;

    string test = "[a [b [c value]]]";
    // The greedy .* runs to the last ']', so one match peels off the outermost item.
    Regex r = new Regex(@"\[(?<id>[A-Za-z]*) (?<value>.*)\]");
    var res = r.Match(test);

Then you can get the value (which is [b [c value]] after the first iteration) and apply the same pattern again until the match fails.

    string id = res.Groups["id"].Value;
    string value = res.Groups["value"].Value;
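
The "apply the same again until the match fails" step might look something like this. It is a sketch only: `RegexParser.Unwrap` is a made-up name, and the pattern is anchored with `^` and `$` here, which is an addition to the pattern above. Note that an item without a value, such as `[a]`, does not match because the pattern requires a space after the id.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexParser
{
    // Same pattern as above, anchored so it must cover the whole string.
    private static readonly Regex Item =
        new Regex(@"^\[(?<id>[A-Za-z]*) (?<value>.*)\]$");

    // Peels off one [id value] layer per iteration. Returns the ids that
    // were seen; 'rest' is whatever remains when the pattern stops matching.
    public static List<string> Unwrap(string input, out string rest)
    {
        var ids = new List<string>();
        rest = input;
        Match m = Item.Match(rest);
        while (m.Success)
        {
            ids.Add(m.Groups["id"].Value);
            rest = m.Groups["value"].Value;
            m = Item.Match(rest);
        }
        return ids;
    }
}
```

With `"[a [b [c value]]]"` the loop collects `a`, `b`, `c` and leaves `value` in `rest`.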
testalino
Regex is not always a nice solution. "Oh, I can solve this problem with regex" - now you have two problems.
Restuta
Well, what is your problem (or even two) with the solution? I think it is clearer than any split operation.
testalino
You may think so, but the other developers who have to maintain this may not. Split is not good either.
Restuta
If it's nested in more ways than straight top down you will get in trouble. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
Jonas Elfström
+1  A: 

There is nothing wrong with the Split and IndexOf methods; they exist for string parsing. Here is a sample for your case:

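        string str = "[a [b [c [d value]]]]";

        // Peel off the innermost [id value] item on each pass until nothing is left.
        while (str.Trim().Length > 0)
        {
            int start = str.LastIndexOf('[');    // the last '[' opens the innermost item
            int end = str.IndexOf(']');          // the first ']' closes it
            if (start < 0 || end < start) break; // malformed input: stop instead of throwing

            string s = str.Substring(start + 1, end - (start + 1)).Trim();
            string[] pair = s.Split(' '); // this is what you are looking for: pair[0] is the id, and the length is 2 if it has a value

            str = str.Remove(start, (end + 1) - start); // drop the item just processed
        }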
Ali YILDIRIM
`Split` and `IndexOf` exist for (advanced) string parsing insofar as shotguns exist for shooting yourself in the foot. ;-) But I actually like your code and it should work as long as the value doesn’t contain spaces (although it is **very** inefficient).
Konrad Rudolph