Hi, would it be possible to correctly split function attributes using a regex? I want an expression that splits all attributes in a comma-separated list. But the attributes themselves could be an Array, an Object, or something else that can also contain commas:

Example:

'string1','string2',sum(1,5,8), ["item1","item2","item3"], "string3" , {a:"text",b:"text2"}

This should be split up as:

'string1'
'string2'
sum(1,5,8)
["item1","item2","item3"]
"string3"
{a:"text",b:"text2"}

So the expression should split on all commas, but not on commas that are enclosed in (), {} or [].

I am trying this in AS3, by the way. Here is some code that splits on all the commas (which is of course not what I want):

var attr:String = "'string1','string2',sum(1,5,8), ['item1','item2','item3'], 'string3' , {a:'text',b:'text2'}";
var result:Array = attr.match(/([^,]+),/g);
trace(attr);
for(var a:int=0;a<result.length;a++){
    trace(a,result[a]);
}

Here is an expression that allows nested round brackets, but not the others:

/([^,]+\([^\)]+\)|[^,]+),*/g
+1  A: 

I've created a little example of how to tackle a problem like this. It is only tested on your input, so it might contain horrible mistakes. It only takes the parentheses into account and not the square brackets or curly braces, but those can easily be added.

The basic idea is that you iterate over the characters in the input and add them to the current token if they are not a separator, and push the current token onto the result array when you encounter a separator. You need a stack that keeps track of how deeply nested you are, to determine whether a comma is a separator or part of a token.

For any problem more complicated than this you'll probably be better off using a 'real' parser (and probably a parser generator), but in this case I think you'll be OK using some custom code.

As you can see, parsing code like this quickly becomes quite hard to understand and debug. In a real-world scenario I'd recommend adding more comments, but also a good batch of tests to document the expected behavior.

package {
    import flash.display.Sprite;

    public class parser extends Sprite
    {
        public function parser()
        {
            var input:String = "'string1','string2',sum(1,5,8), [\"item1\",\"item2\",\"item3\"], \"string3\" , {a:\"text\",b:\"text2\"}";

            var result:Array = parseInput(input);
            for each (var item:String in result)
            {
                trace(item);
            }
        }

        // this function only takes into account the '(' and ')' - adding the others is similar.
        private function parseInput(input:String):Array
        {
            var result:Array = [];
            trace("parsing: " + input);

            var token:String = "";
            // one entry per currently open '('; empty means we are at the top level
            var parenthesesStack:Array = [];
            var currentChar:String;
            for (var i:int = 0; i < input.length; i++)
            {
                currentChar = input.charAt(i);
                switch (currentChar)
                {
                    case "(":
                        parenthesesStack.push("(");
                        break;

                    case ")":
                        // pop() returns undefined on an empty stack, so a stray ')' also ends up here
                        if (parenthesesStack.pop() != "(")
                        {
                            throw new Error("Parse error at index " + i);
                        }
                        break;

                    case ",":
                        // a comma outside all parentheses ends the current token
                        if (parenthesesStack.length == 0)
                        {
                            result.push(token);
                            token = "";
                        }
                        break;
                }
                // add character to the token if it is not a separating comma
                if (currentChar != "," || parenthesesStack.length != 0)
                {
                    token = token + currentChar;
                }
            }
            // add the last token
            if (token != "")
            {
                result.push(token);
            }

            return result;
        }
    }
}
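
Adding the other brackets could look roughly like the sketch below. It is untested and not part of the code above: the class name ArgumentSplitter, the OPENER_FOR table and the trim helper are made up for illustration. It assumes quotes are not nested and not escaped, so a quoted string simply runs until the next occurrence of the same quote character, and it strips whitespace around each item.

```actionscript
package {
    // Hypothetical helper, not from the original answer.
    public class ArgumentSplitter
    {
        // Maps each closing bracket to the opening bracket it must match.
        private static const OPENER_FOR:Object = { ")": "(", "]": "[", "}": "{" };

        public static function split(input:String):Array
        {
            var result:Array = [];
            var token:String = "";
            var stack:Array = [];    // brackets that are currently open
            var quote:String = "";   // quote character we are inside of, or ""

            for (var i:int = 0; i < input.length; i++)
            {
                var c:String = input.charAt(i);

                if (quote != "")
                {
                    // inside a string: only the same quote character ends it
                    if (c == quote) quote = "";
                }
                else if (c == "'" || c == '"')
                {
                    quote = c;
                }
                else if (c == "(" || c == "[" || c == "{")
                {
                    stack.push(c);
                }
                else if (c == ")" || c == "]" || c == "}")
                {
                    // pop() returns undefined on an empty stack, which never matches
                    if (stack.pop() != OPENER_FOR[c])
                    {
                        throw new Error("Unbalanced '" + c + "' at index " + i);
                    }
                }
                else if (c == "," && stack.length == 0)
                {
                    // top-level comma: finish the current item, drop the comma
                    result.push(trim(token));
                    token = "";
                    continue;
                }
                token += c;
            }

            if (quote != "" || stack.length != 0)
            {
                throw new Error("Unexpected end of input");
            }
            // add the last item, unless only whitespace is left
            if (trim(token) != "") result.push(trim(token));
            return result;
        }

        // Strips leading and trailing whitespace.
        private static function trim(s:String):String
        {
            return s.replace(/^\s+|\s+$/g, "");
        }
    }
}
```

Calling ArgumentSplitter.split(input) with the input string from the question should give the six items listed there, without the surrounding spaces. Keeping the opening brackets on the stack, rather than just counting depth, is what lets it reject mismatched pairs such as "(]".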
Simon Groenewolt
Very nice. I adjusted your code a bit to work with all kinds of brackets, and it also supports nested single and double quotes now, sweet ;) I've been able to parse this:
var input:String = "'string1','string2,\"'string2\"',string2',\"string2,'string2',string2\",sum(1,5,8), [\"item1\",\"item2\",\"item3\"], \"string3\" , {a:\"text\",b:\"text2\"}"
which is enough for now :)
Aaike