Hello, I want to build a kind of meta language that gets parsed and cached for better performance. To do that, I need to be able to parse the meta code into objects or arrays.

Start identifier: {

End identifier: }

You can navigate through objects with a dot (.), and you can also use arithmetic, logical, and relational operators.

Here is an example of what the meta language looks like:

  • {mySelf.mother.job.jobName}

or nested

  • {mySelf.{myObj.{keys["ObjProps"][0]}.personAttribute.first}.size}

or with operations

  • {obj.val * (otherObj.intVal + myObj.longVal) == 1200}

or more logical

  • {obj.condition == !myObj.otherCondition}

I think most of you already understand what I want. At the moment I can only do simple operations (without nesting and with only two operands), but nesting to read values with dynamic property names works fine. Text concatenation also works fine,

e.g. "Hello {myObj.name}! How are you {myObj.type}?".

A short if (ternary) form like (condition) ? (true-case) : (false-case) would also be nice, but I have no idea how to parse all of that. At the moment I am working with loops and some regex, but it would probably be faster and more maintainable if more of the work were done in regex.

Could anyone give me some hints, or would anyone like to help? To understand what I need this for, you can visit the project site: http://sourceforge.net/projects/blazeframework/

Thanks in advance!

A: 

Maybe have a look at the PREG_OFFSET_CAPTURE flag? With it, preg_match and preg_match_all return the offset of each match along with the matched text.

zolex
A: 

It is non-trivial to parse an indeterminate number of nested, matching braces using regular expressions, because in general you will match either too much or too little.

For instance, consider the following string, which combines two examples from your input: Hello {myObj.name}! {mySelf.{myObj.{keys["ObjProps"][0]}.personAttribute.first}.size}?

If you use the first regular expression that probably comes to mind, \{.*\}, to match the braces, you get a single match: {myObj.name}! {mySelf.{myObj.{keys["ObjProps"][0]}.personAttribute.first}.size}. This is because regular expressions are greedy by default and match as much as possible.

From there, we can try a non-greedy pattern, \{.*?\}, which matches as little as possible between the opening and closing brace. On the same string, this pattern produces two matches: {myObj.name} and {mySelf.{myObj.{keys["ObjProps"][0]}. The second is obviously not a complete expression, but it is the smallest match that satisfies the pattern.
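To make the difference concrete, here is a small sketch that runs the two patterns above against the example string. It uses Python's re module purely for illustration (an assumption on my part, since greedy and non-greedy quantifiers behave the same way there as in PCRE); the variable names are made up.

```python
import re

# The example string built from two expressions in the question.
text = 'Hello {myObj.name}! {mySelf.{myObj.{keys["ObjProps"][0]}.personAttribute.first}.size}?'

# Greedy: runs from the first "{" to the last "}" in the string.
greedy = re.findall(r'\{.*\}', text)

# Non-greedy: each match stops at the first "}" it can reach.
lazy = re.findall(r'\{.*?\}', text)

print(len(greedy), greedy)
print(len(lazy), lazy)
```

Neither result lines up with the actual expression boundaries, which is why a plain brace pattern is not enough here.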

PCRE does allow recursive regular expressions, but you're going to end up with a very complex pattern if you go down that route.

The best solution, in my opinion, would be to construct a tokenizer (which could be powered by regex) to turn your text into an array of tokens which can then be parsed.
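As a rough sketch of that idea, the following splits an expression into tokens with a single regex and then groups the tokens between matching braces by recursion. It is written in Python only for illustration; the token set, the names TOKEN_RE, tokenize, and nest, and the nested-list output are all assumptions, not part of your framework, and the real language will likely need more token types.

```python
import re

# Hypothetical token set; one named group per token type.
TOKEN_RE = re.compile(r'''
    (?P<LBRACE>\{) | (?P<RBRACE>\})
  | (?P<STRING>"[^"]*")
  | (?P<NUMBER>\d+)
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>==|!=|<=|>=|&&|\|\||[-+*/!<>?:.()\[\]])
  | (?P<WS>\s+)
''', re.VERBOSE)

def tokenize(expr):
    """Turn the input into a list of (type, text) pairs, skipping whitespace."""
    pos, tokens = 0, []
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected character {expr[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def nest(tokens):
    """Group the tokens between matching braces into nested lists."""
    def parse(i, depth):
        out = []
        while i < len(tokens):
            kind, value = tokens[i]
            if kind == "LBRACE":
                # Recurse for the inner expression, then continue after its "}".
                child, i = parse(i + 1, depth + 1)
                out.append(child)
            elif kind == "RBRACE":
                if depth == 0:
                    raise ValueError("unmatched '}'")
                return out, i + 1
            else:
                out.append(value)
                i += 1
        if depth:
            raise ValueError("missing '}'")
        return out, i
    tree, _ = parse(0, 0)
    return tree

print(nest(tokenize('{mySelf.{myObj.{keys["ObjProps"][0]}.personAttribute.first}.size}')))
```

The regex only has to recognize individual tokens, so it stays simple; the brace matching is handled by the recursion. Operator precedence and the ternary form would be handled in a later parsing step over the same token list.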

Daniel Vandersluis
Thanks very much! I have now done it with a regex-based tokenizer that uses recursion to detect parentheses. The pattern is not very complex, by the way; I have to call the method recursively to get all the children, but it's fast and nice =D
Christian Beikov