views:

197

answers:

3

I got bored during the holiday season this year and randomly decided to write a simple list comprehension/filtering library for Java (I know there are some great ones out there, I just wanted to write it my self for the hell of it).

For this list:

LinkedList<Person> list = new LinkedList<Person>();
            list.add(new Person("Jack", 20));
            list.add(new Person("Liz", 58));
            list.add(new Person("Bob", 33));

Syntax is:

Iterable<Person> filtered = Query.from(list).where(
    Condition.ensure("Age", Op.GreaterEqual, 21)
    .and(Condition.ensure("Age", Op.LessEqual, 50));

I know its ugly, but if I use static imports and use shorter method names it becomes pretty concise.

The following syntax is the ultimate goal:

Iterable<Person> list2 = Query.from(list).where("x=> x.Age >= 21 & x.Age <= 50");

Apparently expression parsing is not my strongest area, im having trouble with parsing nested/multiple conditionals. Anyone know of some resources/literature I might find helpful?

I've only got single conditional expressions being sucessfully parsed from String lambda syntax at the moment: "x=> x.Name == Jack". My underlying Expression structure is fairly solid and can easily handle any amount of nesting, the issue is just the expression parsing from a string.

Thanks

Just for kicks, here is a little insight as to how the expression structure behind the scenes can work (obviously I could have specified 'op.GreaterEqual', etc... in the following example, but I wanted to demonstrate how it is flexible to any amount of nesting):

Condition minAge1 = Condition.ensure("Age", Op.Equal, 20);
Condition minAge2 = Condition.ensure("Age", Op.Greater, 20);
Expression minAge = new Expression(minAge1, Express.Or, minAge2);
Expression maxAge = Condition.ensure("Age", Op.Equal, 50).or(Condition.ensure("Age", Op.Less, 50));
Expression ageExpression = new Expression(minAge, Express.And, maxAge);

Condition randomException = Condition.ensure("Name", Op.Equal, "Liz");
Expression expressionFinal = new Expression(ageExpression, Express.Or, randomException);
+5  A: 

Basically what you'll want is a recursive descent parser for expressions. It's a topic featured heavily in compiler theory so any book on compilers will cover the topic. In formal grammar terms it will look something like this:

condition  : orAtom ('||' orAtom)+ ;
orAtom     : atom ('&&' atom)+ ;
atom       : '(' condition ')'
           | expression ;
expression : value OPER value ;
value      : VARIABLE | LITERAL '
VARIABLE   : (LETTER | '_') (LETTER | DIGIT | '_')* ;
LITERAL    : NUMBER
           | STRING ;
NUMBER     : '-'? DIGIT+ ('.' DIGIT+)? ;
STRING     : '"' . CHAR* . '"' '
CHAR       : ('\\' | '\"' | .) + ;
LETTER     : 'a'..'z' | 'A'..'Z' ;
DIGIT      : '0'..'9' ;
OPER       : '>' | '>=' | '<' | '<=' | '=' | '!=' ;

The grammar above is (mostly) in ANTLR form as that what I'm most familiar with.

Parsing boolean or arithmetic expressions is a classic introductory topic when dealing with parsing so you should be able to find plenty of literature on it. If you want to pursue ANTLR (since you're using Java) I'd highly suggest reading The Definitive ANTLR Reference: Building Domain-Specific Languages.

If all this looks like overkill and all a bit much to take in, you might be right. It's a tough topic to get started in.

One alternative you have is not to create an arbitrary string expression but instead use a fluent interface (like you're doing):

List results = from(source)
  .where(var("x").greaterThan(25), var("x").lessThan(50))
  .select("field1", "field2");

as that is stating the expression tree in code and should be easier to implement.

cletus
I figured thats the realm I would be looking into. Thanks a bunch, at least I have a starting point now.
jdc0589
I'm not saying my current implementation is finished, but it can handle anything I have been able to throw at it thus far. It needs some work on the efficiency side though. I really think the only way to reduce verbosity to the point I will be happy with is via the string implementation (which will be parsed to the exact same underlying expression structure im using now).
jdc0589
+1  A: 
rayd09
+1  A: 

Thanks for all the tips guys. I decided most of this was way more than I needed, so I ended up regexing the hell out it to get things in to manageable groups which I could easily parse in 20-30 lines of code.

Ive got the string LambdaExpression interface working almost as well as the fluent interface, just one or two small bug.

I will probably keep developing it a little just for fun, but it's obviously too inefficient to really use since its about 90% reflection based.

jdc0589