views:

394

answers:

2

We need to produce a fairly complex dynamic query builder for retrieving reports on the fly. We're scratching our heads a little on what sort of data structure would be best.

It's really nothing more than holding a list of selectParts, a list of fromParts, a list of where criteria, order by, group by, that sort of thing, for persistence. When we start thinking about joins, especially outer joins, having clauses, and aggregate functions, things start getting a little fuzzy.

We're building it up interfaces first for now and trying to think as far ahead as we can, but definitely will go through a series of refactorings when we discover limitations with our structures.

I'm posting this question here in the hopes that someone has already come up with something that we can base it on. Or know of some library or some such. It would be nice to get some tips or heads-up on potential issues before we dive into implementations next week.

+2  A: 

I've done something similar couple of times in the past. A couple of the bigger things spring to mind..

  • The where clause is the hardest to get right. If you divide things up into what I would call "expressions" and "predicates" it makes it easier.
  • Expressions - column references, parameters, literals, functions, aggregates (count/sum)
  • Predicates - comparisons, like, between, in, is null (predicates have expression as children, e.g. expr1 = expr2. Then you also having composites such as and/or/not.
  • The whole where clause, as you can imagine, is a tree with a predicate at the root, with maybe sub-predicates underneath eventually terminating with expressions at the leaves.
  • To construct the HQL you walk the model (depth first usually). I used a visitor as I need to walk my models for other reasons, but if you don't have multiple purposes you can build the rendering code right into the model.

e.g. If you had

"where upper(column1) = :param1 AND ( column2 is null OR column3 between :param2 and param3)"

Then the tree is

Root
- AND
  - Equal
    - Function(upper)
      - ColumnReference(column1)
    - Parameter(param1)
  - OR
    - IsNull
      - ColumnReference(column2)
    - Between
      - ColumnReference(column3)
      - Parameter(param2)
      - Parameter(param3)

Then you walk the tree depth first and merge rendered bits of HQL on the way back up. The upper function for example would expect one piece of child HQL to be rendered and it would then generate

"upper( " + childHql + " )"

and pass that up to it's parent. Something like Between expects three child HQL pieces.

  • You can then re-use the expression model in the select/group by/order by clauses

  • You can skip storing the group by if you wish by just storing the select and before query construction scan for aggregate. If there is one or more then just copy all the non-aggregate select expressions into the group by.

  • From clause is just a list of table reference + zero or more join clauses. Each join clause has a type (inner/left/right) and a table reference. Table reference is a table name + optional alias.

Plus, if you ever get into wanting to parse a query language (or anything really) then I can highly recommend ANTLR. Learning curve is quite steep but there are plenty of example grammars to look at.

HTH.

Mike Q
Thanks. That certainly helps.
aberrant80
A: 

if you need EJB-QL parser and data structures, EclipseLink (well several of it's internal classes) have good one:

JPQLParseTree tree = org.eclipse.persistence.internal.jpa.parsing.jpql.JPQLParser.buildParserFor(" _the_ejb_ql_string_ ").parse();

JPQLParseTree contains the all the data.

but generating EJB-QL back from modified JPQLParseTree is something you have to do yourself.

noone