My Python server receives jobs that contain a list of the items to act against, rather like a search query term; an example input:

(Customer:24 OR Customer:25 OR (Group:NW NOT Customer:26))

So when a job is submitted, I have to parse this recipient pattern, resolve it to all the customers that match, and create the job with that set as its input.

To complicate matters, customers can join and leave groups at any time, and a running job should be updated live when this happens.

So as group membership changes, I have to notify all currently running jobs about it (and they do their thing).

What is the best way to parse, apply, and store (in my RDBMS) this kind of list of constraints?

  • Parsing: eval(), a hand-written FSM eating characters, yacc/bison, or something else?
  • Applying: how would you store these constraints, and evaluate them?
  • Storing: in a database, a row per term with an evaluation-order field and a NOT/AND/OR op field; or as a blob?
A: 

Consider using SQL instead of inventing yet another mini language:

(
cust.id = 24
or cust.id = 25
or (cust.id = cust_group.cust_id and cust_group.id = 'NW' and cust.id != 26)
) -- or something similar

Worried about SQL injection? Whatever language the expression is written in, you would need to parse it (not too difficult if your expressions are suitably limited) and check it for plausibility.
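For what it's worth, here is a minimal sketch of that parse-and-check step for the pattern syntax shown in the question, using only the standard library. The grammar is my assumption: terms are Customer:<word> or Group:<word>; OR, AND and NOT are all binary with equal precedence and associate left to right (NOT reading as "but not"); parentheses do the grouping. The names tokenize and parse and the nested-tuple tree shape are made up for illustration.

```python
import re

# One token: a parenthesis, an operator keyword, or a Kind:value term.
TOKEN_RE = re.compile(r"\s*(?:(\(|\))|(OR|AND|NOT)\b|(Customer|Group):(\w+))")


def tokenize(text):
    """Split a pattern into tokens; anything unrecognised is rejected."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at offset {pos}: {text[pos:]!r}")
        paren, op, kind, value = m.groups()
        if paren:
            tokens.append((paren,))
        elif op:
            tokens.append(("op", op))
        else:
            tokens.append(("term", kind, value))
        pos = m.end()
    return tokens


def parse(text):
    """Return a tree of ("term", kind, value) and (op, left, right) tuples."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def primary():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of pattern")
        pos += 1
        if tok[0] == "term":
            return tok
        if tok == ("(",):
            node = expression()
            if peek() != (")",):
                raise ValueError("missing closing parenthesis")
            pos += 1
            return node
        raise ValueError(f"unexpected token {tok!r}")

    def expression():
        nonlocal pos
        node = primary()
        # Assumed: equal precedence, left to right; use parentheses to group.
        while peek() is not None and peek()[0] == "op":
            op = peek()[1]
            pos += 1
            node = (op, node, primary())
        return node

    tree = expression()
    if pos != len(tokens):
        raise ValueError("trailing input after pattern")
    return tree
```

Anything outside the whitelist of tokens (a semicolon, a quote, a stray keyword) raises ValueError before it gets anywhere near the database, which is the "plausibility" check in its simplest form.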

John Machin
but I'd be very cautious about SQL injection...
Will
+1  A: 

I suggest pyparsing (http://pyparsing.wikispaces.com/), which lets you describe a grammar neatly and gives you a tree filled with data. Then, hopefully, your syntax is close enough to SQL that you can trivially form a "where" clause from the parsing results.
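As a rough illustration of the "where"-clause step, the sketch below walks an already-parsed tree and emits SQL with ? placeholders, so the values never get spliced into the SQL text. The tree shape (nested tuples) is a stand-in I made up, not what pyparsing returns, and the cust / cust_group tables with their columns are borrowed from the other answer's snippet as an assumed schema. It runs against an in-memory sqlite3 database purely so the example is self-contained.

```python
import sqlite3

# Hypothetical tree shape: ("term", kind, value) or (op, left, right).
# NOT is treated as a binary "but not", as in the question's example.
SQL_OPS = {"OR": "OR", "AND": "AND", "NOT": "AND NOT"}


def to_where(node, params):
    """Render a node as SQL text, appending bound values to params."""
    if node[0] == "term":
        _, kind, value = node
        if kind == "Customer":
            params.append(int(value))  # rejects non-numeric ids
            return "cust.id = ?"
        if kind == "Group":
            params.append(value)
            return ("cust.id IN (SELECT cust_id FROM cust_group "
                    "WHERE cust_group.id = ?)")
        raise ValueError(f"unknown term kind {kind!r}")
    op, left, right = node
    if op not in SQL_OPS:
        raise ValueError(f"unknown operator {op!r}")
    return f"({to_where(left, params)} {SQL_OPS[op]} {to_where(right, params)})"


def resolve(conn, tree):
    """Return the sorted customer ids that match the tree."""
    params = []
    where = to_where(tree, params)
    sql = f"SELECT cust.id FROM cust WHERE {where} ORDER BY cust.id"
    return [row[0] for row in conn.execute(sql, params)]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cust (id INTEGER PRIMARY KEY);
        CREATE TABLE cust_group (id TEXT, cust_id INTEGER);
        INSERT INTO cust (id) VALUES (24), (25), (26), (27), (28);
        INSERT INTO cust_group (id, cust_id)
            VALUES ('NW', 26), ('NW', 27), ('SE', 28);
    """)
    # (Customer:24 OR Customer:25 OR (Group:NW NOT Customer:26))
    tree = (
        "OR",
        ("OR", ("term", "Customer", "24"), ("term", "Customer", "25")),
        ("NOT", ("term", "Group", "NW"), ("term", "Customer", "26")),
    )
    return resolve(conn, tree)
```

With the sample rows above, demo() picks up customers 24 and 25 directly and 27 via group NW, while 26 is excluded by the NOT term.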

You may pickle and store the parsed tree, or the unparsed requests, or ready-made SQL clauses. Which one depends on how often you will fetch and reuse them, and on whether you need to inspect the database by other means and see the queries. I see no point in storing the queries in a non-blob form unless you want to run interesting selects against them; if you do, you probably need an XML database or something else that supports trees easily.
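To make the blob option concrete, here is a small sketch that stores both the original pattern text (so it stays readable in the database) and a pickled tree, then re-checks a single customer against the stored tree when group membership changes. The job table, the function names, and the nested-tuple tree shape are all assumptions for illustration; sqlite3 is used only to keep the example self-contained.

```python
import pickle
import sqlite3


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job ("
        " id INTEGER PRIMARY KEY,"
        " pattern TEXT NOT NULL,"   # unparsed request, human-readable
        " tree BLOB NOT NULL)"      # pickled parse tree
    )
    return conn


def save_job(conn, pattern, tree):
    cur = conn.execute(
        "INSERT INTO job (pattern, tree) VALUES (?, ?)",
        (pattern, pickle.dumps(tree)),
    )
    return cur.lastrowid


def load_tree(conn, job_id):
    row = conn.execute("SELECT tree FROM job WHERE id = ?", (job_id,)).fetchone()
    # Only unpickle blobs your own code wrote; pickle is unsafe on untrusted data.
    return pickle.loads(row[0])


def matches(tree, customer_id, groups):
    """Check one customer, given the set of groups they currently belong to."""
    if tree[0] == "term":
        _, kind, value = tree
        if kind == "Customer":
            return value == str(customer_id)
        return value in groups      # kind == "Group"
    op, left, right = tree
    lhs = matches(left, customer_id, groups)
    rhs = matches(right, customer_id, groups)
    if op == "OR":
        return lhs or rhs
    if op == "AND":
        return lhs and rhs
    return lhs and not rhs          # binary NOT, read as "but not"


PATTERN = "(Customer:24 OR Customer:25 OR (Group:NW NOT Customer:26))"
TREE = (
    "OR",
    ("OR", ("term", "Customer", "24"), ("term", "Customer", "25")),
    ("NOT", ("term", "Group", "NW"), ("term", "Customer", "26")),
)
```

When a customer joins or leaves a group, a running job could call matches with the customer's new group set and add or drop them accordingly, without re-resolving the whole pattern. If pickle feels too opaque, the same tree serialises just as easily with json.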
