We work with messages that are text-based (not XML). Our goal is to validate the messages; a message is valid if its content is correct. We developed our own language, defined in XML, to express rules on a message. We now need to add more complex rules, and we think it's time to look at alternatives and use a real rules engine. We currently support these types of rules:

  • name is in a list of values or matches a regular expression, e.g. {SMITH, MOORE, A*}
  • name is present in the message
  • name is not present in the message
  • if condition then name = John else name = Jane. Note that the condition is simple and does not contain any logical operators.

We need to support these types of rules:

  • if/then/else where the condition contains logical operators
  • for ... loop:
    • For all the customers in the message we want at least one from the USA and at least one from France
    • For all the customers in the message we want at least five that are from the USA and are buying more than $1000 a year
    • For any customer with name John, the last name must be Doe
  • Total customers with name John < 15
  • The name of the company is equal to the name of the company in another location in the message

The rules will depend on the type of messages we process. So we were investigating several existing solutions like:

  • JESS
  • OWL (Consistency checking)
  • Schematron (by transforming the message in XML)

What would be the best alternatives, considering that we develop in Java? Another thing to consider is that we need to be able to report errors with a description and a location (line and column number).

A: 

If your rules are static (i.e. known at compile time), you could do this with a well-known Java parser generator: JavaCC.

Kdeveloper
+1  A: 

It sounds to me like you're on the right track already; my suggestions are:

  1. Inspect your text-based messages directly with a parser/interpreter and apply rules over the generated objects. @Kdeveloper has suggested JavaCC for generating parser/interpreters, and I can add to this by personally vouching for ANTLRv3 which is an excellent environment for generating parser/interpreter/transformers in Java (amongst other languages). From there, you could use Jess or some other Java rules engine to validate the objects you generate. You could possibly also try encoding your rules into a parser/interpreter directly, but I'd advise against this and instead opt for separating the rules out to keep the parsing and semantic validation steps separate.
  2. Transforming your text-based messages to XML in order to apply Schematron is another viable option, but you'll obviously need to parse your text messages to get them into XML anyway. For this, I'd still suggest looking at JavaCC or ANTLRv3, and perhaps populating a pre-determined object model which can be marshaled to XML (such as one generated by Castor or JAXB from a W3C XML Schema). From there, you can apply Schematron over the resulting XML.
  3. I'd argue that transforming to OWL is the trickiest of the options you list, but it could be the most powerful. To start with, you'll probably want an ontology terminology (TBox), i.e. the classes, properties, etc., to map your instance data (ABox) into. From there, consistency checking will only get you so far; many of the kinds of constraints you've outlined simply can't be represented in OWL and validated using a DL reasoner alone. However, if you couple your OWL ontology with SWRL rules (for example), you have a chance of capturing many of the types of rules you've outlined. Look at the types of rules and built-ins available in SWRL to see if this is expressive enough for you. If it is, you can use DL reasoners with SWRL support such as Pellet or HermiT. Note that individual implementations of OWL/SWRL reasoners such as these may implement more or less of the W3C specification, so you'll need to inspect each to determine its applicability.
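
To make the separation in option 1 a bit more concrete, here is a minimal sketch in plain Java of rules applied over already-parsed objects. Everything in it is an assumption for illustration: the `Customer` model, the `Rule` interface and the `atLeast`/`forEach` helpers are hypothetical names, not part of Jess, ANTLR or JavaCC, and the line/column values are assumed to have been recorded by whatever parser you end up using.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class MessageRules {

    /** Hypothetical parsed record; line and column are assumed to come from the parser. */
    static final class Customer {
        final String firstName, lastName, country;
        final double yearlySpend;
        final int line, column;

        Customer(String firstName, String lastName, String country,
                 double yearlySpend, int line, int column) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.country = country;
            this.yearlySpend = yearlySpend;
            this.line = line;
            this.column = column;
        }
    }

    /** An error description together with its location in the original message. */
    static final class ValidationError {
        final String description;
        final int line, column;

        ValidationError(String description, int line, int column) {
            this.description = description;
            this.line = line;
            this.column = column;
        }

        @Override
        public String toString() {
            return line + ":" + column + ": " + description;
        }
    }

    /** A rule inspects the parsed customers and returns zero or more errors. */
    interface Rule {
        List<ValidationError> check(List<Customer> customers);
    }

    /** "At least n customers satisfy the condition"; reported at the message start (1:1). */
    static Rule atLeast(int n, Predicate<Customer> condition, String description) {
        return customers -> {
            long count = customers.stream().filter(condition).count();
            List<ValidationError> errors = new ArrayList<>();
            if (count < n) {
                errors.add(new ValidationError(description + " (found " + count + ")", 1, 1));
            }
            return errors;
        };
    }

    /** "For any customer matching when, then must hold"; one error per offending record. */
    static Rule forEach(Predicate<Customer> when, Predicate<Customer> then, String description) {
        return customers -> {
            List<ValidationError> errors = new ArrayList<>();
            for (Customer c : customers) {
                if (when.test(c) && !then.test(c)) {
                    errors.add(new ValidationError(description, c.line, c.column));
                }
            }
            return errors;
        };
    }

    /** Runs every rule and collects all errors, in rule order. */
    static List<ValidationError> validate(List<Customer> customers, List<Rule> rules) {
        List<ValidationError> all = new ArrayList<>();
        for (Rule rule : rules) {
            all.addAll(rule.check(customers));
        }
        return all;
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("John", "Doe", "USA", 1500.0, 3, 1),
                new Customer("John", "Smith", "France", 200.0, 4, 1));

        Predicate<Customer> fromUsa = c -> c.country.equals("USA");
        Predicate<Customer> bigSpender = c -> c.yearlySpend > 1000.0;

        List<Rule> rules = Arrays.asList(
                atLeast(1, fromUsa, "need at least one customer from the USA"),
                atLeast(1, c -> c.country.equals("France"), "need at least one customer from France"),
                // Logical operators in a condition are just predicate composition.
                atLeast(5, fromUsa.and(bigSpender), "need five USA customers spending over $1000"),
                forEach(c -> c.firstName.equals("John"), c -> c.lastName.equals("Doe"),
                        "customer named John must have last name Doe"));

        validate(customers, rules).forEach(System.out::println);
    }
}
```

The point of the sketch is only that the parser's job ends at producing objects with positions attached; the rules live elsewhere and could just as well be handed to Jess or another engine instead of being hand-written predicates like these.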
sharky