views:

57

answers:

1

Hi!

I'm implementing a new machine learning algorithm in Java that extracts a prototype datastructure from a set of structured datasets (tree-structure). As im developing a generic library for that purpose, i kept my design independent from concrete data-representations like XML.

My problem now is that I need a way to define a data model, which is basically a ruleset describing valid trees, against which a set of trees is being matched. I thought of using BNF or a similar dialect.

Basically I need a way to iterate through the space of all valid TreeNodes defined by the ModelTree (Like a search through the search space for algorithms like A*) so that i can compare my set of concrete trees with the model. I know that I'll have to deal with infinite spaces there but first things first.

I know, it's rather tricky (and my sentences are pretty bumpy) but I would appreciate any clues.

Thanks in advance, Stefan

A: 

I believe that you are talking about a Regular Tree Grammar. This Wikipedia page is an entry point for the topic, and the book that it links to might be helpful.

Stephen C