I've got a set of documents which have a semi-regular format. Rows are typically separated by newline characters, and the main components of each row are separated by spaces. Some examples are a set of furniture assembly instructions, a set of tables of contents, a set of recipes, and a set of bank statements.

The problem is that each specimen in each set differs from its peers in ways that make regex parsing infeasible: the quantity of an item may come before or after the item name, the same item may have different names across specimens, expository text or notes may appear between rows, and so on.

I've used classifiers (neural nets, Bayesian classifiers, genetic algorithms, and genetic programming) to deal with whole documents or data sets, but not to extract items from documents and classify them within a context. Can this be done? Is there a more feasible approach?

+2  A: 

If your data has structure, arguably you can use a grammar to describe some of that structure. (Classically, you use a grammar to recognize what it can, which is often too much, and then use extra-grammatical checks to prune away what the grammar cannot eliminate.)

If you use a parser that can carry multiple potential parses in parallel, eliminating each parse as it becomes infeasible, you can handle the different orderings straightforwardly. (A GLR parser can do this nicely.)

Imagine you have NUMBERS describing amounts, NOUNS describing various objects, and VERBS for actions. Then a grammar that can accept varying orders of items might be:

 G = SENTENCE '.' ;
 SENTENCE = VERB NOUN NUMBER ;
 SENTENCE = NOUN VERB NUMBER ;
 SENTENCE = VERB NUMBER NOUN ;
 VERB = 'ORDER' | 'ORDERED' | 'SAW' ;
 NUMBER = '1' | '2' | '10' ;
 NOUN = 'JOE' | 'TABLE' | 'SAW' ;

This sample is extremely simple, but it will handle:

 JOE ORDERED 10.
 JOE SAW 1.
 ORDER 2 SAW.
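
As a sanity check, here is a minimal, library-free Python sketch (my own illustration, not from the answer; the token classes and production list simply mirror the toy grammar above). Trying every SENTENCE production against the input mimics, in miniature, how a GLR parser carries several live parses and drops the infeasible ones:

 # Token classes from the toy grammar above.
 VERB = {'ORDER', 'ORDERED', 'SAW'}
 NOUN = {'JOE', 'TABLE', 'SAW'}
 NUMBER = {'1', '2', '10'}

 # Each SENTENCE production is an ordered list of token classes.
 PRODUCTIONS = [
     [VERB, NOUN, NUMBER],   # SENTENCE = VERB NOUN NUMBER
     [NOUN, VERB, NUMBER],   # SENTENCE = NOUN VERB NUMBER
     [VERB, NUMBER, NOUN],   # SENTENCE = VERB NUMBER NOUN
 ]

 def parses(sentence):
     """Return every production that matches the tokenized sentence."""
     tokens = sentence.rstrip('.').split()
     return [prod for prod in PRODUCTIONS
             if len(tokens) == len(prod)
             and all(tok in cls for tok, cls in zip(tokens, prod))]

 for s in ('JOE ORDERED 10.', 'JOE SAW 1.', 'ORDER 2 SAW.'):
     print(s, '->', len(parses(s)), 'parse(s)')  # each prints 1 parse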

It will also accept:

 SAW SAW 10.

You can eliminate this by adding an external constraint that actors must be people.
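
Continuing the sketch above, one way to bolt that constraint on is a post-parse filter. (PEOPLE here is a hypothetical lexicon of known actors, not anything from the answer.)

 # Extra-grammatical check layered on the parses() sketch above:
 # prune any parse whose actor slot (the leading NOUN in a
 # NOUN VERB NUMBER parse) is not a known person.
 PEOPLE = {'JOE'}

 def person_checked_parses(sentence):
     tokens = sentence.rstrip('.').split()
     kept = []
     for prod in parses(sentence):
         if prod[0] is NOUN and tokens[0] not in PEOPLE:
             continue  # actor is not a person: prune this parse
         kept.append(prod)
     return kept

 print(len(parses('SAW SAW 10.')))                 # 2 parses survive the grammar
 print(len(person_checked_parses('SAW SAW 10.')))  # 1 after the person check

Note that this particular check only prunes the NOUN VERB NUMBER reading; the imperative VERB-first reading of "SAW SAW 10." survives it, so a real system would layer further checks, for instance on which verbs can head a command.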

Ira Baxter