I'm looking for something that may or may not exist. Is there such a thing as an 'English grammar rules engine' available on the Windows platform (specifically for something like .NET or C#)?

Specifically, I'm looking for something that would be able to take a sentence in the 'future tense' and change it to the 'past tense' (based on English grammar rules) ... like:

Work on widget software and then meet with Ricardo

to this:

Worked on widget software and then met with Ricardo

Is there a rules engine that does that already?

+8  A: 

If you build one that is reliable you would be famous.

Jonathan Allen
+4  A: 

Talk to this guy, he might have some ideas for you. In general, English is too ambiguous for this type of thing. For example:

Cut paper in half.

Is this an imperative command, or a past-tense sentence fragment? And my personal favourite:

Time flies like an arrow; fruit flies like a banana.

Any human can parse that, but only because of a great deal of semantic knowledge.

That being said, there are some things that might be worth looking into, like SharpNLP.

Eclipse
Since when was throwing Chomsky at the problem a valid solution? Hilarious!
robber.baron
+2  A: 

The short answer is no, general NLP parsing engines don't exist.

The long answer is "kinda", but there's 50 years of research showing that it's a Very Hard Problem, in the general case. There might be one doing specific tense transformations. Regardless, C# probably won't have one.

Paul Nathan
+4  A: 

As others have stated, this is a very hard problem and has not been solved in the general case. However, there are some systems that do pretty well. Princeton's WordNet is one of them. It can identify parts of speech, synonyms, etc. (perhaps including tense) with some degree of accuracy. I think you may be interested in these functions, which appear to find the root of a word given a particular conjugation, and may also be able to find a particular conjugation given the root (but that page doesn't provide examples, so I can't be sure I'm interpreting the docs correctly).
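To make the root-to-conjugation idea concrete, here is a rough sketch in Python (chosen for brevity, not C#). It does not use WordNet or any real NLP library: the irregular-verb table, the list of known verbs, and the function names `past_tense` and `to_past` are all made up for illustration, and the suffix rules are deliberately incomplete.

```python
import re

# Hypothetical hand-written table of irregular past forms.
# A real system would need a full lexicon.
IRREGULAR_PAST = {"meet": "met", "cut": "cut"}

# Hypothetical list of words to treat as verbs. Deciding this
# automatically is the hard part (it needs part-of-speech tagging).
KNOWN_VERBS = {"work", "meet", "cut", "try", "bake"}

VOWELS = "aeiou"


def past_tense(root):
    """Return a past-tense form for a lowercase base-form verb.

    Rough heuristic only: consonant doubling (plan -> planned)
    and many other spelling rules are not handled.
    """
    if root in IRREGULAR_PAST:
        return IRREGULAR_PAST[root]
    if root.endswith("e"):
        return root + "d"            # bake -> baked
    if root.endswith("y") and len(root) > 1 and root[-2] not in VOWELS:
        return root[:-1] + "ied"     # try -> tried
    return root + "ed"               # work -> worked


def to_past(sentence):
    """Rewrite every word found in KNOWN_VERBS into its past form."""
    def convert(match):
        word = match.group(0)
        lower = word.lower()
        if lower not in KNOWN_VERBS:
            return word
        past = past_tense(lower)
        # Preserve a leading capital letter.
        return past.capitalize() if word[0].isupper() else past

    return re.sub(r"[A-Za-z]+", convert, sentence)


print(to_past("Work on widget software and then meet with Ricardo"))
```

This handles the sentence from the question, but only because the verb list was picked to fit it. `to_past("Cut paper in half.")` comes back unchanged, so the output cannot say whether the input was an imperative or already past tense, and `to_past("Finish the work")` yields "Finish the worked" because the sketch cannot tell the noun "work" from the verb.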

rmeador
+1  A: 

Check this out, it's called Grammatica. It may not be precisely what you're after, but it's definitely a good start to parsing English grammar rules in C#.

James
Grammatica appears to be a standard parser generator intended for typical computer languages. Such tools are notoriously bad for parsing English, which is highly ambiguous, often contains "partial" sentences, and whose grammar is continually argued about (indeed even twisted by hip-hop artists). Unless you can exhibit a specific Grammatica grammar for English, this answer is simply wrong.
Ira Baxter
The answer is not wrong. As I stated, it's a start, not a complete solution. The tool is a framework that allows a great deal of fine tuning; obviously you would need to add your own English grammar rules to it, the ones that you specifically care about.
James
People have tried using parser generators like this on English. Pretty much the judgement is that this path is a failure. The closest hi-tech parser attempted on natural languages, AFAIK, is GLR (not LL(k)), and that approach was abandoned in the late 80s.
Ira Baxter
What evidence do you have to back your assertion up?
James
Quick Google search: http://nlp.stanford.edu/downloads/lex-parser.shtml I assume the Stanford guys can be considered smart. What they aren't doing is running LL(k) parsers. Quote from their site: "Probabilistic parsers use knowledge of language gained from hand-parsed sentences to try to produce the most likely analysis of new sentences. These statistical parsers still make some mistakes, but commonly work rather well. Their development was one of the biggest breakthroughs in natural language processing in the 1990s."
Ira Baxter
I don't see anything in the cited reference that makes a judgement that LL(k) is "a path to failure"; you could hardly call this evidence or even a general consensus. To put things into perspective here, my answer above was not asserting that Grammatica was the only or even the best solution, just that it's another *potential* option, which certainly is not wrong. If you are such an *expert* on this subject, where is your answer posting?
James
I didn't post an answer, because I'm not an expert on what works for English. What I do know, from building several parser generators (including LL(k), LALR, and GLR) and reading a lot of literature related to these, is that the NL community decided they were inadequate, and they have moved on to other things. I merely made that observation. If you believe that LL(k) parsers are good for parsing English, you should be able to exhibit one that somebody built, or exhibit one of your own.
Ira Baxter
... one of my first comments was that English often has ambiguous phrases: "Time flies like an arrow". Ambiguous means "more than one legitimate parse". A strict LL(k) parser can only produce one parse, and therefore must fail to "parse" this sentence properly. Can you hack in extra ad hoc stuff like dictionary lookups to help? Sure. But then what you have isn't strict LL(k). This is one of the reasons the NLP people gave up on most compiler-type parsers: they handle ambiguity badly. You can't just supply "more rules".
Ira Baxter
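The "more than one legitimate parse" point can be illustrated with a small sketch. This is not LL(k) and not Grammatica: it is a CYK-style chart parser in Python that counts every derivation instead of committing to one. The toy grammar and lexicon below are assumptions invented for this one sentence (they even leave out the imperative reading, "time the flies"), not a real grammar of English.

```python
from collections import defaultdict

# Toy lexicon (an assumption): each word may carry several categories.
LEXICON = {
    "time":  {"N", "NP"},   # a noun, or a noun phrase on its own
    "flies": {"N", "V"},    # the insects, or the verb "to fly"
    "like":  {"V", "P"},    # "to like", or the preposition
    "an":    {"Det"},
    "arrow": {"N"},
}

# Toy binary rules (an assumption), written as (parent, left, right).
RULES = [
    ("S",  "NP",  "VP"),
    ("NP", "N",   "N"),     # compound noun: "time flies"
    ("NP", "Det", "N"),
    ("VP", "V",   "NP"),
    ("VP", "V",   "PP"),
    ("PP", "P",   "NP"),
]


def count_parses(words, start="S"):
    """Count distinct parse trees for `words` under the toy grammar."""
    n = len(words)
    # chart[(i, j)][symbol] = number of ways symbol derives words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[(i, i + 1)][tag] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):          # every split point
                for parent, left, right in RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)][start]


print(count_parses("time flies like an arrow".split()))  # prints 2
```

The two parses are "time / flies like an arrow" (time passes the way an arrow does) and "time flies / like an arrow" (insects called time flies are fond of an arrow). Counting parses is the easy part; choosing the intended one is where the semantic knowledge, or a statistical model like the Stanford parser quoted above, comes in.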