Is there an open-source Java library or algorithm for determining whether a given piece of text is a question?
I am working on a question-answering system that needs to decide whether the text entered by the user is a question.
I think the problem can probably be solved with open-source NLP libraries, but it's obviously more complicated than simple part-of-speech tagging. So if someone can instead describe an algorithm for it that uses an existing open-source NLP library, that would be good too.
Also let me know if you know of a library/toolkit that uses data mining to solve this problem. Getting enough training data would normally be difficult, but I will be able to use Stack Exchange data for training.
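To make the baseline concrete, here is a rough sketch of the kind of surface-level check that anything smarter would need to beat. The class name `QuestionHeuristic`, the method `looksLikeQuestion`, and both word lists are hypothetical and only illustrative; this is not taken from any NLP library, and it uses no tagging or parsing at all.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical rule-based baseline; the word lists are illustrative, not exhaustive. */
final class QuestionHeuristic {

    // Interrogative words that commonly open a question.
    private static final Set<String> WH_WORDS = new HashSet<>(Arrays.asList(
            "who", "whom", "whose", "what", "which", "when", "where", "why", "how"));

    // Auxiliary and modal verbs; one of these in first position suggests
    // subject-auxiliary inversion ("Do you ...", "Is it ...").
    private static final Set<String> AUXILIARIES = new HashSet<>(Arrays.asList(
            "is", "are", "am", "was", "were", "do", "does", "did", "can", "could",
            "will", "would", "shall", "should", "may", "might", "must",
            "have", "has", "had"));

    /** Returns true if the text ends with '?' or starts with a wh-word or auxiliary. */
    static boolean looksLikeQuestion(String text) {
        if (text == null) {
            return false;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        // Rule 1: an explicit question mark at the end.
        if (trimmed.endsWith("?")) {
            return true;
        }
        // Rule 2: inspect the first token, lowercased, with punctuation stripped.
        String first = trimmed.split("\\s+")[0]
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z']", "");
        return WH_WORDS.contains(first) || AUXILIARIES.contains(first);
    }
}
```

With these rules, `looksLikeQuestion("Do you do what you want")` returns true, while a plain statement such as "You do what you want" returns false. The weakness is easy to see: "What I want is simple." also returns true because it starts with a wh-word, which is why I expect a real solution to need more than this.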
Update:
I have given up on NLP libraries. I tried uClassify (http://www.uclassify.com) for text classification and trained my classifier on 100,000 Stack Overflow questions and answers. The results are still not very useful: "I do what I want" is classified as a question, while "You do what you want" is classified as an answer.
So if anyone can point me to a good training dataset, that would be great as well.
Are there any other alternatives?