Hi,
I've started to write a simple sentiment analysis tool.
Currently I am looking at GATE (http://gate.ac.uk) and RapidMiner (http://rapid-i.com/).
Being a beginner, I am not able to concentrate on both.
Could someone please tell me which one is better in terms of usage, learning curve, licensing, etc.?
Thanks,
Shiv
...
The title says it all: Given some (English) word that we shall assume is a plural, is it possible to derive the singular form? I'd like to avoid lookup/dictionary tables if possible.
Some examples:
Examples -> Example a simple 's' suffix
Glitches -> Glitch 'es' suffix, as opposed to above
Countries -> Country 'ies' suffix....
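A rough sketch of the rule-based idea behind the three examples above, with no lookup table. The rule list and the function name singularize are assumptions made for illustration. They cover only the suffixes mentioned here and will get many English words wrong.

```python
import re

# Ordered (pattern, replacement) rules; the first rule that matches wins.
# These rules are illustrative assumptions, not a full treatment of English.
RULES = [
    (re.compile(r"ies$", re.IGNORECASE), "y"),                 # Countries -> Country
    (re.compile(r"(ch|sh|ss|x|z)es$", re.IGNORECASE), r"\1"),  # Glitches -> Glitch
    (re.compile(r"(?<!s)s$", re.IGNORECASE), ""),              # Examples -> Example
]


def singularize(word: str) -> str:
    """Guess a singular form using suffix rules only (no dictionary)."""
    for pattern, replacement in RULES:
        if pattern.search(word):
            return pattern.sub(replacement, word)
    return word  # no rule matched, so return the word unchanged
```

With these rules, singularize("Countries") gives "Country" and singularize("Glitches") gives "Glitch". Irregular plurals such as "mice", and regular ones such as "buses" or "ties", are not handled correctly, which is the usual argument for at least a small exception table.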
I have a set of documents in two languages: English and German. There is no usable meta information about these documents, a program can look at the content only. Based on that, the program has to decide which of the two languages the document is written in.
Is there any "standard" algorithm for this problem that can be implemented in a...
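One simple approach for a two-language case like this is to count common function words of each language and pick the language with more hits. The sketch below assumes two tiny hand-picked word lists and a hypothetical function name guess_language. Real documents would need longer lists, and character n-gram profiles are a commonly used alternative.

```python
import re

# Small hand-picked function-word lists; an assumption for illustration only.
ENGLISH = {"the", "and", "of", "to", "is", "that", "it", "with", "for", "this"}
GERMAN = {"der", "die", "das", "und", "ist", "nicht", "ein", "eine", "mit", "für"}


def guess_language(text: str) -> str:
    """Return 'en', 'de', or 'unknown' based on function-word counts."""
    # [^\W\d_]+ matches runs of letters, including umlauts.
    words = re.findall(r"[^\W\d_]+", text.lower())
    en_hits = sum(w in ENGLISH for w in words)
    de_hits = sum(w in GERMAN for w in words)
    if en_hits == de_hits:
        return "unknown"  # tie or no evidence either way
    return "en" if en_hits > de_hits else "de"
```

The function only looks at the document content, which fits the constraint that no meta information is available.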
Is it possible to use the Google Wave Context-Aware Spell Checker via web services?
If yes, can anyone please be kind enough to post a simple example?
...
Hello, I am working on a natural language processing project that aims to build libraries for the Arabic language. We are working on a POS tagger, and now I am thinking about the grammar phase. Since Arabic, like many other languages, has a complicated grammar, it is very hard to build a context-free grammar (CFG) for it. For this reason I had an idea for an...
I am working on a somewhat large corpus with articles numbering in the tens of thousands. I am currently using PDFBox to extract the text with varying success, and I am looking for a way to programmatically check each file to see whether the extraction was moderately successful. I'm currently thinking of running a spellchecker on each of them, but ...
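The spellchecker idea can be approximated without a full spellchecker by measuring what fraction of the extracted tokens appear in a known word list. This is only a sketch: the function names, the vocabulary argument, and the 0.6 threshold are assumptions, and the threshold would need tuning on files whose extraction quality is already known.

```python
import re


def known_word_ratio(text: str, vocabulary: set) -> float:
    """Fraction of alphabetic tokens found in the vocabulary (0.0 if no tokens)."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in vocabulary for t in tokens) / len(tokens)


def looks_extracted(text: str, vocabulary: set, threshold: float = 0.6) -> bool:
    """Heuristic check: treat the extraction as usable if enough tokens are known words."""
    return known_word_ratio(text, vocabulary) >= threshold
```

Text that was split mid-word or extracted as garbage tends to score close to 0.0, while cleanly extracted prose scores high as long as the vocabulary (lowercase words) covers the domain.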
I run a website that allows users to write blog posts. I would really like to summarize the written content and use it to fill the <meta name="description".../> tag, for example.
What methods can I employ to automatically summarize/describe the contents of user generated content?
Are there any (preferably free) methods out there that have...
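One free, very basic method is extractive: pick the single sentence whose words are most frequent in the post and trim it to a length that suits a description tag. The sketch below assumes a hypothetical function name describe and an arbitrary 155-character limit. It does no stop-word removal, so it is a starting point rather than a real summarizer.

```python
import re
from collections import Counter


def describe(text: str, max_chars: int = 155) -> str:
    """Return the sentence with the highest average word frequency, trimmed to max_chars."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return ""
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    best = max(sentences, key=score)
    if len(best) <= max_chars:
        return best
    return best[: max_chars - 3].rstrip() + "..."
```

The result still has to be HTML-escaped before it is written into the tag, since it comes from user-generated content.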
I'd like to use correlation clustering and I figure R is a good place to start.
I can present the data to R as a set of large, sparse vectors or as a table with a pre-computed dissimilarity matrix.
My question is: are there existing R functions to turn this into a hierarchical cluster with agnes that uses correlation clustering?
Will I ...
I found "Natural Language Processing with Python" today, and am wondering what other good, non-academic (the research papers tend to be too dry and/or specific to certain areas) NLP resources the SO community knows about.
I'm starting out in text processing for a couple of hobby projects, and am keen to find good places to start :)
...
I have a list of requirements for a software project, assembled from the remains of its predecessor. Each requirement should map to one or more categories. Each of the categories consists of a group of keywords. What I'm trying to do is find an algorithm that would give me a score ranking which of the categories each requirement is likel...
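A simple baseline for this kind of ranking is to count how many of a requirement's tokens are keywords of each category and normalize by the requirement's length. The sketch below is only that baseline. The function name score_categories and the shape of the categories argument (a dict from category name to keyword list) are assumptions.

```python
import re
from collections import Counter


def score_categories(requirement: str, categories: dict) -> list:
    """Rank categories by the share of requirement tokens that are category keywords."""
    tokens = re.findall(r"[a-z0-9]+", requirement.lower())
    counts = Counter(tokens)
    scores = {}
    for name, keywords in categories.items():
        hits = sum(counts[k.lower()] for k in keywords)
        scores[name] = hits / len(tokens) if tokens else 0.0
    # Highest score first; each entry is a (category, score) pair.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because every category gets a score, a requirement can be mapped to more than one category by keeping all categories above some cutoff. Exact token matching misses inflected forms such as "encrypted" for the keyword "encrypt", so stemming the tokens and keywords first is a natural next step.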
What is the best profanity filter (free / open source or paid commercial) which supports Java integration?
It needs to be able to take a string and return a clean string... Can be a web service and doesn't necessarily have to support Java...
Happy programming...
...
Hi,
I'm looking for a package (any language, really) that I can use on a corpus of 50 documents to perform inter-document similarity testing with various metrics, such as tf-idf, Okapi, language models, LSA, etc.
I want as a result a document similarity matrix, i.e. doc1 is x% similar to doc2, etc... This is for research purposes, not for pr...
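For the tf-idf case, a document similarity matrix for 50 documents is small enough to compute with the standard library alone. The sketch below uses cosine similarity over tf-idf vectors with a smoothed idf. The function names and the exact weighting formula are assumptions, and the other metrics mentioned (Okapi, language models, LSA) are not covered.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs: list) -> list:
    """Build one sparse tf-idf vector (a dict of term -> weight) per document."""
    tokenized = [re.findall(r"[a-z0-9]+", d.lower()) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_matrix(docs: list) -> list:
    """Return matrix[i][j] = cosine similarity between docs[i] and docs[j]."""
    vectors = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vectors] for a in vectors]
```

Multiplying an entry by 100 gives the "doc1 is x% similar to doc2" reading, with the caveat that cosine similarity is a geometric measure rather than a percentage of shared content.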
I have downloaded AraMorph 1.2.1 Perl version from SourceForge, but I do not know how to use it. Could someone explain to me how can I get it to work?
...
Hi,
I am looking for a lemmatisation implementation for English in Java. I found a few already, but I need something that does not need too much memory to run (1 GB at most).
Thanks.
I DO NOT NEED A STEMMER.
...
I'm creating an ELIZA-like chatterbot, and I'd like to calibrate it with Omegle, using what the other person types as the input.
If it were a regular HTML page, I could parse it and send the response back to some script, but checking the source code, I've noticed that the entire page is created using JavaScript, but obfuscates the entire...
Does anyone know? Is this a place to ask computer science questions or just programming questions?
...
I'm working on a project that already has a C++ base. I would like to have a plug-in for some natural language processing. I really like GATE, but I'm not sure if it's worth launching the JVM and splitting the project into C++ and Java portions. I noticed UIMA has a C++ framework, but I have not tried it, and it seems to have fewer features t...
I'm interested in developing a natural language command language for a domain with existing rules. I was very impressed when Terry Winograd's SHRDLU showed the way (the conversation below is 40 years old! Astonishing). Can we do better now and if so where can I get examples?
Person: Pick up a big red block.
Computer: OK.
Person: ...
I am using NLTK to extract nouns from a text-string starting with the following command:
tagged_text = nltk.pos_tag(nltk.Text(nltk.word_tokenize(some_string)))
It works fine in English. Is there an easy way to make it work for German as well? (I have no experience with natural language programming, but I managed to use the python nl...
Hello,
For my GAE app I need to do some natural language processing to extract the subject and object from an input sentence.
Apparently NLTK can't be installed (easily) on GAE so I am looking for another solution.
I noticed GAE comes with Antlr3 but from browsing their documentation it solves a different kind of grammar problem.
Any...