I'm playing about with the Natural Language Toolkit (NLTK).
The documentation (Book and HOWTO) is a little heavy going. Are there any good but basic examples of the use of NLTK? I'm thinking of things like the NLTK articles on the Stream Hacker blog.
...
I'm looking for an open source implementation, preferably in Python, of Textual Sentiment Analysis (http://en.wikipedia.org/wiki/Sentiment_analysis). Is anyone familiar with such an open source implementation that I can use?
I'm writing an application that searches Twitter for some search term, say "youtube", and counts "happy" tweets vs. "sad"...
Does anybody know of something similar to Date.js in Ruby? Something that would be able to return a date object from something like: "two weeks from today". The Remember the Milk webapp incorporates this feature into their system and it is incredibly easy to use.
I would use the Date.js library itself but because it is on the client sid...
I am trying to find words (specifically physical objects) related to a single word. For example:
Tennis: tennis racket, tennis ball, tennis shoe
Snooker: snooker cue, snooker ball, chalk
Chess: chessboard, chess piece
Bookcase: book
I have tried to use WordNet, specifically the meronym semantic relationship; however, this method is...
I am looking to use a natural language parsing library for a simple chat bot. I can get the part-of-speech (POS) tags, but I always wonder: what do you do with them? If I know the parts of speech, what then?
I guess it would help with the responses. But what data structures and architecture could I use?
...
Why do some countries use a comma as the separator and others a dot? Do you know the reason for that? It's very annoying to check every time which one you should use.
...
I'm trying to parse a string in a self-made language into a sort of tree, e.g.:
# a * b1 b2 -> c * d1 d2 -> e # f1 f2 * g
should result in:
# a
  * b1 b2
    -> c
  * d1 d2
    -> e
# f1 f2
  * g
#, * and -> are symbols. a, b1, etc. are texts.
At the moment I only know the RPN method for evaluating expressions, and my current solution...
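A minimal sketch of one way to build such a tree, in Python. It assumes (this is not stated in the question) that each symbol has a fixed nesting depth, with # at the top, * below it and -> below that, and that tokens are separated by whitespace. The names LEVELS, parse and render are made up for illustration.

```python
# Assumed nesting depth per symbol; the question does not spell this out.
LEVELS = {"#": 0, "*": 1, "->": 2}

def parse(source):
    """Return the top-level nodes; each node is a dict with symbol, text, children."""
    root = {"symbol": None, "text": "", "children": []}
    stack = [(-1, root)]  # (depth, node) pairs for the branch currently open
    node = None
    for token in source.split():
        if token in LEVELS:
            depth = LEVELS[token]
            # Close every open node at the same depth or deeper.
            while stack[-1][0] >= depth:
                stack.pop()
            node = {"symbol": token, "text": "", "children": []}
            stack[-1][1]["children"].append(node)
            stack.append((depth, node))
        elif node is not None:
            # Plain text belongs to the most recently opened node.
            node["text"] = (node["text"] + " " + token).strip()
    return root["children"]

def render(nodes, indent=0):
    """Flatten the tree back into indented lines, two spaces per level."""
    lines = []
    for n in nodes:
        lines.append("  " * indent + n["symbol"] + " " + n["text"])
        lines.extend(render(n["children"], indent + 1))
    return lines

tree = parse("# a * b1 b2 -> c * d1 d2 -> e # f1 f2 * g")
print("\n".join(render(tree)))
```

A stack of open nodes is enough here because a symbol only ever attaches to the nearest open node of a shallower depth; no operator precedence is involved, so RPN is not needed.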
I had an assignment for university which basically said:
"Demonstrate that the non-regular language L={0^n 1^n : n natural} has no infinite regular sublanguages."
I demonstrated this by contradiction. I basically said that there is a language S which is a sublanguage of L and is a regular language. Since the possible Regular expre...
I have a large number of text files (1000+) each containing an article from an academic journal. Unfortunately each article's file also contains a "stub" from the end of the previous article (at the beginning) and from the beginning of the next article (at the end).
I need to remove these stubs in preparation for running a frequency an...
Hello!
I have a sentence, for example
John Doe moved to New York last year.
Now I split the sentence into the single words and I get:
array('John', 'Doe', 'moved', 'to', 'New', 'York', 'last', 'year')
That's quite easy. But then I want to combine the single words to get all the composed terms. It doesn't matter if the composed term...
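A rough sketch in Python, assuming "composed terms" means every run of adjacent words (the excerpt is cut off, so this reading is a guess). The function name contiguous_terms and the max_len parameter are made up for illustration.

```python
def contiguous_terms(words, max_len=None):
    """Return every run of adjacent words, shortest runs first."""
    n = len(words)
    if max_len is None:
        max_len = n
    return [
        " ".join(words[start:start + size])
        for size in range(1, max_len + 1)
        for start in range(n - size + 1)
    ]

words = ["John", "Doe", "moved", "to", "New", "York", "last", "year"]
# Limit to two-word terms to keep the output short.
print(contiguous_terms(words, max_len=2))
```

For a sentence of n words this produces n(n+1)/2 terms when max_len is left unset, so capping the length is usually worthwhile.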
Hello!
The German website nandoo.net offers the possibility to shorten a news article. If you change the percentage value with a slider, the text changes and some sentences are left out.
You can see that in action here:
http://www.nandoo.net/read/article/299925/
The news article is on the left side and tags are marked. The slider...
Is there any library that can be used for analyzing (NLP) simple English text? For example, it would be perfect if it could do this:
Input: "I am going"
Output: I, go, present continuous tense
...
I am having a hard time finding a way to detect whether two words rhyme in English. It doesn't have to be the same syllabic ending, but something closer to phonetic similarity.
I cannot believe that in 2009 the only way of doing it is using those old-fashioned rhyme dictionaries. Do you know any resources (in PHP would be a plus) to ...
Been Googling around without finding much at all, so does anyone know of a class or library that helps you parse any sort of language, like a Domain Specific Language (I'm creating one, so I'm flexible in what the syntax and format can be) into either PHP code or some helpful struct or a class hierarchy or ... ? Anything goes at this poin...
I'm preparing some table names for an ORM, and I want to turn plural table names into single entity names. My only problem is finding an algorithm that does it reliably. Here's what I'm doing right now:
If a word ends with -ies, I replace the ending with -y
If a word ends with -es, I remove this ending. This doesn't always work however...
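The two rules above, written out as a Python sketch so the failure case is easy to see. Only the rules stated in the question are implemented; the function name naive_singular is made up for illustration.

```python
def naive_singular(word):
    """Apply only the two suffix rules described in the question."""
    if word.endswith("ies"):
        return word[:-3] + "y"   # categories -> category
    if word.endswith("es"):
        return word[:-2]         # boxes -> box
    return word

for plural in ["categories", "boxes", "tables"]:
    print(plural, "->", naive_singular(plural))
```

The last word shows the problem: "tables" ends with -es, so the second rule strips too much and yields "tabl". Suffix rules alone cannot tell "boxes" from "tables", which is why an exception list or dictionary lookup is usually needed alongside them.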
I'm looking for a culturally-sensitive way to properly insert a noun into a sentence while using the appropriate article (a/an). It could use String.Format, or possibly something else if the appropriate way to do this exists elsewhere.
For example:
Base Sentence: "You are looking at a/an {0}"
This should format to: "You are looking at...
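A naive sketch in Python of the spelling-based approach, for English only. It picks the article from the first letter, which is an assumption that fails for words whose sound differs from their spelling (for example "hour" or "university"). The function name with_article is made up for illustration, and the placeholder is used without the literal "a/an" so that the helper supplies the article.

```python
def with_article(noun):
    """Prefix noun with 'a' or 'an' based on its first letter only."""
    starts_with_vowel = bool(noun) and noun[0].lower() in "aeiou"
    article = "an" if starts_with_vowel else "a"
    return article + " " + noun

template = "You are looking at {0}"
print(template.format(with_article("apple")))
print(template.format(with_article("book")))
```

Because the correct article depends on pronunciation rather than spelling, a more reliable version would need a pronunciation dictionary or an exception list.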
I am looking at writing a compiler and after I complete something in a "C" style I am looking at adapting it to other models. What are some syntactical constructs you would expect to see in a "natural" programming language?
The target platform for this compiler will be the CLR and I am currently using Oslo+MGrammar for the lexer/pars...
I assume a natural language processor would need to be used to parse the text itself, but what suggestions do you have for an algorithm to detect a user's mood based on text that they have written? I doubt it would be very accurate, but I'm still interested nonetheless.
EDIT: I am by no means an expert on linguistics or natural language...
The most common part-of-speech tagset for German is the STTS tagset. I need an English translation of the explanations for each tag. Not being a linguist, I don't feel comfortable (let alone qualified) translating this myself.
Google turned up nothing, so any help is appreciated.
...
I have a table which indexes the locations of words in a bunch of documents.
I want to identify the most common bigrams in the set.
How would you do this in MSSQL 2008?
The table has the following structure:
LocationID -> DocID -> WordID -> Location
I have thought about trying to do some kind of complicated join... and I'm just doing...
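A sketch of the counting logic in Python rather than T-SQL, to show the shape of the join. It assumes (not confirmed by the question) that Location is an integer word position within a document, so that a bigram is a pair of rows with the same DocID and consecutive Location values. The function name count_bigrams and the sample rows are made up for illustration.

```python
from collections import Counter

def count_bigrams(rows):
    """rows: iterable of (doc_id, word_id, location) tuples."""
    # Index each word by (document, position) so the next word can be looked up.
    by_position = {(doc, loc): word for doc, word, loc in rows}
    counts = Counter()
    for (doc, loc), word in by_position.items():
        following = by_position.get((doc, loc + 1))
        if following is not None:
            counts[(word, following)] += 1
    return counts

# Hypothetical sample data: (DocID, WordID, Location)
rows = [
    (1, 10, 0), (1, 20, 1), (1, 10, 2), (1, 20, 3),
    (2, 10, 0), (2, 20, 1),
]
print(count_bigrams(rows).most_common(2))
```

The lookup on (doc, loc + 1) plays the role of a self-join of the table on equal DocID and Location + 1, followed by a GROUP BY on the two WordID columns.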