I have a Python module that makes use of a huge dictionary global variable. Currently I put the computation code in the top section, so every first import or reload of the module takes more than one minute, which is totally unacceptable. How can I save the computation result somewhere so that the next import/reload doesn't have to compu...
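One common way to approach this is to cache the computed dictionary on disk with the standard-library pickle module. The following is a minimal sketch, not a definitive answer: build_huge_dict, load_huge_dict, and CACHE_PATH are hypothetical names, and the tiny dictionary built here only stands in for the slow computation.

```python
import os
import pickle
import tempfile

# Hypothetical cache location; any writable path would do.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "huge_dict_cache.pkl")


def build_huge_dict():
    # Stand-in for the slow computation described in the question.
    return {n: n * n for n in range(1000)}


def load_huge_dict(cache_path=CACHE_PATH):
    """Return the dictionary, computing it only when no cache file exists."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    data = build_huge_dict()
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)
    return data


# Module-level global: slow on the first import only, read from disk afterwards.
HUGE_DICT = load_huge_dict()
```

The cache file has to be deleted by hand whenever the computation changes, since this sketch does not detect stale data.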
I want to use the NLTK libraries in C++.
Is there a glue language/mechanism I can use to do this?
Reason:
I haven't done any serious programming in C++ for a while and want to revise NLP concepts at the same time.
Thanks
...
I'm playing about with the Natural Language Toolkit (NLTK).
The documentation (Book and HOWTO) is a little heavy going. Are there any good but basic examples of the use of NLTK? I'm thinking of things like the NLTK articles on the Stream Hacker blog.
...
So what I'm trying to do is replace a string "keyword" with
"<b>keyword</b>"
in a larger string.
Example:
myString = "HI there. You should higher that person for the job. Hi hi."
keyword = "hi"
The result I want would be:
result = "<b>HI</b> there. You should higher that person for the job. <b>Hi</b> <b>hi</b>."
I will not...
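The example above suggests case-insensitive, whole-word matching that keeps the original capitalization ("higher" stays untouched). A minimal sketch with the standard-library re module, under that assumption; bold_keyword is a hypothetical helper name, and the truncated part of the question may add constraints this does not cover.

```python
import re


def bold_keyword(text, keyword):
    """Wrap whole-word, case-insensitive matches of keyword in <b> tags."""
    # re.escape guards against regex metacharacters in the keyword;
    # \b on both sides keeps "hi" from matching inside "higher".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    # m.group(0) is the text as it appeared, so the original case is preserved.
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", text)


my_string = "HI there. You should higher that person for the job. Hi hi."
result = bold_keyword(my_string, "hi")
print(result)
```

Note that \b only treats letters, digits, and underscores as word characters, so keywords that start or end with punctuation would need a different boundary test.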
I'm having trouble with the NLTK under Python, specifically the .generate() method.
generate(self, length=100)
Print random text, generated using a trigram language model.
Parameters:
* length (int) - The length of text to generate (default=100)
Here is a simplified version of what I am attempting.
import nltk
word...
My goal is to analyze some corpus (Twitter for now) for emotional content. Just today I realized it would make a bit of sense to search for word stems as opposed to having an exhaustive list of emotional words. And so I've been exploring nltk.stem, only to realize that there are 4 different stemmers. I'd like to ask the stackover...
I have been trying to make the NLTK (Natural Language Toolkit) work on the Google App Engine. The steps I followed are:
Download the installer and run it (a .dmg file, as I am using a Mac).
Copy the nltk folder out of the Python site-packages directory and place it as a sub-folder in my project folder.
Create a Python module in the fo...
I know of NLTK. What else is there that complements this library, or that can do AI?
NLTK is great because I can learn it with the book that comes with it.
Is there a library for AI just like this?
...
I am using NLTK to extract nouns from a text-string starting with the following command:
tagged_text = nltk.pos_tag(nltk.Text(nltk.word_tokenize(some_string)))
It works fine in English. Is there an easy way to make it work for German as well? (I have no experience with natural language programming, but I managed to use the python nl...
I'm very new to Python, and am trying to learn in conjunction with using nltk.
I've been following some examples and testing things out, but it seems I am very limited in what I can do because of errors returned by Python.
I know nltk is installed and importing fine, because this code works:
from nltk.sem import chat80
print chat8...
I am using their default POS tagging and default tokenization, and it seems sufficient. I'd like their default chunker too.
I am reading the NLTK book, but it does not seem like they have a default chunker?
...
I am trying to parse some text and diagram it, like you would a sentence. I am new to NLTK and am trying to find something in NLTK that will help me accomplish this. So far, I have seen nltk.ne_chunk and nltk.pos_tag. I find them to be not very helpful and I am not able to find any good online documentation.
I have also tried to use the...
I love to eat chicken.
Today I went running, swimming and played basketball.
My objective is to return FOOD and SPORTS just by analyzing these two sentences. How can you do that?
I am familiar with NLP and WordNet. But is there a more high-level, practical, or modern technology?
Is there anything that automatically categorizes w...
When do I use each?
Also, is the NLTK lemmatization dependent upon parts of speech?
Wouldn't it be more accurate if it were?
...
I'm currently looking at Python because I really like the text parsing capabilities and the nltk library, but traditionally I am a .NET/C# programmer. I don't think IronPython is an integration point for me because I am using NLTK and presumably would need a port of that library to the CLR. I've looked a little at Python for .NET and w...
How can I tell NLTK to treat the text as being in a particular language?
Background: once in a while I write a specialized NLP routine to do POS tagging, tokenizing, etc. on a non-English (but still Indo-European) text domain.
This question seems to address only different corpora, not the change in code/settings:
http://stackoverflow.com/questions/16...
I am reading this book (NLTK) and it is confusing. Entropy is defined as:
Entropy is the sum of the probability of each label
times the log probability of that same label
How can I apply entropy and maximum entropy in terms of text mining? Can someone give me an easy, simple example (visual)?
...
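To make the quoted definition concrete: the usual Shannon formula puts a minus sign in front of that sum, H = -sum(P(l) * log2(P(l))) over the labels, so the result is never negative. Below is a minimal sketch in plain Python rather than any NLTK API; the function name entropy and the sample label lists are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of labels."""
    counts = Counter(labels)
    total = len(labels)
    probs = [count / total for count in counts.values()]
    # Each label contributes -P(label) * log2(P(label)).
    return sum(-p * math.log2(p) for p in probs)


mixed = entropy(["pos", "pos", "neg", "neg"])   # evenly split labels
skewed = entropy(["pos", "pos", "pos", "neg"])  # mostly one label
pure = entropy(["pos", "pos", "pos", "pos"])    # a single label
```

An even two-way split gives 1 bit, the 3-to-1 split gives about 0.81 bits, and a set with a single label gives 0: the more mixed the labels, the higher the entropy.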
I'm trying to load some corpora I installed with the NLTK installer but I got a:
>>> from nltk.corpus import machado
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name machado
But in the download manager (nltk.download()) the package machado is marked as installed a...
Hello,
I am hand-tagging Twitter messages as Positive, Negative, or Neutral. I am trying to understand whether there is some logic one can use to decide what proportion of the messages in the training set should be positive, negative, and neutral.
So, for example, if I am training a Naive Bayes classifier with 1000 Twitter messages, should the proportion o...
Is there a research paper or book I can read that would tell me what sort of feature selection algorithm works best for the problem at hand?
I am trying to simply identify Twitter messages as pos/neg (to begin with). I started out with frequency-based feature selection (having started with the NLTK book) but soon realised that for a ...