nltk

Problem with NLTK import

Hi, Python crashes while using the "import nltk" command. All other import commands work. Tried re-installing with various sequences but couldn't resolve the issue. Python version: 2.6.5. NLTK version: 2.0b8. The following packages/libraries are installed: PyYAML, NumPy, SciPy, Matplotlib, easy_install. Regards, --Denzil ...

NLTK and language detection

How do I detect what language a text is written in using NLTK? The examples I've seen use nltk.detect, but when I've installed it on my mac, I cannot find this package. Cheers Nik ...

How can we run a Python script (which uses NLTK and Scrapy) from Java?

Hi all! I have written Python scripts that use scrapy, nltk and simplejson in my project, but I need to run them from Java as my mentor wants to deploy them on a server and I have very little time to do this. I took a glance at runtime.exec() in Java and at Jython; needless to say that running system commands from Java doesn't look simple eithe...

Free Tagged Corpus for Named Entity Recognition

Hey guys, I am looking for a free tagged corpus for a system to train on for Named Entity Recognition. Most of the ones I find (like the New York Times one) are expensive and not open. Can anyone help? ...

Using ChunkedCorpusReader in nltk

Hello, can someone please post some kind of example of using this to read a file? For example: the fox{{some_tag}} jumped over the((some_other_tag)) lazy dog. The API for it is at: http://nltk.googlecode.com/svn/trunk/doc/api/nltk.corpus.reader.chunked.ChunkedCorpusReader-class.html I can get it to read files and split them over com...

Extracting a set of words with the Python/NLTK, then comparing it to a standard English dictionary.

I have: from __future__ import division import nltk, re, pprint f = open('/home/a/Desktop/Projects/FinnegansWake/JamesJoyce-FinnegansWake.txt') raw = f.read() tokens = nltk.wordpunct_tokenize(raw) text = nltk.Text(tokens) words = [w.lower() for w in text] f2 = open('/home/a/Desktop/Projects/FinnegansWake/catted-several-long-Russian-nov...
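The comparison step named in the title can be sketched with plain Python sets. This is only an illustration, not the asker's code: `words_not_in_dictionary`, `sample_tokens` and `sample_dictionary` are hypothetical names, and the token list stands in for the output of a tokenizer such as `nltk.wordpunct_tokenize`.

```python
def words_not_in_dictionary(text_words, dictionary_words):
    """Return the sorted alphabetic words of the text that the dictionary lacks."""
    # Lowercase everything and drop punctuation-only tokens such as "," or "'s".
    vocabulary = {w.lower() for w in text_words if w.isalpha()}
    known = {w.lower() for w in dictionary_words}
    # Set difference keeps only words absent from the reference dictionary.
    return sorted(vocabulary - known)


# Hypothetical sample data standing in for the tokenized novel and word list.
sample_tokens = ["riverrun", ",", "past", "Eve", "and", "Adam", "'s"]
sample_dictionary = ["past", "eve", "and", "adam"]
unknown = words_not_in_dictionary(sample_tokens, sample_dictionary)
print(unknown)
```

Using sets rather than lists keeps each membership check cheap, which matters when the reference word list is large.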

Detect English verb tenses using NLTK

I am looking for a way, given an English text, to count the verb phrases in it that are in past, present and future tenses. For now I am using NLTK to do POS (part-of-speech) tagging and then counting, say, 'VBD' to get past tenses. This is not accurate enough though, so I guess I need to go further and use chunking, then analyze VP-chunks for specific tense ...
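The tag-counting baseline described above might look like the sketch below. It assumes the input is a list of `(word, tag)` pairs with Penn Treebank tags, the shape that `nltk.pos_tag` returns; `count_tenses` and `FUTURE_MODALS` are hypothetical names. It is deliberately the naive approach, so constructions such as "had walked" or "is going to walk" are not handled, which is the inaccuracy the question mentions.

```python
from collections import Counter

# Assumption: these modal forms are treated as markers of future tense.
FUTURE_MODALS = {"will", "shall", "'ll"}


def count_tenses(tagged_tokens):
    """Count rough tense indicators in a list of (word, tag) pairs."""
    counts = Counter()
    for word, tag in tagged_tokens:
        if tag == "MD" and word.lower() in FUTURE_MODALS:
            counts["future"] += 1      # modal such as "will"
        elif tag == "VBD":
            counts["past"] += 1        # simple past verb
        elif tag in ("VBP", "VBZ"):
            counts["present"] += 1     # present-tense verb
    return counts


# Hand-tagged sample standing in for real tagger output.
sample = [
    ("She", "PRP"), ("walked", "VBD"), ("home", "NN"), (".", "."),
    ("He", "PRP"), ("walks", "VBZ"), (".", "."),
    ("They", "PRP"), ("will", "MD"), ("walk", "VB"), (".", "."),
]
tense_counts = count_tenses(sample)
print(tense_counts)
```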

Can I use NLTK to determine if a comment is a positive one or a negative one?

Can you show me a simple example using http://www.nltk.org/code to determine if a string is about a happy or upset mood? ...

Break/Decompose complex and compound sentences in nltk

Is there a way to decompose complex sentences into simple sentences in nltk or other natural language processing libraries? For example: The park is so wonderful when the sun is setting and a cool breeze is blowing ==> The sun is setting. A cool breeze is blowing. The park is so wonderful. ...

How to config nltk data directory from code?

How to config nltk data directory from code? ...
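One possible approach, as far as I know, is that NLTK looks for its data in the directories listed in `nltk.data.path`, which is an ordinary Python list, so a directory can be added to it at runtime. The helper below is a sketch under that assumption; `register_data_dir` and the example directories are hypothetical. It works on any list, so it can be tried without NLTK installed.

```python
import os


def register_data_dir(search_path, directory):
    """Put `directory` at the front of `search_path` unless already present."""
    directory = os.path.abspath(os.path.expanduser(directory))
    if directory not in search_path:
        # Inserting at index 0 makes this directory the first one searched.
        search_path.insert(0, directory)
    return search_path


# With NLTK installed, the list to pass would be nltk.data.path:
#   import nltk
#   register_data_dir(nltk.data.path, "~/my_nltk_data")
demo_path = ["/usr/share/nltk_data"]  # stand-in for nltk.data.path
register_data_dir(demo_path, "/tmp/my_nltk_data")
print(demo_path)
```

Setting the `NLTK_DATA` environment variable before starting Python is another commonly used option.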

Project Gutenberg Python problem?

Hello everyone, I am trying to process various texts using regex and Python's NLTK (described at http://www.nltk.org/book). I am trying to create a random text generator and I am having a hard time with a problem. First, here is my algorithm: Enter a sentence as input (this is called the trigger string). Get the longest word in the trigger string. Sear...

I have text files in multiple languages. How to selectively delete one language in NLTK?

Maybe this is just impossible and I should give up all hope. Or maybe there's a really clever way to do it that I haven't thought of. Here's two examples of what I've got: يَبِسَ - يَيْبَسُ (yabisa, yaybasu)[y-b-s][ي-ب-س] (To become dry, stiff, rigid) 20:77 yabasan = dry. يَسَّرَ - يُيَسِّرُ (yassara, yuyassiru)[y-s-r][ي-س-ر...

Python code flow does not work as expected?

Hello everyone, I am trying to process various texts using regex and Python's NLTK (described at http://www.nltk.org/book). I am trying to create a random text generator and I am having a slight problem. Firstly, here is my code flow: Enter a sentence as input (this is called the trigger string and is assigned to a variable). Get the longest word in ...

Transforming early modern English into 20th century spelling using the NLTK

I have a list of strings that are all early modern English words ending with 'th.' These include hath, appointeth, demandeth, etc. -- they are all conjugated for the third person singular. As part of a much larger project (using my computer to convert the Gutenberg etext of Gargantua and Pantagruel into something more like 20th century ...
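A rule-based starting point for this kind of conversion could look like the sketch below. It is not the asker's method: `modernize` and the `IRREGULAR` table are hypothetical, the input is assumed to contain only third-person singular verbs, and verbs whose stem originally ended in a silent "e" (for example "loveth") would need a dictionary lookup that this sketch does not attempt.

```python
# Assumption: a small hand-written table covers the irregular forms.
IRREGULAR = {"hath": "has", "doth": "does", "saith": "says"}


def modernize(word):
    """Map an early modern '-eth' verb form to a modern third-person form."""
    lower = word.lower()
    if lower in IRREGULAR:
        return IRREGULAR[lower]
    if lower.endswith("eth") and len(lower) > 4:
        stem = lower[:-3]
        # Sibilant stems take "-es" in modern spelling (passeth -> passes).
        if stem.endswith(("s", "x", "z", "ch", "sh")):
            return stem + "es"
        return stem + "s"
    # Anything the rules do not cover is returned unchanged.
    return word


converted = [modernize(w) for w in ["hath", "appointeth", "demandeth", "passeth"]]
print(converted)
```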

How to make this random text generator more efficient in Python?

Hello everyone, I'm working on a random text generator -without using Markov chains- and currently it works without too many problems. Firstly, here is my code flow: Enter a sentence as input -this is called trigger string, is assigned to a variable- Get longest word in trigger string Search all Project Gutenberg database for sentences...

How can I randomize this text generator even further?

Hello everyone, I'm working on a random text generator -without using Markov chains- and currently it works without too many problems -actually generates a good amount of random sentences by my criteria but I want to make it even more accurate to prevent as many sentence repeats as possible-. Firstly, here is my code flow: 1-Enter a sen...

It's probably simpler in awk, but how can I say this in Python?

I have: Rutsch is for rutterman ramping his roe which is a phrase from Finnegans Wake. The epic riddle book is full of leitmotives like this, such as 'take off that white hat,' and 'tip,' all which get mutated into similar sounding words depending on where you are in the book itself. All I want is a way to find obvious occurrences of t...

Word sense disambiguation in NLTK Python

Hello friends, I am new to NLTK and Python and I am looking for a sample application which can do word sense disambiguation. I have found a lot of algorithms in search results but not a sample application. I just want to pass in a sentence and find out the sense of each word by referring to the WordNet library. Thanks. I have found a similar...

Creating a Python function that opens a textfile, reads it, tokenizes it, and finally runs from the command line or as a module

I have been trying to learn Python for a while now. By chance, I happened across chapter 6 of the official tutorial through a Google search link pointing here. When I learned, from that page, that functions were the heart of modules, and that modules could be called from the command line, I was all ears. Here's my first attempt at doing ...

Using NLTK and WordNet, how do I convert a simple-tense verb into its present, past or past participle form?

Hi. Using NLTK and WordNet, how do I convert a simple-tense verb into its present, past or past participle form? For example, I want to write a function which would give me the verb in the expected form as follows: v = 'go' present = present_tense(v) print present # prints "going" past = past_tense(v) print past # prints "went" Any suggestion...