I'm very new to Python, and am trying to learn in conjunction with using nltk.

I've been following some examples and testing things out, but I seem to be very limited in what I can do because of the errors Python returns.

I know nltk is installed and importing fine, because this code works:

from nltk.sem import chat80
print chat80.items

However, 'from nltk.tokenizer import *' returns

'File "stdin", line1. 
I get similar errors when using any sort of "TOKEN=" or, I'm guessing, tokenization of anything.

I've installed python many times in the last few days, hoping a different version or better install might help.

I'm getting this error on Windows 7 using ActivePython 2.6, though I've gotten similar errors with Python 3.1, ActivePython 3.1, and Python 2.6, as well as on Mac OS X 10.5 with Python 2.5.

The Mac gives a bit more detail: "ImportError: No module named tokenizer".

I'm just trying some of the introductory demos to nltk online, not even trying to write my own code yet, and I'm getting more errors than successes.

+2  A: 

Looks like the nltk package doesn't have a tokenizer module.

A quick look at the NLTK website suggests that 'from nltk.tokenize import *' is what you're after.
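
As a minimal sketch of the corrected import: this assumes an NLTK version whose nltk.tokenize module provides WhitespaceTokenizer. The guarded import and the str.split fallback are my own additions so the snippet still runs on a machine without NLTK; they are not part of NLTK itself.

```python
# Assumption: NLTK may or may not be installed, so the import is guarded.
try:
    # Note the module name: tokenize, not tokenizer.
    from nltk.tokenize import WhitespaceTokenizer
    tokenize = WhitespaceTokenizer().tokenize
except ImportError:
    # Hypothetical stand-in so the sketch still runs without NLTK.
    tokenize = str.split

# Split a sentence on whitespace and show the resulting tokens.
tokens = list(tokenize("NLTK has a tokenize module"))
print(tokens)
```

If the first import raises ImportError on a machine where 'import nltk' works, the installed version probably lays out its modules differently, and its own documentation is the place to check.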

Adam
Thanks Adam. I was using 'tokenizer' because I've seen it in many examples, such as this 'Getting Started with nltk' article (http://www.ibm.com/developerworks/linux/library/l-cpnltk.html). Using tokenize instead of tokenizer fails when trying to define Token, so I'm thinking maybe there is something to tokenizer rather than tokenize.
pedalpete
A: 

Adam's answer may well be correct for your immediate "tokenizer" problem. Here's some general advice:

It helps when one is in unfamiliar territory to read the road signs, e.g. this at the top of the Downloads page: """Although Python 3.0 is now available, many packages that NLTK requires do not have distributions for Python 3.0. For now you should use NLTK with Python 2.4.*, 2.5.*, or 2.6.* only.""" ... that would have saved you the effort of trying Python 3.1. Moreover, trying to learn Python 2.x and 3.x at the same time is a bit too much for a novice.

"""I've installed python many times in the last few days, hoping a different version or better install might help""" ... repeatedly installing the same version is unlikely to help.

"""However, from nltk.tokenizer import * returns File "stdin", line1 """ ... when asking for help, show your input and ALL of the output e.g.

>>> from nosuchthing import *
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named nosuchthing
>>>

and don't type from memory; use copy/paste.
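
If copying from the console is awkward, the complete traceback can also be captured as text. This is only a sketch using the standard library's traceback and importlib modules; import_report is a hypothetical helper name, not anything from NLTK.

```python
import importlib
import traceback

def import_report(module_name):
    """Try to import a module; return the full traceback text, or None on success."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # format_exc() returns the same text the interpreter would print.
        return traceback.format_exc()
    return None

# 'nosuchthing' is deliberately a module that does not exist.
report = import_report("nosuchthing")
print(report)
```

Pasting the whole of that text into a question gives readers the exception type and message, not just the first line.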

When faced with a problem, plan your investigation of possible causes. Look at those with high plausibility and low cost of investigation (e.g. typo or other transcription error) first. I can't recall where I read this advice, but it's worth remembering: "Before you blame acts of God and acts of Gates, check for acts of self".

John Machin
Though you are correct, John, I did read through the documentation, but kept running into issues with either nltk or easy_install, or pretty much anything else I was doing beyond the basics of 'print' or '2 + 2' in Python. This is why I was installing different versions to see if I could get one to work, and why I finally stuck with 2.6 on PC and 2.5 on Mac when everything else was failing. Also, 'tokenizer' is regularly used in all the code examples I've looked at, even from the nltk group, hence my trying to use that rather than tokenize.
pedalpete
@pedalpete: You say that you read the docs. Have another look at `http://nltk.googlecode.com/svn/trunk/doc/api/index.html` ... do you see a tokenizer module or a tokenize module? That developerworks article that you mentioned in another comment is over 5 years old. A very quick look at the svn repo shows tokenizer in nltk-old with a 2004 date on it. Hint: always go by what the current documentation tells you, not by what you dredge up on the web. Expect projects to go through major changes with severe backwards compatibility issues. BTW have you looked through the HOWTO section on the website?
John Machin
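
The installed package itself can answer the "tokenizer or tokenize" question too. A rough sketch using the standard library's pkgutil: submodules is a hypothetical helper name, and the demonstration uses the stdlib json package only so that it runs without NLTK. Passing "nltk" instead would list what that installation actually provides.

```python
import importlib
import pkgutil

def submodules(package_name):
    """Return the sorted names of the direct submodules of an installed package."""
    package = importlib.import_module(package_name)
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__))

# Demonstrated on the stdlib json package; use submodules("nltk") for NLTK.
names = submodules("json")
print(names)
print("tokenizer" in names, "decoder" in names)
```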