tags:

views:

11612

answers:

7

I want my python function to split a sentence (input) and store each word in a list. The code that I've written so far splits the sentence, but does not store the words as a list. How do I do that?

def split_line(text):

    # split the text
    words = text.split()

    # for each word in the line:
    for word in words:

        # print the word
        print(words)
+11  A: 
words = text.split()

this is your list

zalew
off course!! Sorry for asking such an obvius question
Thanx
+4  A: 

text.split() should be enough to store each word in a list. words is already a list of the words from the sentence, so there is no need for the loop. Second, it might be a typo, but you have your loop a little messed up. If you really did want to use append, it would be

words.append(word)

not

word.append(words)

nstehr
+3  A: 

string.split() returns a list:

Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed).

>>> line="a sentence with a few words"
>>> line.split()
['a', 'sentence', 'with', 'a', 'few', 'words']
>>>
gimel
+1  A: 

i think u r confused because of a typo.
replace print(words) with print(word) inside your loop to have every word printed on a different line

adi92
+1  A: 

I want my python function to split a sentence (input) and store each word in a list

The str().split() method does this, it takes a string, splits it into a list:

>>> the_string = "this is a sentence"
>>> words = the_string.split(" ")
>>> print(words)
['this', 'is', 'a', 'sentence']
>>> type(words)
<type 'list'> # or <class 'list'> in Python 3.0

The problem you're having is because of a typo, you wrote print(words) instead of print(word):

Renaming the word variable to current_word, this is what you had:

def split_line(text):
    words = text.split()
    for current_word in words:
        print(words)

..when you should have done:

def split_line(text):
    words = text.split()
    for current_word in words:
        print(current_word)

If for some reason you want to manually construct a list in the for loop, you would use the list append() method, perhaps because you want to lower-case all words (for example):

my_list = [] # make empty list
for current_word in words:
    my_list.append(current_word.lower())

Or more a bit neater, using a list-comprehension:

my_list = [current_word.lower() for current_word in words]
dbr
+1  A: 

Depending on what you plan to do with your sentence-as-a-list, you may want to look at the Natural Language Took Kit. It deals heavily with text processing and evaluation. You can also use it to solve your problem:

import nltk
words = nltk.word_tokenize(raw_sentence)

This has the added benefit of splitting out punctuation.

Example:

>>> import nltk
>>> s = "The fox's foot grazed the sleeping dog, waking it."
>>> words = nltk.word_tokenize(s)
>>> words
['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',', 
'waking', 'it', '.']

This allows you to filter out any punctuation you don't want and use only words.

Please note that the other solutions using string.split() are better if you don't plan on doing any complex manipulation of the sentance.

tgray
A: 

ok, how do i reverse that? i.e: ['The', 'fox', "'s", 'foot', 'grazed', 'the', 'sleeping', 'dog', ',', 'waking', 'it', '.']

to:

"The fox's foot grazed the sleeping dog, waking it."

Alyssa