I have the following code:

from BeautifulSoup import BeautifulSoup  # assumed import: BeautifulSoup 3 on Python 2

f = open(path, 'r')
html = f.read()  # no arguments => reads to EOF and returns a string
f.close()

soup = BeautifulSoup(html)
schoolname = soup.findAll(attrs={'id': 'ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel'})
print schoolname

which gives:

[<span id="ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel">A B Paterson College, Arundel, QLD</span>]

When I try to access the value (i.e. 'A B Paterson College, Arundel, QLD') using schoolname['value'], I get the following error:

    print schoolname['value']
TypeError: list indices must be integers, not str

What am I doing wrong to get that value?

+1  A: 

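findAll returns a list of matching tags, which is why indexing it with a string raises the TypeError. Pick out an element by integer index or loop over the list, then use contents to move down the tree: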

>>> for x in schoolname:
...     print x.contents
...
[u'A B Paterson College, Arundel, QLD']

Note that contents is a list of the tag's children, and its items are not necessarily strings: in general they can also be further tags, or a mixture of strings and tags.
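If only the text is needed, a minimal sketch along the following lines may help. It assumes the newer bs4 package on Python 3 (not the BeautifulSoup 3 / Python 2 setup in the question), uses an inline copy of the markup instead of reading a file, and the variable names are purely illustrative.

```python
from bs4 import BeautifulSoup  # assumption: bs4 is installed (pip install beautifulsoup4)

# Inline copy of the markup from the question, so the sketch runs without a file.
html = (
    '<span id="ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel">'
    'A B Paterson College, Arundel, QLD</span>'
)
label_id = "ctl00_ContentPlaceHolder1_SchoolProfileUserControl_SchoolHeaderLabel"

soup = BeautifulSoup(html, "html.parser")

# find_all returns a list-like ResultSet, so index it with an integer (or iterate).
matches = soup.find_all(attrs={"id": label_id})
first_name = matches[0].get_text()

# find returns the first matching tag directly, or None when nothing matches.
tag = soup.find(id=label_id)
school_name = tag.get_text() if tag is not None else None

print(first_name)
print(school_name)
```

Since an id should be unique within a page, find is probably the more natural call here; get_text joins all the strings inside the tag, so it also copes with the case where contents holds a mixture of strings and tags.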

Mark Byers