Hi all! I have installed lxml 2.2.2 on Windows (I am using Python 2.6.5). I tried this simple command:

from lxml.html import parse
p = parse('http://www.google.com').getroot()

But I am getting the following error:

Traceback (most recent call last):
  File "", line 1, in
    p=parse('http://www.google.com').getroot()
  File "C:\Python26\lib\site-packages\lxml-2.2.2-py2.6-win32.egg\lxml\html\__init__.py", line 661, in parse
    return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
  File "lxml.etree.pyx", line 2698, in lxml.etree.parse (src/lxml/lxml.etree.c:49590)
  File "parser.pxi", line 1491, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:71205)
  File "parser.pxi", line 1520, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:71488)
  File "parser.pxi", line 1420, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:70583)
  File "parser.pxi", line 975, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:67736)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:63820)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:64741)
  File "parser.pxi", line 563, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:64056)
IOError: Error reading file 'http://www.google.com': failed to load external entity "http://www.google.com"

I am clueless as to what to do next, as I am a newbie to Python. Please guide me in solving this error. Thanks in advance!! :)

A: 

Check whether you are able to open http://www.google.com; your internet connection might be down.

Tumbleweed
A: 

lxml.html.parse does not fetch URLs.

Here's how to do it with urllib2:

>>> from urllib2 import urlopen
>>> from lxml.html import parse
>>> page = urlopen('http://www.google.com')
>>> p = parse(page)
>>> p.getroot()
<Element html at 1304050>
MattH
Thank you very much for clarifying! This works great :)
pythonisgr8
You are very welcome!
MattH
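
For anyone reading this on Python 3: urllib2 no longer exists there, and urlopen lives in urllib.request instead. Below is a minimal sketch of the same approach as MattH's answer (download the page yourself, then hand the open response to lxml). The helper names parse_html_stream and fetch_root and the 10-second timeout are my own assumptions for illustration, not part of lxml or of the original answer.

```python
from urllib.request import urlopen  # Python 3 home of urllib2.urlopen

from lxml.html import parse


def parse_html_stream(stream):
    """Parse an open file-like object and return the root element.

    Hypothetical helper: it only wraps lxml.html.parse(...).getroot().
    """
    return parse(stream).getroot()


def fetch_root(url, timeout=10):
    """Download the URL with urlopen, then let lxml parse the response.

    Hypothetical helper; the timeout default is an arbitrary choice.
    """
    with urlopen(url, timeout=timeout) as response:
        return parse_html_stream(response)
```

Calling fetch_root('http://www.google.com') should then give you the root html element, just like p.getroot() in the answer above, provided the network connection is working.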