views: 540

answers: 3
The only reliable method I have found for using a script to download text from Wikipedia is cURL. So far the only way I have of calling it is os.system(). Even though the output appears properly in the Python shell, I can't seem to get the function to return anything other than the exit code (0). Alternatively, somebody could show me how to properly use urllib.

+2  A: 

Answering the question: Python has a subprocess module which allows you to interact with spawned processes. http://docs.python.org/library/subprocess.html#subprocess.Popen

It allows you to read the stdout of the invoked process, and even send data to its stdin.
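The difference from os.system() can be sketched like this. It is a minimal, self-contained example that spawns a Python one-liner instead of cURL so it runs anywhere, but the same pattern works with any command line:

```python
import subprocess
import sys

# Spawn a child process and capture everything it writes to stdout.
# With os.system() you would only get the exit code back; here we get
# the output itself.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('hello from the child process')"],
    stdout=subprocess.PIPE,
)
output, _ = proc.communicate()  # waits for the process to finish

print(output.decode().strip())  # -> hello from the child process
print(proc.returncode)          # -> 0, the exit code os.system() gave you
```

To run cURL instead, replace the argument list with something like `["curl", url]`, assuming curl is on your PATH.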

However, as you said, urllib is a much better option. If you search Stack Overflow I am sure you will find at least 10 other related questions...

Cipher
A: 

As an alternative to urllib, you could use the libcurl Python bindings.
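A rough sketch with pycurl (the libcurl bindings); this assumes pycurl is installed and the network is reachable, so treat it as a starting point rather than tested code:

```python
import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, "http://en.wikipedia.org/wiki/Python_(programming_language)")
c.setopt(c.WRITEFUNCTION, buf.write)  # collect the response body
c.setopt(c.FOLLOWLOCATION, True)      # follow redirects
c.perform()
c.close()

print(buf.getvalue()[:100])
```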

gnud
+6  A: 

From Dive into Python:

import urllib
sock = urllib.urlopen("http://en.wikipedia.org/wiki/Python_(programming_language)")
htmlsource = sock.read()
sock.close()
print htmlsource

That will print out the source code for the Python Wikipedia article. I suggest you take a look at Dive into Python for more details.

Example using urllib2 from the Python Library Reference:

import urllib2
f = urllib2.urlopen('http://www.python.org/')
print f.read(100)

Edit: Also you might want to take a look at wget.
Edit2: Added urllib2 example based on S.Lott's advice

Sean
Thank you, the built-in help browser is almost never understandable.
GameFreak
urllib2 does almost the same thing, plus it handles things like redirects more gracefully.
S.Lott
@S.Lott I agree. I was just looking for a resource that GameFreak could learn more from, not just copy code from, and it turned out that the first resource I thought of, Dive into Python, used urllib.
Sean
http://www.python.org/doc/2.5.2/lib/urllib2-examples.html seems pretty clear.
S.Lott