So far I have written some code that should print the source of the page, but it doesn't. When I tried it with another web site it printed the source fine, so I ran wget on the page "http://www.whitepages.com/carrier_lookup?carrier=other&number_0=2165138899&response=1" to see whether it could be downloaded at all. wget reported "ERROR 403: Forbidden.", so I'm not sure how to access the HTML now.

The second part of the problem: once I manage to download the HTML and save it as a string, I need to extract the carrier that the search found and store it in a separate string. In the page source, the carrier name is on the line directly below the [div class="carrier_result"] line. (In the previous sentence I replaced the < and > with brackets because sourceforge would not let me post the html.)

So far the code I have is: http://pastebin.com/u4HUv3Rj

Thanks to anyone who helps me with this.

+2  A: 

For an explanation of what a 403 result from HTTP means, and how to deal with it, see here.
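As a rough sketch of one common way to deal with a 403: some servers reject requests whose User-Agent header identifies a script or download tool, so sending a browser-like header is worth trying. Whether that is the cause for this particular site is an assumption, and `BROWSER_UA`, `build_request` and `fetch_html` are hypothetical names made up for the illustration (Python 3, standard library only).

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent value; any browser-like string could be tried here.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that sends an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    """Return the page source as text, or None if the server answers with an HTTP error."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=10) as resp:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")
    except urllib.error.HTTPError as err:
        # A 403 ends up here; report the status instead of raising.
        print("HTTP error", err.code, err.reason)
        return None
```

If `fetch_html` still prints a 403 with the header set, the server is refusing the request for some other reason and this approach will not help.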

I have no idea what "I need to save as a different string the carrier that the search found" can possibly mean -- I can't even parse it as an English sentence, nor do I know what "the line under the line" means either. Please rephrase (if English isn't your native language, I can try grokking Italian, French, Spanish, German, or Latin -- in decreasing probability and with no guarantee of success, but it can't be worse than w/your current phrasing;-).

Alex Martelli
Sorry about my ambiguous word choice. I will try to describe what I need to do more clearly. I have a program that needs to find which carrier an entered cellphone number is on. Since I can build the URL of the http://www.whitepages.com/carrier_lookup website directly in Python to look up a specific phone number, I figured there has to be some way to read a certain line in the source of the page. Looking through the source of the resulting page, I discovered that the name of the carrier is on the line following the <div class="carrier_result"> tag.
ErikT
Use BeautifulSoup to locate that specific `div` tag and get its contents; see http://www.crummy.com/software/BeautifulSoup/ .
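To make the idea of locating the `div` and reading its contents concrete, here is a minimal sketch that uses the standard library's `html.parser` in place of BeautifulSoup, only so that it runs without installing anything. `CarrierParser`, `extract_carrier` and the sample markup are hypothetical; the real page's markup may differ from what the question describes.

```python
from html.parser import HTMLParser


class CarrierParser(HTMLParser):
    """Collect the text inside the first <div class="carrier_result">."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # div nesting depth inside the target div
        self._done = False   # True once the target div has been closed
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self._done:
            return
        if self._depth:
            self._depth += 1  # nested div inside the target
        elif "carrier_result" in (dict(attrs).get("class") or "").split():
            self._depth = 1   # entered the target div

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self._done = True

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_carrier(html):
    """Return the carrier text with whitespace collapsed, or None if absent."""
    parser = CarrierParser()
    parser.feed(html)
    text = " ".join("".join(parser._chunks).split())
    return text or None


# Hypothetical markup modelled on the description in the question.
sample = '<div class="carrier_result">\n    Example Wireless\n</div>'
print(extract_carrier(sample))
```

With BeautifulSoup installed, the equivalent lookup would be along the lines of `soup.find("div", class_="carrier_result").get_text(strip=True)`, which is shorter and more forgiving of malformed HTML.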
Alex Martelli
Thank you. That might be exactly what I was looking for. I'll check it out now.
ErikT