I have an old customer list of 4,000 businesses. I want to determine whether the phone number associated with each listing is still working (and therefore the business is probably still open). I can put each number into whitepages.com and check them one by one, but I want to automate the process. I have looked at their API and can't digest it. I can form the correct query URL, but trying things like curl -O doesn't work.

I have access to Mac tools and Unix tools, and could try various JavaScript approaches if anyone could point me in the right direction. I would even pay. Help?

Thx

+4  A: 

As per Pekka's comment, most companies with a public API don't allow scraping in their terms of service, so it's quite possible that performing 4k GET requests to their website will flag you as a malicious user and get you blacklisted!

Their API is RESTful and seems simple and pretty well documented, so definitely try to get that working instead of scraping the site. A good first attempt after getting your API key would be to write a shell script that performs a reverse phone number lookup. For example, if you had all 4000 10-digit phone numbers in a flat text file, one per line with no formatting, you could write a simple bash script as follows:

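#!/bin/bash
INPUT_FILE=phone_numbers.txt
OUTPUT_DIR=output
API_KEY='MyWhitePages.comApiKey'
BASE_URL='http://api.whitepages.com'

# Make sure the output directory exists before writing results into it.
mkdir -p "$OUTPUT_DIR"

# Perform a reverse lookup on each phone number in the input file.
while read -r PHONE; do
  [ -n "$PHONE" ] || continue   # skip blank lines
  URL="${BASE_URL}/reverse_phone/1.0/?phone=${PHONE};api_key=${API_KEY}"
  # Quote the URL so the shell passes it to curl as a single argument.
  curl -s "$URL" > "${OUTPUT_DIR}/result-${PHONE}.xml"
done < "$INPUT_FILE"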

Once you've retrieved all the results, you can either parse the XML to analyze the matching businesses or, if you're only interested in whether a listing exists, simply grep each output file for the string "The search did not find results", which the WhitePages.com API returns to indicate no match. If the grep succeeds, the business doesn't exist (or changed its phone number); otherwise it's probably still around (or another business now has that phone number).
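
The grep check could be sketched roughly like this. It assumes the results were saved as result-<PHONE>.xml files in one directory, and that a failed lookup contains exactly the no-match message named above. The function name classify_results and the tab-separated output format are made up for illustration and are not part of the WhitePages.com API.

```shell
#!/bin/bash
# Hypothetical helper: print "<phone><TAB>no_match" or "<phone><TAB>match"
# for every result-<PHONE>.xml file in the given directory.
# Assumption: a failed lookup contains the text
# "The search did not find results" somewhere in the response.
classify_results() {
  local dir="$1" file phone
  for file in "$dir"/result-*.xml; do
    [ -e "$file" ] || continue      # glob matched nothing: no result files
    phone="${file##*/result-}"      # drop the directory and "result-" prefix
    phone="${phone%.xml}"           # drop the ".xml" suffix
    if grep -q 'The search did not find results' "$file"; then
      printf '%s\tno_match\n' "$phone"
    else
      printf '%s\tmatch\n' "$phone"
    fi
  done
}
```

For example, running classify_results output > status.tsv would give you one line per phone number that you can sort or open in a spreadsheet. A "match" line only means the no-match message was absent, so an error response of some other kind would also show up as "match" and is worth spot-checking.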

maerics
This is very nice. Thanks for the pointers, maerics! I've tried their Pro feature and it never loads my .xls file. We moved on to doing it by hand (eek, I know), but we ignorant masses soldier on...
John Corbin
+2  A: 

Hi John,

As others have noted, it is a ToS violation to scrape our website or to store the data returned from the API. However, you can get the data you want from our Pro service at: https://pro.whitepages.com/list-update/upload_file

Dan
Whitepages API lead.

Dan
Hi Dan, I discovered this. However, when I try to upload my file, prepared per the instructions, it never uploads in Safari; the UI icon just spins. We've now done nearly 3000 by hand via copy and paste.
John Corbin
Hi John, I'm not sure we support Safari. Have you tried it with Firefox?
Dan