Can anyone suggest a good source of names that I can use to help analyze some tables on web pages? The first column of the tables I am scraping contains names alone, names and titles, or just titles. The names can be as varied as John Smith and Vikram Saksena. I have been poking around for a compiled list of words that can be found in proper names.

Edited: I have tried the name set from the Census, and it has so much garbage in it that it's not worth working with.

+1  A: 

Download the Febrl project source code.

Its data folder contains tables for names (given/middle/surnames/etc). You may have to massage the data for your own needs.

For surnames, you can check around for U.S. Census data. I don't have the link right now, but I know I've used the common U.S. surnames from that source before.
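Once you have a name list, a rough way to sort the first-column cells into "name", "title", or both is a simple set lookup per word. This is only a minimal sketch: `KNOWN_NAMES` and `TITLE_WORDS` are tiny hypothetical sets made up for illustration, not taken from Febrl or the Census, so you would replace them with whatever tables you actually load.

```python
import re

# Hypothetical sample data for illustration only; in practice, fill these
# sets from the name tables you decide to use.
KNOWN_NAMES = {"john", "smith", "vikram", "saksena"}
TITLE_WORDS = {"president", "director", "chief", "executive", "officer", "vice"}


def tokenize(cell):
    """Lowercase a table cell and split it into alphabetic words."""
    return re.findall(r"[a-z']+", cell.lower())


def classify_cell(cell, names=KNOWN_NAMES, titles=TITLE_WORDS):
    """Label a cell by whether its words appear in the name or title sets."""
    tokens = tokenize(cell)
    has_name = any(token in names for token in tokens)
    has_title = any(token in titles for token in tokens)
    if has_name and has_title:
        return "name and title"
    if has_name:
        return "name"
    if has_title:
        return "title"
    return "unknown"


if __name__ == "__main__":
    for cell in ["John Smith", "Vikram Saksena, Chief Executive Officer", "Vice President"]:
        print(cell, "->", classify_cell(cell))
```

Anything that comes back as "unknown" is a candidate for adding to one of the sets, which is one way to do the data massaging mentioned above.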

Robert
Thanks, I will look at the folder.
PyNEwbie