views:

73

answers:

3

I need to figure out whether not a given location is considered urban or rural. I take it that the best way to do this is by looking at the population density of the city/state or province/country combination.

The kicker is that we're using this for data mining. Generally, mapping APIs that could do this have a requirement that each request must be in response to a single user action. This doesn't fit that criteria...using a web service, we would be making hundreds of web service calls for any single user action. So I think we can't really use something like the Google Maps API.

The problem is, what is available? Are there any databases ready to download which I can use to retrieve this data, or web services that actually allow data mining? I am using PHP, though the programming language doesn't really matter. I'm sure if I can get the data, I can get it to work with PHP.

+1  A: 

I don't know if there are any freely available ready-made databases providing this information out there.

You could download a DBpedia dataset, specifically the infobox dataset, and extract population/location data from that.

Michael Robinson
Thanks! I think I'll use this one. I've downloaded the mapping properties dataset, grepped out the population density lines, and will write a Python script to parse out the excess stuff to squash the filesize.
Tristan Crockett
A lot of useful (and accessible) info in those files :)
Michael Robinson
+1  A: 

Once I was browsing Google using query "city filetype:sql" and found very interesting database containing about 4000 largest cities and other cool geographical data.

Have a look here: http://www.dbis.informatik.uni-goettingen.de/Mondial/

Nazariy
+2  A: 

Are there any databases ready to download which I can use to retrieve this data, or web services that actually allow data mining?

For the US:

You might want to take a look at the gridded 1 km population estimates for the conterminous United States by decade from 1930 - 2000. (some more info)


For the World:

It looks like you want something like the Gridded Population of the World, version 3 (GPWv3), and the Global Rural-Urban Mapping Project (GRUMP) datasets.

There's a stand alone SEDAC Map Client and data downloads (here's some urban rural estimates data in Excel format).

You can obtain population estimates within a defined region using the Population Estimation Service

Population Estimation Service Features:

The service is accessible through three standard protocols used by many online map tools and clients: the Open Geospatial Consortium (OGC) Web Processing Service (WPS) standard, a Representational State Transfer (REST) interface, and a Simple Object Access Protocol (SOAP) interface. Standards-based clients such as uDig are able to submit requests using the OGC WPS. Users of ArcGIS software from ESRI can submit requests through SOAP. The REST interface is intended for use with lightweight javascript clients.

The parametric statistics returned for each supplied polygon include the count (number of grid cells used in the analysis), minimum population count, maximum population count, range of population counts, mean population counts, and standard deviation of population counts. Two measures of data quality are included in the service results. The first measure reflects the precision of the input data and the second indicates when the requested polygons are too small in area compared with the underlying input data to produce reliable population statistics.

Access:

To access the Population Estimation Service, users need to work with an online map client or Geographic Information System (GIS) software package that supports spatial queries through one of the three supported protocols. The service interfaces are available at:

Web Processing Service (WPS)

http://sedac.ciesin.columbia.edu/wps/WebProcessingService?Request=GetCapabilities&Service=WPS

REST/SOAP Services http://sedac.ciesin.columbia.edu/mapservices/arcgis/rest/services/sedac/GPW/GPServer

Peter Ajtai