I'm trying to import the US Census cartographic boundary files (available here: http://www.census.gov/geo/www/cob/bdy_files.html) into a GeoDjango application. However, Python raises UnicodeDecodeError on some records (for example, on the non-ASCII characters in the Puerto Rico data).

The shapefile's attribute file (*.dbf) doesn't declare which character encoding it uses, and the shapefile spec doesn't define one. What is the correct character encoding to use?

A: 

The US Census cartographic boundary files use the IBM850 character encoding (Python's cp850 codec, for which IBM850 is an accepted alias). Python code to properly decode these byte strings would be as follows:

featurestring.decode("IBM850")
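For a slightly fuller picture, here is a minimal sketch of the decoding step on its own, outside GeoDjango. The helper name `decode_field` and the sample bytes are made up for illustration; the sample assumes the name "Añasco" is stored with "ñ" as the single byte 0xA4, which is where cp850 places that character.

```python
# -*- coding: utf-8 -*-

def decode_field(raw, encoding="cp850"):
    """Decode a raw attribute value read from the .dbf file.

    Hypothetical helper: byte strings are decoded with the given
    encoding; values that are not byte strings (numbers, text that
    is already decoded) are returned unchanged.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# Hypothetical sample: "Añasco" with "ñ" stored as byte 0xA4 (cp850).
sample = b"A\xa4asco"

name = decode_field(sample)

# Decoding the same bytes as UTF-8 fails, which matches the error
# described in the question.
try:
    sample.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

If you load the data through GeoDjango's LayerMapping, it takes an `encoding` keyword argument, so passing `encoding="cp850"` there may remove the need to decode each field by hand.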