Character encoding for US Census Cartographic Boundary Files

Q:

Character encoding for US Census Cartographic Boundary Files

I'm trying to import the US Census cartographic boundary files (available here: http://www.census.gov/geo/www/cob/bdy_files.html ) into a GeoDjango application. However, Python raises UnicodeDecodeError on some records (for example, on the non-ASCII characters in Puerto Rico place names).

The shapefile attribute file (*.dbf) doesn't specify what character encoding it uses, and the shapefile spec doesn't define one. What is the correct character encoding to use?

A:

The US Census cartographic boundary files use the IBM850 character encoding (code page 850). Python code to decode these byte strings into Unicode would be as follows:

featurestring.decode("IBM850")  # already returns a Unicode string; no extra unicode() call is needed
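As a rough illustration of the decoding step, here is a minimal Python 3 sketch. The helper name decode_field and the sample bytes are made up for this example (they are not taken from the Census files); the sample simply uses byte 0x81, which is "ü" in code page 850, to show why an ASCII or UTF-8 decode fails while IBM850 succeeds.

```python
def decode_field(raw, encoding="IBM850"):
    """Decode a raw .dbf attribute value (bytes) into a text string.

    decode_field is a hypothetical helper for illustration only.
    "IBM850" is an alias Python accepts for its cp850 codec.
    """
    return raw.decode(encoding)


# Hypothetical sample value: "Mayagüez" with "ü" stored as byte 0x81 (cp850).
sample = b"Mayag\x81ez"

# Decoding as ASCII fails on the non-ASCII byte, which is the kind of
# UnicodeDecodeError described in the question.
try:
    sample.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# 0x81 is also not a valid start byte in UTF-8, so UTF-8 fails too.
try:
    sample.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

name = decode_field(sample)
print(name)
```

If the import goes through GeoDjango's LayerMapping, passing encoding="IBM850" to it may be enough, so that attribute strings are decoded for you instead of by hand; treat that as something to verify against the Django version in use.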

2010-03-19 12:54:24
