+1  A: 

This does seem like a case of double-encoding; I don't have much experience with Python, but try adjusting the MySQL connection settings as per the advice at http://tahpot.blogspot.com/2005/06/mysql-and-python-and-unicode.html

My guess is that the connection is latin1, so MySQL treats the incoming UTF-8 bytes as latin1 text and encodes them again before storing them in the UTF-8 field. The code there, specifically this bit:

EDIT: With Python when establishing a database connection add the following flag: init_command='SET NAMES utf8'.

In addition set the following in MySQL's my.cnf: default-character-set = utf8

is probably what you want.

phsource
It's strange, but calling `set names utf8` makes the problem worse. Leaving Django out of the picture, in a Python shell it turns that character into `\xc3\xa2\xe2\x82\xac\xe2\x80\x9c`. Then, if I call `set names latin1`, the character becomes `\xe2\x80\x93`. In PHP, it goes from – to –. So, having it set to latin1 actually makes it work fine in PHP. I'm pretty sure that Django calls `set names utf8` to prepare the connection, actually.
Alex JL
Aha, it appears I needed to call `set names` before inserting the data.
Alex JL
Inserting the data through PHP, that is. I'll go ahead and accept your answer (though I should note for future readers that the solution was to call `set names utf8` for the PHP connection, not the Python one).
Alex JL
A: 

I added `set names utf8` to my PHP data insertion sequence, and now in a Python shell the feared ndash shows up as `\x96`. This renders correctly when read and output through Django.

One unusual thing about my situation is that I'm inserting the data through PHP. Django issues `set names utf8` automatically, so if I had been inserting and reading the data through Django, this issue would likely not have appeared. PHP was using the default of latin1, I suppose.

As an interesting note, while before I could read the data from PHP and it showed up normally in the browser, now the ndash is � unless I call `set names` before reading the data.
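
One plausible reading of that � is sketched below in Python. It assumes that a latin1 connection hands back the en dash as the single cp1252 byte `0x96` and that the browser decodes the page as UTF-8; both are assumptions about this setup, not something verified against it.

```python
# Sketch: a lone cp1252 byte is not valid UTF-8, so it renders as U+FFFD.
EN_DASH = "\u2013"

latin1_bytes = EN_DASH.encode("cp1252")                 # what a latin1 connection would return
shown = latin1_bytes.decode("utf-8", errors="replace")  # what a UTF-8 page would display

print(latin1_bytes)       # b'\x96'
print(shown == "\ufffd")  # True
```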

So, it's working now and I hope I never have to understand whatever was going on before!

Alex JL
Yes, that is going to be a problem with your old data. If you can afford to take your DB offline for a bit, you can change the columns that are strings back to latin1; then, set them to blobs; then, set them back to utf8. This should fix the old double-encoded strings.
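
A rough Python sketch of that three-step conversion follows. The helper names, the table `articles`, and the column `body` are hypothetical, cp1252 stands in for MySQL's latin1, and `BLOB` may need to be swapped for a binary type matching the column. It would be safer to try this on a copy first, since rows that were already stored correctly could be damaged by the same steps.

```python
# Sketch only: builds the ALTER statements and simulates their effect on one value.

def repair_statements(table: str, column: str, column_type: str) -> list:
    """Return the three conversions: utf8 -> latin1 -> blob -> utf8."""
    return [
        # 1. Converting to latin1 undoes one layer of encoding.
        f"ALTER TABLE {table} MODIFY {column} {column_type} CHARACTER SET latin1",
        # 2. A binary type drops the character set label without touching the bytes.
        f"ALTER TABLE {table} MODIFY {column} BLOB",
        # 3. Relabel the raw bytes as utf8, again without converting them.
        f"ALTER TABLE {table} MODIFY {column} {column_type} CHARACTER SET utf8",
    ]


def simulate_repair(stored: bytes) -> str:
    """Apply the same steps to one double-encoded value, in memory."""
    raw = stored.decode("utf-8").encode("cp1252")  # step 1: utf8 text -> latin1 bytes
    return raw.decode("utf-8")                     # steps 2-3: reinterpret bytes as utf8


for statement in repair_statements("articles", "body", "TEXT"):
    print(statement + ";")

print(simulate_repair(b"\xc3\xa2\xe2\x82\xac\xe2\x80\x9c") == "\u2013")  # True
```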
phsource
Thankfully I caught all of this in the development stage, so I have the blissful flexibility to drop, truncate and recreate the tables to test everything. That might come in handy for the rest of the site, though... I have no idea if we have other data that is mis-encoded. Thanks for the tip on how to do that.
Alex JL