I'm adding data from a CSV file into a database. If I open the CSV file, some of the entries contain bullet points, and I can see them. The file command says it is encoded as ISO-8859:
    $ file data_clean.csv 
    data_clean.csv: ISO-8859 English text, with very long lines, with CRLF, LF line terminators
I read it in as follows, converting it from ISO-8859-1 to UTF-8, which my database requires:
    row = [unicode(x.decode("ISO-8859-1").strip()) for x in row]
    print row[4]    
    description = row[4].encode("UTF-8")
    print description
This gives me the following:
    '\xa5 Research and insight \n\xa5 Media and communications'
    ¥ Research and insight 
    ¥ Media and communications 
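For completeness, here is a self-contained version of what I'm running (a minimal sketch: the real script does more, but the file name and column index are the ones above):

    import csv

    # Minimal reconstruction of the loop: read the CSV, decode each field
    # from ISO-8859-1, and re-encode column 4 as UTF-8 for the database.
    with open("data_clean.csv", "rb") as f:
        for row in csv.reader(f):
            row = [unicode(x.decode("ISO-8859-1").strip()) for x in row]
            print row[4]
            description = row[4].encode("UTF-8")
            print description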
Why is the \xa5 bullet character being converted to a yen symbol?
I assume it's because I'm reading the file in with the wrong encoding, but what is the right encoding in this case? It isn't cp1252 either.
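As a sanity check in the Python 2 shell, Latin-1 really does map byte 0xa5 to the yen sign, so the decode is doing what I asked for, just not what I want:

    >>> "\xa5".decode("ISO-8859-1")
    u'\xa5'
    >>> print "\xa5".decode("ISO-8859-1")
    ¥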
More generally, is there a tool where you can specify (i) a byte string and (ii) the character it is supposed to represent, and find out which encodings make that mapping?
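The best I can come up with so far is to brute-force it myself: decode the problem byte with a list of candidate codecs and see which ones give me the bullet character. A rough sketch (the candidate list is just the encodings I happened to try, not exhaustive):

    # Try a handful of candidate codecs and print the ones that decode
    # byte 0xa5 to the bullet character (U+2022).
    raw = "\xa5"
    target = u"\u2022"  # the bullet I expect to see
    for name in ["iso-8859-1", "cp1252", "mac_roman", "cp437", "cp850", "koi8_r"]:
        try:
            if raw.decode(name) == target:
                print name
        except (UnicodeDecodeError, LookupError):
            pass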