Hello
I got a bunch of .DOC documents. I'm not even positive they are Word documents, but even if they are, I need to open and parse them with eg. Python to extract information from them.
Problem is, I couldn't figure out how they were encoded: UltraEdit's Conversion function wouldn't correct the text no matter which encoding I tried. OpenOffice 3.2 also failed displaying the contents correctly (guessing Windows-1252).
Here's an example, hoping that someone knows what pagecode it is:
"lÕAssemblŽe gŽnŽrale" instead of "l'Assemblée générale"
Thank you for any tip.