Hi all,

I'm writing a function to convert MS Word-styled text into Adobe InDesign-formatted text (InDesign uses a kind of XML to indicate styling). The text is pasted into a TinyMCE rich text editor, which then sends the resulting HTML to a PHP function.

I've tried this function to clean up the code once it reaches my conversion code:

$text = iconv("windows-1250", "UTF-8", $html);

Whenever the text contains 'special' characters, things go wrong. £ signs, é (or any other accented letter), and the various 'curly' apostrophes and quote marks all seem to break things. For example, if I try to convert a £ sign, the code returns \u0141, and the Ł symbol is displayed onscreen when the function returns.

Does anybody know what I can do to prevent Word's weird characters breaking everything I'm doing?

Thanks, Matt

A: 

I seem to have fixed this. I was using JavaScript's escape() to pass the values to the server. Replacing it with encodeURIComponent(), and removing the iconv() call from my PHP code, resolved the problem.
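For anyone who hits the same thing, here is a minimal sketch of why the swap probably helped. escape() encodes a £ as the single Latin-1 byte %A3, and byte 0xA3 in windows-1250 happens to be Ł, which would match the symptom in the question. encodeURIComponent() emits percent-encoded UTF-8 instead, so no conversion is needed on the server. The helper name compareEncodings is only for illustration and is not part of my actual code.

```javascript
// Characters from the question, written as escapes so the file encoding
// cannot interfere: pound sign, e-acute, right curly apostrophe.
const samples = ['\u00A3', '\u00E9', '\u2019'];

// Hypothetical helper: encode the same text with both functions.
function compareEncodings(text) {
  return {
    // Deprecated: code points below 256 become %XX (Latin-1 byte values),
    // anything higher becomes the non-standard %uXXXX form.
    legacy: escape(text),
    // Always percent-encodes the UTF-8 bytes of the text.
    utf8: encodeURIComponent(text),
  };
}

for (const ch of samples) {
  const { legacy, utf8 } = compareEncodings(ch);
  // The UTF-8 form decodes back to the original character unchanged.
  const roundTrips = decodeURIComponent(utf8) === ch;
  console.log(ch, legacy, utf8, roundTrips);
}
```

For the £ sign this gives %A3 from escape() and %C2%A3 from encodeURIComponent(). The second form is already UTF-8, so running iconv() from windows-1250 over it would corrupt the text rather than repair it.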

Matt Andrews