All of our tables currently use the LATIN1 character set. A user can still put together Unicode sequences on the client and try to submit them to our application. What's the best way to keep characters outside LATIN1 from reaching our database? Or, even better, what's the best way to ensure that only characters from the LATIN1 character set get inserted into the db?

+3  A: 

There are a couple things you can do.

First, you could add the accept-charset attribute to your form tags like so:

<form accept-charset="ISO-8859-1">

Unfortunately IE doesn't support this very well (of course), and the attribute is only a hint to the browser anyway, so you should also use iconv to convert the data once you have it on your server. The iconv() function converts a string from one charset to another. You can specify whether characters that don't exist in the target charset are transliterated, ignored, or cause a notice to be raised.
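
As a rough sketch of the three options, assuming the server side is PHP (which the mention of a notice suggests) and that the submitted data arrives as UTF-8: the helper name to_latin1() and its mode names are made up for illustration, while the //TRANSLIT and //IGNORE suffixes are part of PHP's iconv().

```php
<?php
// Sketch only: to_latin1() and its $mode values are hypothetical names,
// and the input is assumed to arrive as UTF-8.
function to_latin1(string $input, string $mode = 'strict')
{
    // A suffix on the target charset selects how iconv() treats
    // characters that have no ISO-8859-1 equivalent.
    $targets = [
        'translit' => 'ISO-8859-1//TRANSLIT', // approximate with similar characters
        'ignore'   => 'ISO-8859-1//IGNORE',   // drop them
        'strict'   => 'ISO-8859-1',           // fail on the first one
    ];
    if (!isset($targets[$mode])) {
        throw new InvalidArgumentException("Unknown mode: $mode");
    }

    // On failure iconv() raises an E_NOTICE and returns false; the @
    // silences the notice so the caller can test the return value instead.
    return @iconv('UTF-8', $targets[$mode], $input);
}

// "é" exists in Latin-1, so this converts cleanly.
$ok = to_latin1("caf\u{00E9}");

// These characters have no Latin-1 equivalent, so strict mode fails.
$bad = to_latin1("\u{65E5}\u{672C}");
if ($bad === false) {
    // Reject the submission, or fall back to another mode.
}
```

The results of //TRANSLIT and //IGNORE can differ between iconv implementations, so if you need predictable behavior, converting in strict mode and rejecting anything that fails is probably the safer choice.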

hobodave