views:

58

answers:

2

Perhaps I don't have enough of an understanding of this yet, so I'm looking for a little direction.

All of our tables show a collation of latin1_swedish_ci. Here's what I see in the mysql variables:

collation connection utf8_general_ci
(Global value) latin1_swedish_ci
collation database latin1_swedish_ci
collation server latin1_swedish_ci

Now, we see utf8 (or, at least, foreign language content) stored in the db pretty frequently, and it renders correctly. Does the collation not matter for this?

Using something like php addslashes() on user input - is this sufficient? Or, does this leave an injection opportunity?

EDIT: So, looking at the full set of collation / charset settings, at least in phpmyadmin, I see:

character set client    utf8
(Global value)  latin1
character set connection    utf8
(Global value)  latin1
character set database  latin1
character set filesystem    binary
character set results   utf8
(Global value)  latin1
character set server    latin1
character set system    utf8
character sets dir  /usr/share/mysql/charsets/
collation connection    utf8_general_ci
(Global value)  latin1_swedish_ci
collation database  latin1_swedish_ci
collation server    latin1_swedish_ci
+2  A: 

The collation does only describe rules for comparing characters of a certain character set. One rule could be that a is equal A, b is equal B, etc. or that ß is equal to ss, ä is equal to ae, etc.

And for an explicit escaping of strings for MySQL, use mysql_real_escape_string. This function does in opposite to addslashes and mysql_escape_string take the actual character encoding of the connection into account.

But you need to set the character encoding of the connection with mysql_set_charset. Because otherwise a change will not be recognized (see C API Functions Description – mysql_real_escape_string()):

If you need to change the character set of the connection, you should use the mysql_set_character_set() function rather than executing a SET NAMES (or SET CHARACTER SET) statement. mysql_set_character_set() works like SET NAMES but also affects the character set used by mysql_real_escape_string(), which SET NAMES does not.

Gumbo
+1 this is the correct answer. although adodb and pdo use mysql_escape_string() and everyone loves parametrized queries...
Rook
mysql_real_escape_string doing it's special job only if mysql_set_charset() being used. otherwise it will act as mysql_escape_string
Col. Shrapnel
@Col. Shrapnel: Didn’t know that. But you’re right, see http://dev.mysql.com/doc/refman/5.0/en/mysql-real-escape-string.html.
Gumbo
it could be epic fail if utf8 weren't be a standard de-facto :) But it is, and it doesn't require any special attention from real escaping. Thus, mysql_real_escape_string is just overkill in most cases. Not to mention prepared statements which do not being affected by this setting at all.
Col. Shrapnel
A: 

All of our tables show a collation of latin1_swedish_ci
foreign language content renders correctly

There is something wrong with your database.
It will be either unable to store non-latin characters or unable to order/filter database contents properly.

To store foreign characters, utf8 charset should be set for the tables. As well as connection charset.

Using something like php addslashes() on user input - is this sufficient?

addslashes is sufficient, if your charsets latin1 and utf8 only. But the rest is wrong.

  1. addslashes() or other escaping function do not help alone! It works only with quotes around escaped data. Thus, it should be not just "Using something like addslashes()" but "Using something like addslashes() for quoted strings and type casting for numbers"
  2. Not for user input! Escaping is not for sanitizing! It's just for proper formatting of the query. Any query. With any data. Not only user input, as everyone in this poor world thinks, but for any data (that goes to the query as quoted strings).
Col. Shrapnel
@Col. Shrapnel: Thanks for the feedback. I included my full charset / collation settings above, not sure what's incorrect. Also, not sure what the settings from my php processes would be, will see if I can figure that out.
Neil