We let users search a database from a single text input, and I'm having difficulty filtering some user-supplied strings.

For example, if the user submits:

��������� lcd SONY

(Note the ?'s) I need to cancel the search.

I've included the base64-encoded version of the above string so that it's easy to run:

print(base64_decode("1MfLxc/RwdPHIGxjZCBTT05Z"));

I've ignored such inputs before, but I've just realised (I'm not sure why) that the MySQL database query takes nearly forever to execute, so this is now high priority.

Another example, to highlight that we are using UTF-8 and that mb_detect_encoding is not helping much:

print(base64_decode("zqDOm8+Fzr3PhM63z4HOuc6/IM+Bzr/Phc+HzyU="));
ΠΛυντηριο ρουχ�%

So:

  • how can I detect/filter these inputs?
  • how is this input being generated?
+1  A: 

You shouldn't be getting that, although if you really want to filter (which I don't recommend), check for alphanumeric characters as well as "-.;" etc.

You can use some of these functions to help you in the filtering process.

http://www.php.net/manual/en/function.ctype-alnum.php
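
As a rough sketch of that kind of check: ctype_alnum() is byte-oriented, so in the default C locale it rejects non-ASCII letters such as Greek. A Unicode-aware whitelist could look something like the following. The function name is_acceptable_search is hypothetical, and the allowed punctuation is just the "-.;" set mentioned above plus a space; adjust it to whatever your search should accept.

```php
<?php
// Hypothetical helper: accept letters and digits from any script, plus
// spaces and the punctuation "-.;". The /u modifier makes PCRE treat both
// pattern and subject as UTF-8, so a subject that is not valid UTF-8
// makes preg_match() return false instead of 1.
function is_acceptable_search(string $input): bool
{
    return preg_match('/^[\p{L}\p{N} .;-]+$/u', $input) === 1;
}

var_dump(is_acceptable_search("παπουτσια"));
var_dump(is_acceptable_search("lcd SONY"));
// The first example string from the question (not valid UTF-8):
var_dump(is_acceptable_search(base64_decode("1MfLxc/RwdPHIGxjZCBTT05Z")));
```

Because an invalid byte sequence simply fails the match, this rejects the broken input without needing a separate encoding check, at the cost of also rejecting any punctuation you did not list.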

AjayP
Does that handle something like "παπουτσια"? No.
zaf
A: 

If you execute these queries after creating the connection to MySQL, it should handle UTF-8 input and results just fine without spitting out ?'s.

mysql_query("SET character_set_client=utf8", $mysqlConn);
mysql_query("SET character_set_connection=utf8", $mysqlConn);
mysql_query("SET character_set_results=utf8", $mysqlConn);

(This assumes the database is set to UTF-8 and that you don't mind not filtering the inputs if they no longer turn into ?'s.)

(It also assumes you are using MySQL; other DBMSs probably have similar functions.)

Jconnop
We are getting this input from the user before it even touches the database.
zaf
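
Since the input is already broken before it reaches the database, one option is to validate the raw bytes as UTF-8 as soon as they arrive and cancel the search if they fail. A minimal sketch, assuming the mbstring extension is available; the function name is_valid_utf8_query is hypothetical. The last lines are a guess at where the first example comes from: its bytes happen to decode cleanly as ISO-8859-7 (a legacy Greek encoding), which would be consistent with a client submitting the form in that charset, but that is an assumption and not something the thread confirms.

```php
<?php
// Hypothetical helper: true only when $input is well-formed UTF-8.
function is_valid_utf8_query(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

// The two example strings from the question.
$legacy    = base64_decode("1MfLxc/RwdPHIGxjZCBTT05Z");
$truncated = base64_decode("zqDOm8+Fzr3PhM63z4HOuc6/IM+Bzr/Phc+HzyU=");

var_dump(is_valid_utf8_query($legacy));      // not valid UTF-8
var_dump(is_valid_utf8_query($truncated));   // ends in a cut-off multi-byte sequence
var_dump(is_valid_utf8_query("παπουτσια"));  // well-formed UTF-8

// Assumption: the first string was sent as ISO-8859-7. Converting it on
// that basis gives readable Greek followed by "lcd SONY".
$converted = mb_convert_encoding($legacy, 'UTF-8', 'ISO-8859-7');
echo $converted, "\n";
```

Whether to reject such input outright or try a conversion like the one above is a design choice: rejecting is safer, while converting only works if you can be reasonably sure which legacy encoding your users' browsers send.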