Hey guys,

I'm somewhat confused about which functions to use to escape a string and create a slug.

I used this:

$slug_title = mysql_real_escape_string($mtitle);

but someone told me not to use it and to use urlencode() instead.

Which one is better for slugs and for security?

As far as I can see, SO inserts - between words:

http://stackoverflow.com/questions/941270/validating-a-slug-in-django

Thanks in advance.

+5  A: 

Using either MySQL or URL escaping is not the way to go.

The following function, taken from an article, does it better:

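function toSlug($string, $space = "-") {
    // Transliterate accented characters to ASCII when iconv is available.
    if (function_exists('iconv')) {
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT', $string);
        // iconv() returns false on failure; keep the original string then.
        if ($converted !== false) {
            $string = $converted;
        }
    }
    // Drop everything except ASCII letters, digits, spaces, and hyphens.
    $string = preg_replace("/[^a-zA-Z0-9 -]/", "", $string);
    $string = strtolower($string);
    // Replace spaces with the separator.
    $string = str_replace(" ", $space, $string);
    return $string;
}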

This also works correctly for accented characters.

Thomas
The problem is that this function won't work for Arabic and other multibyte languages; it changes all characters to -.
Mac Taylor
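
One way to address the comment above is to keep letters and digits from any script instead of only ASCII ones. The following is a minimal sketch, not code from the thread: the name unicodeSlug is made up for illustration, and it assumes the input is valid UTF-8 and that PCRE was built with Unicode support. Lowercasing non-ASCII letters additionally assumes the mbstring extension is loaded; without it, only ASCII letters are lowercased.

```php
<?php
// Hypothetical helper (not from the thread): keeps letters, marks, and digits of any script.
function unicodeSlug($string, $separator = '-')
{
    // Replace every run of characters that are not Unicode letters, combining marks, or digits.
    $slug = preg_replace('~[^\p{L}\p{M}\p{N}]+~u', $separator, $string);
    if ($slug === null) {
        return ''; // preg_replace() returns null on error, e.g. invalid UTF-8 input
    }
    // Strip separators left at either end.
    $slug = trim($slug, $separator);
    // Assumption: mbstring may be missing, so fall back to ASCII-only lowercasing.
    return function_exists('mb_strtolower') ? mb_strtolower($slug, 'UTF-8') : strtolower($slug);
}

echo unicodeSlug('Hello World!'), "\n";   // hello-world
echo unicodeSlug('مرحبا بالعالم'), "\n";   // مرحبا-بالعالم
```

A slug produced this way can contain non-ASCII characters, so it will be percent-encoded when it is sent as part of a URL.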
+3  A: 

Here's an alternative solution that is very similar to the one above but doesn't rely on iconv():

function Slug($string)
{
    return strtolower(trim(preg_replace(array('~[^0-9a-z]~i', '~-+~'), '-', preg_replace('~&([a-z]{1,2})(acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8'))), '-'));
}

$user = 'Alix Axel';
echo Slug($user); // alix-axel

$user = 'Álix Ãxel';
echo Slug($user); // alix-axel

$user = 'Álix----_Ãxel!?!?';
echo Slug($user); // alix-axel

Source: Alix Axel in this SO thread
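
The one-liner above packs several steps into a single expression. As a reading aid, here is a sketch of the same pipeline unpacked step by step; the name slugSteps is hypothetical and the behavior is intended to match Slug() above.

```php
<?php
// Hypothetical step-by-step rewrite of the one-liner above; the name slugSteps is illustrative.
function slugSteps($string)
{
    // 1. Turn accented characters into named entities, e.g. "Á" becomes "&Aacute;".
    $string = htmlentities($string, ENT_QUOTES, 'UTF-8');
    // 2. Keep only the base letter(s) of each such entity: "&Aacute;" becomes "A".
    $string = preg_replace(
        '~&([a-z]{1,2})(acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i',
        '$1',
        $string
    );
    // 3. Replace every remaining non-alphanumeric character with a hyphen.
    $string = preg_replace('~[^0-9a-z]~i', '-', $string);
    // 4. Collapse runs of hyphens into a single hyphen.
    $string = preg_replace('~-+~', '-', $string);
    // 5. Strip leading and trailing hyphens, then lowercase.
    return strtolower(trim($string, '-'));
}

echo slugSteps('Álix----_Ãxel!?!?'), "\n"; // alix-axel
```

Because step 3 only keeps ASCII letters and digits, characters that have no matching entity in step 2 (Arabic letters, for example) still end up as hyphens.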

John Conde
+1  A: 

mysql_real_escape_string() and urlencode() serve different purposes, and neither is appropriate for creating a slug.

A slug is supposed to be a clear & meaningful phrase that concisely describes the page.

mysql_real_escape_string() escapes characters in a value that could otherwise change the meaning of the SQL query it is inserted into.

urlencode() replaces characters that are not valid in a URL with "%" followed by two hex digits that represent their code (e.g. %3C for <; note that PHP's urlencode() encodes a space as +, while rawurlencode() uses %20). The resulting string is not clear & meaningful because of the unsightly character sequences, e.g. http://www.domain.com/bad%20slug%20here%20%3C--

Thus, any characters that urlencode() would alter should be omitted from a slug, except for spaces, which are usually replaced with -.
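
To make the difference concrete, here is a small sketch comparing the two encoders with the "omit and hyphenate" approach described above. The sample title and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample title, chosen to mirror the URL example above.
$title = 'bad slug here <--';

// urlencode() uses form encoding: spaces become "+".
echo urlencode($title), "\n";      // bad+slug+here+%3C--
// rawurlencode() follows RFC 3986: spaces become "%20".
echo rawurlencode($title), "\n";   // bad%20slug%20here%20%3C--

// Slug-style alternative: drop what would be encoded, then hyphenate.
$slug = preg_replace('~[^a-z0-9 -]~i', '', $title);  // remove everything but letters, digits, spaces, hyphens
$slug = preg_replace('~[ -]+~', '-', $slug);         // collapse runs of spaces and hyphens into one hyphen
$slug = strtolower(trim($slug, '-'));                // strip hyphens at the ends, then lowercase
echo $slug, "\n";                  // bad-slug-here
```

Whichever form is used for the URL, the value still needs proper escaping or parameter binding when it is written to the database; the two concerns are separate.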

Dor