Hi, I'm trying to understand how I should handle special characters in URLs, because I'm building a site where users can store content and reach a content page by typing its name in the URL.
So, something like the Wikipedia or Last.FM websites.
On those sites, a user can type something like http://it.wikipedia.org/wiki/Trentemøller
and the artist's page can be reached.
After the page has loaded, if I copy the URL I see it written as: http://it.wikipedia.org/wiki/Trentemøller
but if I paste it into a text editor, it is pasted as
http://it.wikipedia.org/wiki/Trentem%C3%B8ller
so the character ø is pasted as %C3%B8.
Of course, the same happens with URLs like these (the page of the artist Takeshi Kobayashi):
http://www.last.fm/music/小林武史
http://www.last.fm/music/%E5%B0%8F%E6%9E%97%E6%AD%A6%E5%8F%B2
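To illustrate what I mean, here is a minimal PHP sketch of the two forms. It assumes the script file is saved as UTF-8; the variable names are only for illustration. It suggests that %C3%B8 is just the UTF-8 bytes of ø written in hexadecimal:

```php
<?php
// Assumption: this file is saved as UTF-8, so 'ø' is the two bytes C3 B8.
$name = 'Trentemøller';

// rawurlencode() percent-encodes every byte outside A-Z a-z 0-9 - _ . ~
$encoded = rawurlencode($name);

// rawurldecode() turns the %XX sequences back into the original bytes
$decoded = rawurldecode($encoded);

echo $encoded, "\n"; // Trentem%C3%B8ller
echo $decoded, "\n"; // Trentemøller

// The same applies to the Last.FM example
$kanji = rawurlencode('小林武史');
echo $kanji, "\n";   // %E5%B0%8F%E6%9E%97%E6%AD%A6%E5%8F%B2
```

So both URLs seem to describe the same bytes, which may be why either one reaches the same page.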
Whether I type the first or the second, the page works in either case. Why?
I think I should do something with .htaccess and mod_rewrite, but I'm not sure. Are special characters automatically converted to their percent-encoded form in the URL?
And then, how can I make PHP run the right query with the content name?
Say I have a table like this:
table_users
- username
- age
- height
- weight
- sex
- email
- country
With mod_rewrite, I can use an address like http://mysite.com/user/bob to get the username bob from table_users,
but what about http://mysite.com/user/小林武史 ?
Here is a simple example of what I have in mind:
#.htaccess
RewriteEngine On
RewriteRule ^(user/)([a-zA-Z0-9_+-]+)([/]?)$ user.php?username=$2
<?php
// this is the page user.php
// this is the way I use to get the url value
print $_REQUEST["username"];
?>
This works, but it's limited to [a-zA-Z0-9_+-]. How can I accept other characters like the ones above without giving up too much security?
Does anyone know a way to avoid trouble?
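One direction I'm considering, as an untested sketch rather than a known-good solution: loosen the rewrite pattern to something like ^user/([^/]+)/?$ and do the real validation in PHP. As far as I understand, the values in $_GET and $_REQUEST are already percent-decoded, so the check can run on the UTF-8 string directly. The helper names clean_username and find_user, the 64-character limit, and the $pdo connection are all my own assumptions:

```php
<?php
// Hypothetical helper: accepts a username made of letters and digits from
// any script plus "_", "+" and "-", between 1 and 64 characters long.
// The 64-character limit is an arbitrary assumption.
function clean_username(string $raw): ?string
{
    // With the /u modifier, preg_match() treats the subject as UTF-8 and
    // returns false for invalid byte sequences, so those are rejected too.
    // \A and \z anchor the whole string (unlike $, \z rejects a trailing newline).
    if (preg_match('/\A[\p{L}\p{N}_+-]{1,64}\z/u', $raw) !== 1) {
        return null;
    }
    return $raw;
}

// Hypothetical lookup: $pdo is assumed to be an already open PDO connection.
// The username is bound as a parameter, never concatenated into the SQL.
function find_user(PDO $pdo, string $username): ?array
{
    $stmt = $pdo->prepare('SELECT * FROM table_users WHERE username = ?');
    $stmt->execute([$username]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

In user.php this would be used roughly as clean_username($_GET['username'] ?? ''), showing a 404 page when it returns null. When printing the name back into HTML, I assume it should also go through htmlspecialchars().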