How should I format URLs with special/international characters?
Currently I try to make URLs "look good", so that:
www.myhost.com/this is a test, do you know how?
is converted to:
www.myhost.com/this_is_a_test_do_you_know_how
I know some international letters could be converted (ü = ue, æ = ae, å = aa), and some characters could simply be removed. In general I try to make the URL look "good", but is that stupid?
But what do I do with Chinese, Japanese, or Arabic characters that have nothing to do with our Western ASCII format?
I really don't like the idea of rewriting the URL with hex codes (percent-encoding), so right now I just use my internal unique ID if the URL contains too many "non-convertible" characters.
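
To make the question concrete, here is a minimal Python sketch of the approach described above. It is an illustration under assumptions, not my actual code: the function name `slugify`, the `fallback_id` parameter, and the `max_lost_ratio` threshold of 0.3 are all hypothetical. Only the three mappings in `TRANSLIT` come from the examples above. Stripping accents via Unicode decomposition is an extra step I assumed for letters like é.

```python
import re
import unicodedata

# Explicit replacements; only these three come from the question.
# A real table would need many more entries (assumption).
TRANSLIT = {"ü": "ue", "æ": "ae", "å": "aa"}


def slugify(title, fallback_id, max_lost_ratio=0.3):
    """Return an ASCII slug for title, or str(fallback_id) if too much is lost.

    max_lost_ratio is a hypothetical threshold: the share of letters/digits
    that may be dropped before falling back to the internal ID.
    """
    # 1. Apply the explicit replacements first, so "ü" becomes "ue" and not "u".
    text = "".join(TRANSLIT.get(ch, ch) for ch in title)

    # 2. Decompose the remaining characters and drop combining marks (é -> e).
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))

    # 3. Count letters and digits, and how many of them are still non-ASCII.
    significant = [ch for ch in text if ch.isalnum()]
    lost = [ch for ch in significant if not ch.isascii()]
    if not significant or len(lost) / len(significant) > max_lost_ratio:
        return str(fallback_id)

    # 4. Collapse every run of other characters into a single underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")
    return slug or str(fallback_id)
```

With this sketch, `slugify("this is a test, do you know how?", 42)` gives `this_is_a_test_do_you_know_how`, `slugify("blåbær über alles", 42)` gives `blaabaer_ueber_alles`, and a fully Japanese title such as `slugify("日本語のタイトル", 42)` falls back to `42`.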