Hi guys,

What would you recommend as best practice for handling or sanitizing usernames? In France, Spain, Greece and elsewhere, even if you limit yourself to the Latin alphabet, you still get periods, dashes, apostrophes and so on.

We are pulling user info from the Facebook API. We take the username if the user has set one; otherwise we take the name and transform it into a username (to log in with later on, for instance). We get things like Clément, D'Aquinne, John M. March, Anne-Sophie Blass, ... you know the drill.

How have you handled this?

+3  A: 

Full Unicode support in usernames.

If you are worried about impersonation via look-alike Unicode characters, you could display an automatically generated visual cue next to the name (much like the one new user accounts get here, but without letting a user icon replace it).
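A minimal sketch of such a cue, assuming a small symmetric pattern derived from a hash of the name is good enough. The function name `visual_cue` and the 5x5 text grid are made up for illustration; this is not how the avatars on this site are generated.

```python
import hashlib
import unicodedata
from typing import List

def visual_cue(username: str) -> List[str]:
    """Return a deterministic 5x5 mirrored pattern for a username."""
    # Compose to NFC first so the same name typed in composed or
    # decomposed form yields the same pattern.
    canonical = unicodedata.normalize("NFC", username)
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    rows = []
    for r in range(5):
        # Three hash bytes decide the left half (including the middle column).
        left = ["#" if digest[r * 3 + c] & 1 else "." for c in range(3)]
        # Mirror the first two columns to make the row symmetric.
        rows.append("".join(left + left[1::-1]))
    return rows

if __name__ == "__main__":
    for name in ("clément", "clement"):
        print(name)
        print("\n".join(visual_cue(name)))
```

Two names that look alike but differ in their code points will almost always get visibly different patterns, which is the point of showing the cue next to the name.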

If you still have to reduce names to the ASCII range, you can use the standard tools for Unicode text normalization. They are based on the Unicode equivalence rules: http://en.wikipedia.org/wiki/Unicode_equivalence
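As a rough sketch of that reduction in Python, using the standard `unicodedata` module: decompose with NFKD, drop the combining marks, then collapse everything that is not an ASCII letter or digit into a single separator. The function name `to_ascii_username` and the choice of `.` as separator are assumptions for illustration, not something your framework prescribes.

```python
import unicodedata

def to_ascii_username(name: str) -> str:
    """Reduce a display name to a lowercase ASCII username candidate."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    out = []
    for ch in stripped.lower():
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif out and out[-1] != ".":
            # Spaces, apostrophes, dashes, periods etc. become one separator.
            out.append(".")
    return "".join(out).strip(".")

if __name__ == "__main__":
    for name in ("Clément", "D'Aquinne", "John M. March", "Anne-Sophie Blass"):
        print(name, "->", to_ascii_username(name))
```

With the names from the question this gives `clement`, `d.aquinne`, `john.m.march` and `anne.sophie.blass`. Note the limits: letters that have no decomposition (ø, ß, or anything in the Greek alphabet) are not transliterated by normalization alone and end up as separators here, so those would need a transliteration table on top.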

It may be worth considering a middle ground: accept characters from the full Unicode range, for people who identify with their diacritics, but reject any new account whose normalized form matches that of an existing one. For example, once someone has set up /user/clément, nobody can create /user/clement or /user/clëment.
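Here is one way that check could look, as a sketch only. The names `canonical_key` and `UsernameRegistry` are hypothetical, and the in-memory dict stands in for what would normally be a unique index on a normalized column in your database.

```python
import unicodedata

def canonical_key(username: str) -> str:
    """Comparison key: diacritics removed, case folded."""
    decomposed = unicodedata.normalize("NFKD", username)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

class UsernameRegistry:
    """In-memory stand-in for a table with a unique normalized column."""

    def __init__(self):
        self._by_key = {}  # canonical key -> username as displayed

    def register(self, username: str) -> bool:
        key = canonical_key(username)
        if key in self._by_key:
            # Same normalized form as an existing account: refuse.
            return False
        # Store the name as typed (composed to NFC) for display.
        self._by_key[key] = unicodedata.normalize("NFC", username)
        return True

if __name__ == "__main__":
    registry = UsernameRegistry()
    for name in ("clément", "clement", "clëment", "clara"):
        print(name, registry.register(name))
```

Here `clément` is accepted and keeps its accent, `clement` and `clëment` are refused because they share the key `clement`, and `clara` goes through. This only catches diacritic and case variants; look-alike letters from other scripts (a Cyrillic "е", say) have different keys and would need a separate confusables check.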

relet