If yes, then why are simple characters like dots (.) and underscores (_) not allowed in some services?

Generally, as a developer, what should we look out for when we make the decision of which characters should/shouldn't be allowed in usernames and other text fields?

An example: It is frustrating to see that the username Senthil already exists. Okay, fine, it's a pretty common name where I live. But then I am forced to tack on some numbers. Instead, it would be great if I could use my full name, C. Senthil Kumar, because it is less likely that someone else has already registered it. But half the services I use don't allow a dot (.) in the username.

What are the issues that are stopping these services from accepting usernames that actually mean something to the users instead of some crazy stuff like csenthk08?

I am going to try to build such a service starting now, but I don't want to fall into pitfalls that others have already encountered. For example, will allowing the '(' character cause problems in MySQL or Java or something else, and will the '<' character cause problems with any markup language? Is it so tough to avoid these problems? What about escaping/encoding them? I think it would be wishful thinking to hope for a universal way to encode strings so that the contents don't clash with ANY language/technology/markup.

+1  A: 

Restrictions on usernames are something only you, the designer, can ultimately control. What "restrictions" on usernames do you currently dislike? Character sets? Length? Ability for users to actually enter them?

All that really matters is that your users have a way to input the same information every time, and that you can validate that information reliably. If you want to accept obscure things like the tab character, or extended character sets, you just need to be able to store them properly and compare them to what is given in future login attempts.
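
As a rough illustration of "store them properly and compare them", here is a minimal Python sketch. The helper names (`canonical_username`, `same_username`) are made up for this example, and the choice of NFKC normalization plus case folding (and trimming surrounding whitespace) is an assumption about what "the same username" should mean, not something any particular service is known to do.

```python
import unicodedata


def canonical_username(raw: str) -> str:
    """Hypothetical helper: reduce a username to one form used for comparison."""
    # Assumption: surrounding whitespace is not significant.
    trimmed = raw.strip()
    # NFKC folds compatibility variants (full-width letters, composed vs.
    # decomposed accents) into one representation.
    normalized = unicodedata.normalize("NFKC", trimmed)
    # casefold() makes the comparison case-insensitive.
    return normalized.casefold()


def same_username(stored: str, entered: str) -> bool:
    """Compare a stored username with what the user typed at login."""
    return canonical_username(stored) == canonical_username(entered)
```

Storing the canonical form next to the display form would also let a uniqueness check catch look-alike registrations, if that is the behaviour you want.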

JYelton
+2  A: 

The restrictions on something like a username at Google likely have nothing to do with the skill of Google's coders; more likely they are a limitation imposed by all of the contexts in which that username might be used.

Will it be used in an email address? Will it be passed as part of a URL?

Both of these put limitations on it that are outside of Google's control.

Neil N
+2  A: 
  • No need to worry about accidentally outputting < or > unescaped even once (XSS).
  • No URL encoding needed.
  • Pretty URLs.
  • A plain word as a name is much easier to read than ^^*(super&cool)*^^ or something stupid like that.
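
For comparison, this is roughly the work you take on if you do allow such characters: every output context needs its own escaping. The sketch below uses Python's standard `html.escape` and `urllib.parse.quote`; the function names and the `/users/` route are hypothetical and only there for illustration.

```python
import html
from urllib.parse import quote


def profile_path(username: str) -> str:
    """Build a URL path for a user; "/users/" is a hypothetical route."""
    # safe="" percent-encodes "/" as well, so a name cannot add path segments.
    return "/users/" + quote(username, safe="")


def greeting_html(username: str) -> str:
    """Embed a username in HTML text content."""
    # html.escape replaces &, <, > and quote characters with entities.
    return "<p>Hello, " + html.escape(username) + "</p>"
```

Forgetting either call in a single place is enough to produce a broken link or an XSS hole, which is the risk the first bullet is about.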
Coronatus
A: 

I'd guess in cases like Google's, it's because they're going to be using your username in:

  • Email addresses: an escaped @ (\@) is technically legal but annoying to parse, and + is used by Gmail for magic
  • URLs: anything that isn't valid in a URL is easier to just disallow than to encode (ones I know for sure would be problems: &, /, :, and space; I'm sure there are others)
  • XML: same idea as URLs: <, >, &, etc.

In other cases, it's probably just because it's easier to say "only letters and numbers" than to think about all of the cases where it could be a problem.
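
A slightly less strict rule than "only letters and numbers" does not have to be much harder to state. Here is a minimal Python sketch of a whitelist that also accepts a dot or underscore between alphanumeric runs. The pattern, the 3 to 30 character limit, and the name `is_valid_username` are all assumptions chosen for the example, not any real service's policy.

```python
import re

# Assumed policy: ASCII letters and digits, optionally separated by a single
# "." or "_"; no leading, trailing, or doubled separators.
USERNAME_RE = re.compile(r"[A-Za-z0-9]+(?:[._][A-Za-z0-9]+)*")


def is_valid_username(name: str) -> bool:
    """Return True if the name fits the assumed whitelist policy."""
    # Check the length first (3-30 is an arbitrary illustrative limit).
    if not 3 <= len(name) <= 30:
        return False
    # fullmatch() requires the whole string to match, not just a prefix.
    return USERNAME_RE.fullmatch(name) is not None
```

Under this rule c.senthil_kumar is accepted, while names containing <, &, / or spaces are still rejected, so none of the email, URL, or XML concerns above come into play.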

Brendan Long