  • What restrictions should I impose on usernames? why?
  • What restrictions should I not impose on usernames? why?

P.S. DB access is via best-practice PDO, so there's no risk of SQL injection.

Thanks

+4  A: 

I've never seen the point in adding restrictions to usernames. If your code is resistant to SQL injection attacks, then let them put in anything they want.

The only restriction I'd add is a maximum length, so that the name can be stored in a DB table.

Let them use any Unicode character in their username. Restricting the allowed characters will probably just annoy people who write in a non-ASCII language.

Glen
So I should let users register with username admin for example? :) exactly the point behind this question :D
Chris
Why not? Just because the username is "admin" doesn't make them an administrator. Stack Overflow, for example, has 2 users called "admin" and 3 called "Admin". If you have some usernames that you don't want used, don't create a code restriction; just create the users yourself and you're done. It's a quick and simple fix.
Glen
It's more to avoid confusion and mistakes. Users shouldn't be misled into thinking that a non-admin is an admin, or they might direct inappropriate requests to him, like resetting passwords, etc. Social engineering is part of overall security, just my 2 cents.
Chris
+1 though by the way, for the db size point, good point indeed :)
Chris
@Chris, one thing you could do is copy Stack Overflow. Admins get an extra symbol next to their username.
Glen
Hmm, good point, could possibly avoid the confusion.
Chris
+5  A: 

It depends on many things. For instance, if users are going to have their own URL, you want to be careful that someone who creates the username "%41llan" doesn't clash with the user called "Allan". Allowing a forward slash may also cause problems. Look out for those sorts of constraints.

Artelius
Good point with "%41llan". Users don't need their own URLs here, but I'm wondering: how do you fix that?
Chris
%41llan's URL would be correctly encoded as %2541llan. It's just another escaping issue, same as SQL or HTML injection.
bobince
Depends on how you choose to treat it. I'm working on a site where I didn't want "ugly" URLs, so I simply stripped out characters like that. That opens up the problem of two users with the same URL so you need to check that new users don't clash with existing ones.
Artelius
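
To illustrate the escaping point in the comments above, here is a minimal Python sketch. The "/users/" path prefix and the profile_path helper are made up for the example; the only real API used is urllib.parse.quote from the standard library.

```python
from urllib.parse import quote, unquote

def profile_path(username: str) -> str:
    # safe="" makes quote() escape "/" too, so a slash in a name
    # cannot introduce an extra path segment.
    return "/users/" + quote(username, safe="")

print(profile_path("%41llan"))  # /users/%2541llan
print(profile_path("Allan"))    # /users/Allan
print(profile_path("a/b"))      # /users/a%2Fb

# Decoding once gives back the literal name, not "Allan".
print(unquote("%2541llan"))     # %41llan
```

If you strip characters instead of encoding them, as described in the last comment, the uniqueness check has to run on the stripped form, not on the raw username.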
+2  A: 

SQL injection protection is a must, but that should probably be in your code, not in username restrictions. Certain characters should definitely be escaped, like \, %, etc.

It will depend on what kind of site you're running, but I think some obscene word restrictions would make your site look more professional no matter what. If someone sees that people are allowed to go around with "EXPLETIVE" as their username, your site will look childish. It's like allowing teenagers to run rampant in your book store, IMHO. You probably don't need to get much more picky than that, although it's completely up to you.

This is slightly off topic, but as another piece of username advice, a great feature of any website is allowing users to change their username over time. You can just have a number as the primary key, and allowing this can save a lot of whining and people creating new accounts because they wanted to change their username. :D

CrazyJugglerDrummer
Any automated bad word filter is doomed to failure. You may actually create the problem you're trying to prevent, as people make a game of circumventing the filter. As a general rule, blacklists never work.
rmeador
I have to admit that bad words were one of the things I was after. The audience isn't going to be large, and I'm trying to keep it clean without being overzealous.
Chris
One fix for this would be to allow the registration, but flag it for a moderator to review. This would allow a user to register "scunthorpe" or the disliked "expertsexchange", but still allow you to spot dodgy names. In fact, even easier would be to do a periodic (daily) trawl through the usernames in the DB and spot any you don't like.
Glen
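
A rough sketch of the "allow it, but flag it" idea from the comment above, in Python. The WATCHLIST contents and the needs_review name are placeholders invented for this example. The point is only that a substring match decides who gets reviewed, not who gets rejected.

```python
# Hypothetical, deliberately tiny watchlist, for illustration only.
WATCHLIST = {"sex"}

def needs_review(username: str) -> bool:
    # A substring match has false positives ("expertsexchange"), which
    # is why a hit only queues the name for a human instead of
    # blocking the registration.
    lowered = username.casefold()
    return any(term in lowered for term in WATCHLIST)

for name in ["expertsexchange", "Chris"]:
    print(name, "-> flag for moderator" if needs_review(name) else "-> ok")
```

The same function could be run from a daily job over the existing usernames instead of at registration time.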
+1 for the "experts exchange", took me a while to get the pun, shows how slow I am at this time of day
Chris
Expletives are best dealt with by post-moderation.
UpTheCreek
+9  A: 

OK, so let's assume you're doing all your string-encoding tasks right. You've not got any SQL injections, HTML injections, or places where you're not URL-encoding something you should. So we don't need to worry about characters like "<&%\ being magic in some contexts. And you're using UTF-8 for everything so all of Unicode is in play. What other reasons are there to limit usernames?

To start with, exclude all control characters, for sanity. There is no reason to have characters U+0000 to U+001F or U+007F to U+009F in a username.
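
As a minimal sketch of that check in Python (the has_control_chars name is just for illustration), a single character class covers both ranges:

```python
import re

# C0 controls U+0000-U+001F, plus DEL and the C1 controls U+007F-U+009F.
CONTROL_CHARS = re.compile(r"[\u0000-\u001f\u007f-\u009f]")

def has_control_chars(username: str) -> bool:
    return CONTROL_CHARS.search(username) is not None

print(has_control_chars("bob\x00"))  # True
print(has_control_chars("Zoë"))      # False
```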

Next, deny or normalise unexpected whitespace. You may want to allow a space in a username, but you almost certainly don't want to allow leading spaces, trailing spaces, or more than one space in a row. They may render the same in HTML, but are probably a user error that will confuse.
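
One possible way to do the "normalise" option, sketched in Python. The function names are made up; whether you silently normalise or reject and ask the user to retype is a design choice this sketch does not settle.

```python
def normalise_spaces(username: str) -> str:
    # split() with no arguments splits on runs of whitespace and drops
    # leading/trailing whitespace; joining with " " collapses each run.
    return " ".join(username.split())

def has_unexpected_whitespace(username: str) -> bool:
    # The "deny" variant: reject anything normalisation would change.
    return username != normalise_spaces(username)

print(repr(normalise_spaces("  Chris   Smith ")))  # 'Chris Smith'
print(has_unexpected_whitespace("Chris Smith"))     # False
```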

If you intend to allow that username to be used to log in through HTTP Basic Authentication, you must disallow the : character, because the Basic Auth scheme encodes a 'username:password' pair with no escaping, so a colon inside the username or password makes the split ambiguous. At least one of the username and password must have the colon excluded, and it's better that that's the username, because restricting people's choice of passwords is a much worse thing than restricting usernames.
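
To see why the colon is a problem, here is a small Python sketch that builds and parses a Basic credential by hand. The helper names are invented, and the parser assumes the server splits at the first colon, which is the usual convention but still an assumption about your server.

```python
import base64
from typing import Tuple

def basic_auth_header(username: str, password: str) -> str:
    pair = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")

def parse_basic_auth(header: str) -> Tuple[str, str]:
    raw = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    # Assumed server behaviour: split at the FIRST colon only.
    username, _, password = raw.partition(":")
    return username, password

# The user "a:b" with password "secret" comes out as user "a".
print(parse_basic_auth(basic_auth_header("a:b", "secret")))  # ('a', 'b:secret')
```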

For Basic Authentication you may also want to disable all non-ASCII characters, as they are handled differently by different browsers. IE encodes them using the system codepage; Firefox encodes them using ISO-8859-1; Opera encodes them using UTF-8. Users should at least be warned before choosing non-ASCII names if HTTP Auth is going to be available, as actually using them will be very unreliable.
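
The encoding mismatch is easy to reproduce in Python: the same name yields different bytes depending on which charset the client picks, so the server cannot compare them reliably. The ascii_only helper below is a hypothetical check you might use to decide when to show a warning.

```python
name = "Zoë"

# Same username, two of the encodings mentioned above.
print(name.encode("iso-8859-1"))  # b'Zo\xeb'
print(name.encode("utf-8"))       # b'Zo\xc3\xab'

def ascii_only(username: str) -> bool:
    # str.isascii() is available from Python 3.7.
    return username.isascii()

print(ascii_only("Zoe"))  # True
print(ascii_only(name))   # False
```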

Next, consider other Unicode control sequences: things like the bidi overrides and other characters that are unsuitable for use in markup. You are probably going to end up putting usernames in markup, and you don't want someone with an RLO in their name to turn a load of the text in your page backwards.
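
A sketch of a bidi-control check in Python. The exact set below (the directional marks, embeddings, overrides and isolates) is my own selection for illustration, not a complete list of characters unsuitable for markup, so treat it as a starting point.

```python
# LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI
BIDI_CONTROLS = set(
    "\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)

def has_bidi_controls(username: str) -> bool:
    return any(ch in BIDI_CONTROLS for ch in username)

print(has_bidi_controls("abc\u202edef"))  # True (contains RLO)
print(has_bidi_controls("Chris"))         # False
```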

Also, if you allow Unicode, do normalisation on the strings you get. Otherwise someone may have a username with a composed o-umlaut character ö, and wonder why they can't log in on a Mac, which by default would use a separate o character followed by a combining umlaut. It's usual to normalise to the composed form NFC on the web. You may also want to do compatibility decompositions by using the form NFKC; this would allow a user Chris to log in from a Japanese keyboard in fullwidth romaji mode typing Ｃｈｒｉｓ. These are general issues it is good to solve for all your webapp's input, but for identifiers like usernames it can be more critical to get right.
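
Both cases can be tried out with Python's standard unicodedata module. The canonical_username wrapper is a made-up name; the normalize call is the real API.

```python
import unicodedata

def canonical_username(username: str, compat: bool = True) -> str:
    # NFKC = compatibility decomposition, then canonical composition.
    return unicodedata.normalize("NFKC" if compat else "NFC", username)

decomposed = "o\u0308"  # "o" followed by a combining diaeresis
composed = "\u00f6"     # precomposed "ö"
print(decomposed == composed)                                   # False
print(canonical_username(decomposed, compat=False) == composed) # True

fullwidth = "\uff23\uff48\uff52\uff49\uff53"  # fullwidth "Chris"
print(canonical_username(fullwidth))  # Chris
```

Whichever form you pick, apply it both at registration and at login, so the stored and the typed name go through the same transformation.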

Finally, make sure the length is OK to fit in the database without a silent truncation changing the name, especially if you are storing as UTF-8 bytes which you don't want to get snipped halfway through a byte sequence. Username truncations can also be a security issue in general.
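
A small Python sketch of the length check, assuming a byte-limited column. MAX_BYTES = 64 is an arbitrary figure for the example, not a recommendation. It also shows what a byte-level cut does to a multi-byte character.

```python
MAX_BYTES = 64  # hypothetical column size, in bytes

def fits_column(username: str, max_bytes: int = MAX_BYTES) -> bool:
    # Measure the encoded size, not the number of characters.
    return len(username.encode("utf-8")) <= max_bytes

long_name = "ö" * 40
print(len(long_name), fits_column(long_name))  # 40 False (80 bytes)

raw = "aöö".encode("utf-8")  # 5 bytes: 1 + 2 + 2
try:
    raw[:4].decode("utf-8")  # the cut lands inside the second "ö"
except UnicodeDecodeError:
    print("truncated in the middle of a character")
```

Rejecting an over-long name at registration avoids both the broken byte sequence and the case where two different names truncate to the same stored value.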

If you are using usernames as a unique means of identification, you have much more to worry about: the already-mentioned problem of lookalikes such as Сhris (with a Cyrillic Es С). There are too many of these for you to handle reasonably; either restrict to ASCII or have an additional means of identifying users. (Or don't care, like SO doesn't; when I can easily call myself Chris anyway I have no need to call myself С-hris.)
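
To make the lookalike problem concrete, here is a short Python sketch. It only demonstrates that normalisation does not help with this case, and shows the blunt "restrict to ASCII" option; it is not a confusable detector.

```python
import unicodedata

fake = "\u0421hris"  # starts with Cyrillic Es, not Latin C
real = "Chris"

print(fake == real)               # False, although they render alike
print(unicodedata.name(fake[0]))  # CYRILLIC CAPITAL LETTER ES
print(unicodedata.name(real[0]))  # LATIN CAPITAL LETTER C

# Compatibility normalisation leaves the Cyrillic letter alone.
print(unicodedata.normalize("NFKC", fake) == real)  # False

# The ASCII-only restriction rejects the lookalike outright.
print(fake.isascii(), real.isascii())  # False True
```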

bobince
Thank you for the epic post :) It confirmed my suspicion that restrictions have a valid reason.
Chris
