ansaurus

Question

Regular Expression: Allow letters, numbers, and spaces (with at least one letter or number)

Answer 1

+5 A:

^[ _]*[A-Z0-9][A-Z0-9 _]*$

You can optionally have some spaces or underscores up front, then you need one letter or number, and then an arbitrary number of numbers, letters, spaces or underscores after that.

Something that contains only spaces and underscores will fail the [A-Z0-9] portion.
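A quick way to check the pattern above against a few inputs. This is only a sketch: the variable names and sample strings are made up for illustration, and it assumes uppercase-only input, as the pattern does.

```javascript
// Pattern from the answer: optional leading spaces/underscores,
// one required letter or digit, then any mix of the allowed characters.
const atLeastOneAlnum = /^[ _]*[A-Z0-9][A-Z0-9 _]*$/;

// Hypothetical sample inputs, chosen for illustration only.
const sampleInputs = ["A", "9", "_999_888", "AAA_SS S", " ___ ", "", "makeee"];

// Pair each input with the result of testing it against the pattern.
const sampleResults = sampleInputs.map((s) => [s, atLeastOneAlnum.test(s)]);

for (const [input, ok] of sampleResults) {
  console.log(JSON.stringify(input), ok);
}
```

The first four inputs match. The string of only spaces and underscores, the empty string, and the lowercase "makeee" do not.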

Daniel LeCheminant 2009-02-23 00:46:03

Doesn't work on strings like 'A', '9', 'AA', 'AA', '99','_999_888', 'AAA_SS S'

Renaud Bompuis 2009-02-23 01:35:06

I don't believe you have tried this, Renaud, otherwise you would know it was false.

paxdiablo 2009-02-23 01:37:35

I have, and you just need to look at the regex (the current anyway) to see that it would match a simple string like 'A'

Renaud Bompuis 2009-02-23 01:40:41

That matches 'A' fine, Renaud. * matches 0 or more, meaning that the 'A' would go to the [A-Z0-9] part, and the other two would just be 0.

Chris Lutz 2009-02-23 01:43:18

And why exactly wouldn't Daniel's (or mine for that matter) match it? Do you not know what the "*" means in REs?

paxdiablo 2009-02-23 01:43:39

Sorry, I misread the question that it should require both a letter and a digit and I was wrong.

Renaud Bompuis 2009-02-23 01:51:23

Answer 2

+7 A:

You simply need to specify your current RE, followed by a letter/number followed by your current RE again:

^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$

Since you've now stated they're Javascript REs, there's a useful site here where you can test the RE against input data.

If you want lowercase letters as well:

^[A-Za-z0-9 _]*[A-Za-z0-9][A-Za-z0-9 _]*$
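In JavaScript, the same effect can probably be had by keeping the uppercase pattern and adding the `i` (case-insensitive) flag. A minimal sketch, where `isValidName` is a hypothetical helper name and `RegExp.prototype.test` is used because only a yes/no answer is needed:

```javascript
// Uppercase-only pattern from the answer.
const upperOnly = /^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$/;

// Same pattern with the `i` flag instead of spelling out a-z.
const anyCase = /^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$/i;

// Hypothetical helper: true when the name has only letters, digits,
// spaces and underscores, and at least one letter or digit.
function isValidName(name) {
  return anyCase.test(name);
}

console.log(upperOnly.test("makeee")); // false: lowercase is not in [A-Z0-9 _]
console.log(isValidName("makeee"));    // true
console.log(isValidName(" ___ ___"));  // false: no letter or digit
```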

paxdiablo 2009-02-23 00:48:01

On a string like " ___ ___ ___", one that contains none of the required numbers or letters, the regex you tried will try many combinations that cannot work. I think it will try about n combinations. Daniel L's answer works better.

TokenMacGuy 2009-02-23 01:12:33

That would be a pretty unsophisticated RE engine, @token. Most of the ones I've seen have optimizations to look for specific values first, such as ^, $ and [A-Z0-9]. Backtracking searches only become necessary after all these other conditions are satisfied. Not satisfied means no match.

paxdiablo 2009-02-23 01:27:56

Doesn't work on strings like 'AA', '99', '_999_888', ...

Renaud Bompuis 2009-02-23 01:28:19

Ok, working fine in PHP, but in JS everything is failing the regex. Is there a mistake in my JS here? if (!name.match(/^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$/)){ //do something }

makeee 2009-02-23 01:28:47

In addition, it's easy to come up with a test string that gives an RE worst case performance. If speed is the issue, you would not be using REs for this at all - you'd use a simple character scanner.

paxdiablo 2009-02-23 01:29:31
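For reference, the "simple character scanner" mentioned in the comment above might look something like this. It is a sketch, not anyone's posted code: the function name is hypothetical, and it assumes ASCII letters in both cases should count.

```javascript
// Hypothetical helper: one left-to-right pass, no regex involved.
// Returns true when every character is a letter, digit, space or
// underscore, and at least one of them is a letter or digit.
function hasOnlyAllowedCharsAndOneAlnum(text) {
  let sawAlnum = false;
  for (let i = 0; i < text.length; i++) {
    const c = text.charCodeAt(i);
    const isDigit = c >= 48 && c <= 57;  // 0-9
    const isUpper = c >= 65 && c <= 90;  // A-Z
    const isLower = c >= 97 && c <= 122; // a-z (assumed to be allowed)
    if (isDigit || isUpper || isLower) {
      sawAlnum = true;
    } else if (c !== 32 && c !== 95) {
      // Anything other than a space (32) or underscore (95) is rejected.
      return false;
    }
  }
  return sawAlnum;
}

console.log(hasOnlyAllowedCharsAndOneAlnum("AAA_SS S"));     // true
console.log(hasOnlyAllowedCharsAndOneAlnum(" ___ ___ ___")); // false
```

Each character is visited exactly once, so the running time is linear in the input length whatever the input looks like.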

See http://www.regular-expressions.info/javascriptexample.html for a tester you can use for JS.

paxdiablo 2009-02-23 01:32:24

Sorry, I misread the question that it should require both a letter and a digit and I was wrong.

Renaud Bompuis 2009-02-23 01:55:34

[A-Z0-9]*[A-Z0-9][A-Z0-9]* doesn't seem to work in JS (tried it in the regex tester linked above). "makeee" passed [A-Z0-9] just fine, but not the full regex pattern. Any ideas?

makeee 2009-02-23 02:10:36

1) There are no spaces or underscores (or ^/$ either) in that RE you just posted. What is the EXACT pattern and search string you're using?

paxdiablo 2009-02-23 02:12:19

Regexp of ^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$ and Subject string of AA works fine. makeee as a subject string fails because it's lowercase. If you want lowercase, see update.

paxdiablo 2009-02-23 02:15:01

Sorry, "makeee" fails both ^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$ and ^[A-Z0-9]*[A-Z0-9][A-Z0-9]*$ (also fails if ^/$ are removed).

makeee 2009-02-23 02:16:51

ah ok, thanks Pax

makeee 2009-02-23 02:17:22

Also, if you're just wanting to find out if there's a match (rather than getting the matches), I'd use [if (name.search(/^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$/) >= 0){ //found one }].

paxdiablo 2009-02-23 02:21:35

Answer 3

+3 A:

You can use a lookaround:

^(?=.*[A-Za-z0-9])[A-Za-z0-9 _]*$

It will check ahead that the string has a letter or number, if it does it will check that the rest of the chars meet your requirements. This can probably be improved upon, but it seems to work with my tests.
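A small JavaScript check of the lookahead version, as a sketch only (the variable name and sample strings are invented here):

```javascript
// The lookahead asserts that a letter or digit exists somewhere;
// the character class then validates every character in the string.
const withLookahead = /^(?=.*[A-Za-z0-9])[A-Za-z0-9 _]*$/;

console.log(withLookahead.test("AAA_SS S")); // true
console.log(withLookahead.test(" ___ "));    // false: lookahead finds no letter or digit
console.log(withLookahead.test("A-B"));      // false: "-" is not an allowed character
console.log(withLookahead.test(""));         // false: lookahead fails on the empty string
```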

UPDATE:

Adding modifications suggested by Chris Lutz:

^(?=.*[^\W_])[\w ]*$

gpojd 2009-02-23 01:08:47

You shouldn't use \s. He said he wants spaces, but didn't mention tabs. But while we're at it, why not use [^\W_] instead of [A-Za-z0-9]?

Chris Lutz 2009-02-23 01:25:24

Thanks. I replaced the part of the regex with something more explicit. Thanks for catching that.

gpojd 2009-02-23 01:28:20

Doesn't work on strings like 'A', '9', 'AA', 'AA', '99','_999_888', 'AAA_SS S'

Renaud Bompuis 2009-02-23 01:36:29

Renaud, it does in perl. What are you checking this with? perl -le '("A" =~ /(?=^.*[A-Za-z0-9])[A-Za-z0-9 _]*$/) ? print "match" : print "no match"'

gpojd 2009-02-23 01:41:15

Sorry, I misread the question that it should require both a letter and a digit and I was wrong.

Renaud Bompuis 2009-02-23 01:53:03

I would put the ^ anchor before the lookahead. It doesn't change the meaning, but it communicates your intention more clearly.

Alan Moore 2009-02-23 02:46:21

I agree Alan. The anchor ended up there when I was toying with it and I never put it back. I moved the anchor to the front.

gpojd 2009-02-23 03:01:45

Answer 4

+4 A:

To go ahead and get a point out there, instead of repeatedly using these:

[A-Za-z0-9 _]
[A-Za-z0-9]

I have two (hopefully better) replacements for those two:

[\w ]
[^\W_]

The first one matches any word character (alphanumeric and _, as well as Unicode) and the space. The second matches anything that isn't a non-word character or an underscore (alphanumeric only, as well as Unicode).

If you don't want Unicode matching, then stick with the other answers. But these just look easier on the eyes (in my opinion). Taking the "preferred" answer as of this writing and using the shorter regexes gives us:

^[\w ]*[^\W_][\w ]*$

Perhaps more readable, perhaps less. Certainly shorter. Your choice.
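In JavaScript specifically, \w is ASCII-only ([A-Za-z0-9_]), so the short and long forms should accept exactly the same strings there. A sketch that compares the two on a handful of made-up inputs (all names are illustrative):

```javascript
// Long form from the earlier answer and the shorthand form from this one.
const explicitForm = /^[A-Za-z0-9 _]*[A-Za-z0-9][A-Za-z0-9 _]*$/;
const shorthandForm = /^[\w ]*[^\W_][\w ]*$/;

// Hypothetical inputs; "\u00e0" is the character "à".
const comparisonInputs = ["A", "_999_888", "AAA_SS S", " ___ ", "", "a-b", "\u00e0"];

// True when both patterns give the same answer for every input.
const formsAgree = comparisonInputs.every(
  (s) => explicitForm.test(s) === shorthandForm.test(s)
);

console.log(formsAgree);                   // true
console.log(shorthandForm.test("\u00e0")); // false: \w does not cover "à" in JavaScript
```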

EDIT:

Just as a note, I am assuming Perl-style regexes here. Your regex engine may or may not support things like \w and \W.

EDIT 2:

Tested mine with the JS regex tester that someone linked to and some basic examples worked fine. Didn't do anything extensive, just wanted to make sure that \w and \W worked fine in JS.

EDIT 3:

Having tried to test some Unicode with the JS regex tester site, I've discovered the problem: that page uses ISO instead of Unicode. No wonder my Japanese input didn't match. Oh well, that shouldn't be difficult to fix:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Or so. I don't know what should be done as far as JavaScript, but I'm sure it's not hard.

Chris Lutz 2009-02-23 01:32:52

+1 for the good suggestions. I updated my answer to show it your way.

gpojd 2009-02-23 01:53:32

Thanks! What range of characters does unicode cover? I would love to be able to support characters such as "à".

makeee 2009-02-23 01:59:36

Unicode covers basically everything, although you may have to do some more work to get webpages and programs to work with Unicode.

Chris Lutz 2009-02-23 02:14:24

Whether \w matches Unicode (by which I assume you mean non-ASCII) characters varies from one regex flavor to the next. If you want to match characters from the full Unicode range, you should do so explicitly.

Alan Moore 2009-02-23 02:35:06
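One way to match non-ASCII letters explicitly in JavaScript is Unicode property escapes. Note the assumption: this needs an engine that supports `\p{...}` with the `u` flag (ES2018), which is newer than this thread, so treat it as a sketch rather than something the posters had available. The variable name is illustrative.

```javascript
// \p{L} matches any Unicode letter, \p{N} any Unicode number.
// The `u` flag is required for \p{...} to be recognized.
const unicodeAlnum = /^[\p{L}\p{N} _]*[\p{L}\p{N}][\p{L}\p{N} _]*$/u;

console.log(unicodeAlnum.test("\u00e0 la carte")); // true: "à" counts as a letter
console.log(unicodeAlnum.test("日本語 123"));        // true
console.log(unicodeAlnum.test(" ___ "));           // false: no letter or number
```

Combining marks are not covered by \p{L}, so decomposed accented text may need \p{M} added to the classes.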

@Alan - Okay. I think in terms of Perl, which is almost a de-facto standard against which other regex engines are measured, and I tend to expect PCRE-specific regex behaviors to work the way they do in Perl.

Chris Lutz 2009-02-23 02:42:34

Unfortunately, regex flavors are all over the place on this one. Check out this table; it's almost exactly half and half (with both PHP and JS in the wrong half). http://www.regular-expressions.info/charclass.html

Alan Moore 2009-02-23 02:56:01

Wow, Perl doesn't use PCRE. That's weird. Ah, well, PCRE will have to implement Unicode soon. It's almost impossible not to at this point.

Chris Lutz 2009-02-23 03:07:14

PCRE is a C library that, like many other flavors, used Perl's regex flavor as its model--in other words, Perl came first. PCRE supports Unicode if that option was selected when it was compiled, but \w, \d and such still only match ASCII characters.

Alan Moore 2009-02-23 03:45:59

Answer 5

A:

Someone intent on code injection would turn off JavaScript in their browser before injecting.

daniel 2010-04-03 22:28:38

Answer 6

A:

For me, @"^[\w ]+$" is working. It allows numbers, letters, underscores and spaces, and requires at least one character to be typed, although a string made up only of spaces or underscores also passes.

SOFextreme 2010-09-28 04:12:48
