ansaurus

Question

regex for alphanumeric, but at least one character

Answer 1

+1 A:

^[0-9]*[A-Za-z][0-9A-Za-z]*$

is the regex that will do what you're after. The ^ and $ match the start and end of the string, so no other characters can appear before or after the match. You could replace the [0-9A-Za-z] block with \w, although \w also matches the underscore; I prefer the more verbose form because it's easier to extend with other characters if you want.
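As a quick way to try the pattern outside ASP.NET, here is a minimal Python sketch. The function name is_valid and the sample strings are illustrative assumptions, not part of the original answer.

```python
import re

# Pattern copied from the answer: optional digits, one letter,
# then any mix of digits and letters.
PATTERN = re.compile(r"^[0-9]*[A-Za-z][0-9A-Za-z]*$")


def is_valid(text: str) -> bool:
    """Return True if text is alphanumeric and contains at least one letter."""
    # fullmatch is used because in Python `$` can also match just
    # before a trailing newline.
    return PATTERN.fullmatch(text) is not None


# Hypothetical sample inputs, chosen only for illustration.
for sample in ["abc", "123abc", "a", "123", "", "abc_1", "ab c"]:
    print(repr(sample), is_valid(sample))
```

Only the first three samples should be accepted: the others are all digits, empty, or contain a character outside the alphanumeric set.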

Add a regular expression validator to your asp.net page as per the tutorial on MSDN: http://msdn.microsoft.com/en-us/library/ms998267.aspx.

Dexter 2009-06-27 02:29:33

this could contain all numbers, no?

akf 2009-06-27 02:31:51

Missed the extra bit about needing at least one character. I've edited to force at least one character.

Dexter 2009-06-27 02:34:46

If you're not capturing, you don't need () except for grouping, and [] makes a fine group. A-z is more than just letters. You probably meant [0-9A-Za-z]*[A-Za-z][0-9A-Za-z]*
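To see why A-z is more than just letters, this small Python sketch lists the characters that [A-z] accepts but [A-Za-z] does not. The variable name extras is an assumption made for the example.

```python
import re

# Walk the ASCII range from 'A' to 'z' and collect every character
# that [A-z] matches but that is not an ASCII letter.
extras = [
    chr(code)
    for code in range(ord("A"), ord("z") + 1)
    if re.fullmatch(r"[A-z]", chr(code))
    and not re.fullmatch(r"[A-Za-z]", chr(code))
]

print(extras)
```

The six extra characters are the ones that sit between Z and a in ASCII: [ \ ] ^ _ and the backtick.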

Roger Pate 2009-06-27 02:38:30

Thanks for the feedback, R. Pate.

Dexter 2009-06-27 02:47:50

Answer 2

+1 A:

^\w*[\p{L}]\w*$

This one's not that hard. The regular expression reads: match a line starting with any number of word characters (letters, digits, and connector punctuation such as the underscore, which you might not want), that contains one letter character (that's the [\p{L}] part in the middle), followed by any number of word characters again.

If you want to exclude punctuation, you'll need a heftier expression:

^[\p{L}\p{N}]*[\p{L}][\p{L}\p{N}]*$

And if you don't care about Unicode you can use a boring expression:

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$

Welbog 2009-06-27 02:31:00

You need ^ and $ swapped.

Roger Pate 2009-06-27 02:34:31

@R. Pate: Thanks. It wouldn't be code if it didn't have bugs.

Welbog 2009-06-27 02:35:51

Answer 3

+7 A:

^\d*[a-zA-Z][a-zA-Z0-9]*$

Basically this means:

Zero or more digits;
One alpha character;
Zero or more alphanumeric characters.

Try a few tests and you'll see that this accepts any alphanumeric string that contains at least one letter.

The key to this is the \d* at the front. Without it the regex gets much more awkward to write.
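One way to convince yourself that the \d* prefix loses nothing compared with the longer [0-9A-Za-z]*[A-Za-z][0-9A-Za-z]* form suggested in the comments is to compare the two on every short string over a small alphabet. This is a brute-force sketch in Python, not a proof; the names SHORT, LONG, ALPHABET and disagreements are assumptions made for the example, and re.ASCII is added so that \d means [0-9] only.

```python
import itertools
import re

# The pattern from this answer; re.ASCII restricts \d to 0-9 in Python.
SHORT = re.compile(r"^\d*[a-zA-Z][a-zA-Z0-9]*$", re.ASCII)
# The longer form with an alphanumeric prefix.
LONG = re.compile(r"^[0-9A-Za-z]*[A-Za-z][0-9A-Za-z]*$")

# A tiny test alphabet: two letters, two digits, one invalid character.
ALPHABET = "aZ09_"


def disagreements(max_len: int = 5) -> list:
    """Return every string up to max_len on which the two patterns differ."""
    found = []
    for length in range(max_len + 1):
        for chars in itertools.product(ALPHABET, repeat=length):
            text = "".join(chars)
            if bool(SHORT.fullmatch(text)) != bool(LONG.fullmatch(text)):
                found.append(text)
    return found


print(disagreements())
```

The list comes back empty because everything before the first letter of a valid string must be digits, which is exactly what \d* describes.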

cletus 2009-06-27 02:33:50

"One alphanumeric character" should read "One alphabetic character" or similar: that part of the regex does not include digits.

Roger Pate 2009-06-27 02:40:48

The \d* is clever!

John Kugelman 2009-06-27 02:44:00

@R.Pate: quite right. Fixed. Thanks.

cletus 2009-06-27 02:47:01

@John: not only clever, but efficient! The \d* avoids potential O(N**2) backtracking ... I think.

Stephen C 2009-07-29 05:09:51

Great solution, cletus. You can make the regex a little shorter if you use the case insensitive flag like so: /^\d*[a-z][a-z0-9]*$/i
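The /.../i literal above is JavaScript-style syntax. A rough Python equivalent, as a sketch, passes the flag separately. Adding re.ASCII is my own assumption here: the Python documentation notes that without it, [a-z] combined with IGNORECASE can also match a few non-ASCII letters.

```python
import re

# Case-insensitive version of the pattern. re.ASCII keeps [a-z] and \d
# limited to ASCII letters and digits.
PATTERN = re.compile(r"^\d*[a-z][a-z0-9]*$", re.IGNORECASE | re.ASCII)


def is_valid(text: str) -> bool:
    """Return True for alphanumeric text with at least one letter, any case."""
    return PATTERN.fullmatch(text) is not None


print(is_valid("ABC1"), is_valid("12x"), is_valid("42"))
```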

pr1001 2009-08-08 19:46:11

Answer 4

+3 A:

^[\p{L}\p{N}]*\p{L}[\p{L}\p{N}]*$

Explanation:

[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
\p{L} matches one letter
[\p{L}\p{N}]* matches zero or more Unicode letters or numbers
^ and $ anchor the string, ensuring the regex matches the entire string. You may be able to omit these, depending on which regex matching function you call.

Result: you can have any alphanumeric string except there's got to be a letter in there somewhere.

\p{L} is similar to [A-Za-z] except it will include all letters from all alphabets, with or without accents and diacritical marks. It is much more inclusive, using a larger set of Unicode characters. If you don't want that flexibility substitute [A-Za-z]. A similar remark applies to \p{N} which could be replaced by [0-9] if you want to keep it simple. See the MSDN page on character classes for more information.

The less fancy non-Unicode version would be

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$
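To get a feel for what \p{L} and \p{N} cover, here is a Python sketch. Python's built-in re module does not support \p{...} classes, so the sketch swaps the regex for a check on Unicode general categories through unicodedata.category, which expresses the same rule. The helper names are assumptions made for the example.

```python
import unicodedata


def is_letter(ch: str) -> bool:
    # \p{L} covers the general categories Lu, Ll, Lt, Lm and Lo.
    return unicodedata.category(ch).startswith("L")


def is_number(ch: str) -> bool:
    # \p{N} covers the general categories Nd, Nl and No.
    return unicodedata.category(ch).startswith("N")


def is_valid(text: str) -> bool:
    """Same rule as the Unicode regex: letters and numbers only, with a letter."""
    only_alnum = all(is_letter(c) or is_number(c) for c in text)
    has_letter = any(is_letter(c) for c in text)
    return only_alnum and has_letter


# Accented letter, circled digits with and without a letter, CJK text.
for sample in ["h\u00e9llo1", "\u2460\u2461", "\u2460a", "\u4e2d\u6587", "a_b"]:
    print(repr(sample), is_valid(sample))
```

Under this rule, circled digits such as U+2460 count as numbers and CJK ideographs count as letters, which is worth knowing before choosing the Unicode form.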

John Kugelman 2009-06-27 02:35:22

\p{N} would match Unicode numbers, such as ① and ②. This may come as a surprise for some.

J-16 SDiZ 2009-06-27 02:40:51

And \p{L} would match Chinese characters too. So think twice before using it.

J-16 SDiZ 2009-06-27 02:41:33

Turning the first part, [\p{L}\p{N}]*, into \p{N}* simplifies the explanation and prevents some backtracking.

Roger Pate 2009-06-27 02:43:20

Answer 5

A:

^[0-9]*[a-zA-Z][a-zA-Z0-9]*$

Can be

any number ended with a character,
or an alphanumeric expression started with a character
or an alphanumeric expression started with a number, followed by a character and ended with an alphanumeric subexpression

eKek0 2009-06-27 02:52:08

Isn't this the same as the top answer? Just with \d replaced with [0-9].

Evan Fosmark 2009-06-27 18:29:24

Well, yes. I didn't see that at the time of posting.

eKek0 2009-06-28 00:46:34

Answer 6

+1 A:

Most answers to this question are correct, but there's an alternative, that (in some cases) offers more flexibility if you want to change the rules later on:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]+)$

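This will match any sequence of alphanumerical characters, but only if the look-ahead at the start, (?=.*[a-zA-Z].*), also succeeds, meaning the sequence contains at least one letter. The look-ahead checks its condition without consuming any characters, so the second group still has to match the whole sequence. It's a little-known trick in regular expressions that allows you to handle some very difficult validation problems.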

For example, say you need to add another constraint: the string should be between 6 and 12 characters long. The obvious solutions posted here wouldn't work, but using the look-ahead trick, the regex simply becomes:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$
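A minimal Python sketch of the look-ahead version with the length constraint follows. The redundant trailing .* and the capturing parentheses are dropped, which does not change what is accepted; the function name and sample inputs are assumptions made for the example.

```python
import re

# Look-ahead asserts that a letter occurs somewhere; the main part then
# requires 6 to 12 alphanumeric characters in total.
PATTERN = re.compile(r"^(?=.*[a-zA-Z])[a-zA-Z0-9]{6,12}$")


def is_valid(text: str) -> bool:
    """Return True for 6-12 alphanumeric characters with at least one letter."""
    return PATTERN.fullmatch(text) is not None


# Hypothetical inputs: valid, digits only, too short, too long.
for sample in ["abc123", "123456", "abc12", "a" * 13]:
    print(repr(sample), is_valid(sample))
```

Because the two rules live in separate parts of the pattern, either one can be changed later without touching the other.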

Philippe Leybaert 2009-06-28 19:35:09

What purpose does the second .* have?

Peter Boughton 2009-06-28 19:40:48

None :-) It can be safely omitted.

Philippe Leybaert 2009-06-28 19:58:50
