ansaurus

Question

Regular expression to validate a field: the field must contain at least 2 alphanumeric characters

Answer 1

+5 A:

[a-zA-Z0-9].*[a-zA-Z0-9]

The easy way: at least two alphanumerics anywhere in the string.
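As a rough sketch of how this pattern could be used for validation, assuming Python's `re` module (the original question does not name a language, and `has_two_alnum` is a hypothetical helper name). Note that in Python `.` does not match a newline unless `re.DOTALL` is set, so two alphanumerics separated by a line break would not be found by this version.

```python
import re

# Pattern from the answer: two ASCII alphanumerics with anything in between.
TWO_ALNUM = re.compile(r"[a-zA-Z0-9].*[a-zA-Z0-9]")


def has_two_alnum(field: str) -> bool:
    """Return True if the field contains at least two ASCII alphanumerics."""
    # search() scans the whole string, so no ^ or $ anchors are needed.
    return TWO_ALNUM.search(field) is not None


print(has_two_alnum("a-b"))    # True
print(has_two_alnum("--a--"))  # False, only one alphanumeric
```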

Answer to comments
I never did (nor intended to do) any benchmarking. Therefore - and given that we know nothing of OP's environment - I am not one to judge whether a non-greedy version ([a-zA-Z0-9].*?[a-zA-Z0-9]) will be more efficient. I do believe, however, that the performance impact is totally negligible :)

jensgram 2010-01-20 08:46:38

Better to use `.*?`: this will keep the regex from backtracking, and is probably faster (it will find the minimal substring instead of the maximal one).

Kobi 2010-01-20 08:52:31

Thank you very much, it's working fine. I want to learn how to write regular expressions, so if you have any links or books, could you forward them? Thank you.

VinnaKanna 2010-01-20 08:56:03

@Kobi Thank you. Edited my answer. @OP http://www.regular-expressions.info/

jensgram 2010-01-20 08:56:52

@OP: Please mark the answer as accepted if it solves your problem (as it seems to do). You should get off your 0% accept rate or pretty soon people will stop answering questions...

Tim Pietzcker 2010-01-20 09:36:23

Martin Fowler wrote, "If you make an optimization and don't measure to confirm the performance increase, all you know for certain is that you've made your code harder to read." People have begun to repeat the folk myth that `.*?` avoids backtracking, which is true in a sense, but it goes through the same process in the other direction, which you might call forwardtracking.

Greg Bacon 2010-01-20 14:16:54

@Kobi, you clearly don't understand how an NFA regex engine works; in this case lazy will be slower for almost every possible target string.

Paul Creasey 2010-01-20 16:08:46

@Paul - That's very possible. But my tests show you're wrong here: in the general case the greedy version is much slower. I'll be happy to hear your explanation though, I'm eager to learn.

Kobi 2010-01-20 21:22:51

Answer 2

A:

As simple as

'\w.*\w'

Paul Creasey 2010-01-20 08:46:53

`\w` includes the underscore which is ... not quite alphanumeric.

Joey 2010-01-20 08:48:05

Yup. Although `\w` may match `_` (depending on the implementation, I assume).

jensgram 2010-01-20 08:48:36

Hmm, at least I'm not the only one to see this :)

jensgram 2010-01-20 08:49:05

Answer 3

+2 A:

I would probably use this regular expression:

[a-zA-Z0-9][^a-zA-Z0-9]*[a-zA-Z0-9]

Gumbo 2010-01-20 09:02:34

It would fail for this: ab(

Aadith 2010-01-20 09:04:27

@Aadith: No, it definitely wouldn’t. The non-alphanumeric expression in the middle is quantified with zero or more (`*`).

Gumbo 2010-01-20 09:10:07

The first [a-zA-Z0-9] would match 'a'. [^a-zA-Z0-9]* would match nothing. The second [a-zA-Z0-9] would match 'b'. What about anything following the second alphanumeric character (which is '(' in the example I gave in my previous comment)?

Aadith 2010-01-20 09:34:47

@Aadith: Since there is no assertion about the start and the end of the string (marked with `^` and `$` respectively), the match can be at any position in the string.

Gumbo 2010-01-20 09:41:32

@Gumbo Exactly, that is precisely what I am saying. There could be more characters following the second alphanumeric character, which this particular regex does not take into consideration.

Aadith 2010-01-20 11:18:31

@Aadith: Have you heard of the difference between matching and searching?

Tim Pietzcker 2010-01-20 11:24:10

@Aadith: The asker already used a regular expression with `^` and `$`. So I supposed that he isn’t using a language that requires regular expressions to describe the whole string (like XML Schema) but one that allows regular expressions to describe just parts of the string.

Gumbo 2010-01-20 11:46:36

@Tim: No. Can you tell me what the difference is?

Aadith 2010-01-20 15:03:54

@Aadith: Matching checks whether a regex matches a string **entirely**; searching checks whether a regex matches a **part of** a string (`re.search()` in Python). Note that Python's `re.match()` anchors only the start of the string, so it is `re.fullmatch()` that requires the whole string to match. In some languages there is only a search command; in that case, you can force a match of the entire string by surrounding the regex with `^` and `$` or (better) `\A` and `\Z`.
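The distinction can be illustrated with a small sketch in Python, using the pattern from this answer and the `ab(` example from the comments. This is only an illustration under the assumption that Python's `re` module is in use; other regex APIs name these operations differently.

```python
import re

# Pattern from this answer: two alphanumerics separated only by non-alphanumerics.
pattern = re.compile(r"[a-zA-Z0-9][^a-zA-Z0-9]*[a-zA-Z0-9]")

text = "ab("

# Searching: the regex may match any part of the string.
print(bool(pattern.search(text)))     # True, "ab" is found inside the string

# Whole-string matching: every character must be covered by the regex.
print(bool(pattern.fullmatch(text)))  # False, the trailing "(" is left over

# match() only anchors the start of the string, not the end.
print(bool(pattern.match(text)))      # True
```

For a "contains at least two alphanumerics" check, searching is the operation that fits, which is why the unanchored pattern accepts `ab(`.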

Tim Pietzcker 2010-01-20 17:53:15

Answer 4

A:

How broad is your definition of alphanumeric? For US ASCII, see the answers above. For a more cosmopolitan view, use one of

[[:alnum:]].*[[:alnum:]]

or

[^\W_].*[^\W_]

The latter works because \w matches a "word character", that is, alphanumerics and underscore. Use a double-negative to exclude the underscore: "not not-a-word-character and not underscore."
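A minimal sketch of the difference, assuming Python 3, where `\w` is Unicode-aware for `str` patterns by default. Python's `re` module does not support POSIX bracket classes such as `[[:alnum:]]`, so only the `[^\W_]` form is shown; the variable names are illustrative.

```python
import re

# ASCII-only version from the earlier answers.
ascii_only = re.compile(r"[a-zA-Z0-9].*[a-zA-Z0-9]")

# "Not a non-word character and not underscore": any letter or digit.
unicode_aware = re.compile(r"[^\W_].*[^\W_]")

print(bool(ascii_only.search("éß")))      # False, no ASCII alphanumerics
print(bool(unicode_aware.search("éß")))   # True, both are letters
print(bool(unicode_aware.search("_a_")))  # False, underscores do not count
```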

Greg Bacon 2010-01-20 14:24:07

Answer 5

A:

In response to a comment, here's a performance comparison for the greedy [a-zA-Z0-9].*[a-zA-Z0-9] and the non-greedy [a-zA-Z0-9].*?[a-zA-Z0-9].

The greedy version will find the first alphanumeric, match all the way to the end, and backtrack to the last alphanumeric, finding the longest possible match. For a long string, it is the slowest version. The non-greedy version finds the first alphanumeric, and tries not to match the following symbols until another alphanumeric is found (that is, for every character it matches the empty string, tries to match [a-zA-Z0-9], fails, and matches .).
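To make the difference in what each version matches concrete, here is a small sketch in Python (an assumption; the linked test runs in a browser). The sample string is made up for illustration.

```python
import re

text = "a--b--c"

# Greedy: .* runs to the end of the string, then backtracks to the last alphanumeric.
greedy = re.search(r"[a-zA-Z0-9].*[a-zA-Z0-9]", text)

# Non-greedy: .*? expands one character at a time until an alphanumeric follows.
lazy = re.search(r"[a-zA-Z0-9].*?[a-zA-Z0-9]", text)

print(greedy.group())  # a--b--c
print(lazy.group())    # a--b
```

Both versions accept and reject exactly the same strings; they differ only in which substring is reported and in how much work the engine does to find it.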

Benchmarking (empirical results):
When the alphanumerics are very far apart, the greedy version is faster (even faster than Gumbo's version).
When the alphanumerics are close to each other, the greedy version is significantly slower.

The test: http://jsbin.com/eletu/4
Compares 3 versions:

[a-zA-Z0-9].*?[a-zA-Z0-9]
[a-zA-Z0-9][^a-zA-Z0-9]*[a-zA-Z0-9]
[a-zA-Z0-9].*[a-zA-Z0-9]

Conclusion: none. As always, you should check against typical data.
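One way to check against your own data is a small timing harness. The sketch below assumes Python's `re` and `timeit` modules rather than the browser test linked above, so the numbers it produces are not comparable to the original results. The sample strings and the `time_patterns` helper are hypothetical placeholders.

```python
import re
import timeit

# The three versions compared in this answer.
PATTERNS = {
    "lazy": re.compile(r"[a-zA-Z0-9].*?[a-zA-Z0-9]"),
    "negated": re.compile(r"[a-zA-Z0-9][^a-zA-Z0-9]*[a-zA-Z0-9]"),
    "greedy": re.compile(r"[a-zA-Z0-9].*[a-zA-Z0-9]"),
}

# Hypothetical sample inputs; replace these with data typical for your field.
SAMPLES = {
    "close together": "ab" + "-" * 1000,
    "far apart": "a" + "-" * 1000 + "b",
}


def time_patterns(number=1000):
    """Time each pattern on each sample; returns seconds per (sample, pattern)."""
    results = {}
    for sample_name, text in SAMPLES.items():
        for pattern_name, pattern in PATTERNS.items():
            seconds = timeit.timeit(lambda: pattern.search(text), number=number)
            results[(sample_name, pattern_name)] = seconds
    return results


if __name__ == "__main__":
    for (sample_name, pattern_name), seconds in time_patterns().items():
        print(f"{sample_name:15} {pattern_name:8} {seconds:.6f}s")
```

Timings depend on the regex engine, the input, and the machine, so treat any single run as indicative only.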

Kobi 2010-01-20 21:19:07
