ansaurus

Question

Need a simple RegEx to find a number in a single word

Answer 1

+3 A:

If you want it greater than 0, use this regex:

/([1-9][0-9]*)/

This'll work as long as the number doesn't have leading zeros (like '03').

However, I recommend just using a simple [0-9]+ regex, and validating the number in your actual site code.
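
As an illustration of that advice, here is a minimal Python sketch: a permissive digits-only pattern, followed by the real "greater than zero" check in ordinary code. The function name `parse_page` is hypothetical, and the choice of Python is an assumption; the thread does not say what language the site uses.

```python
import re

# Permissive pattern: one or more ASCII digits (leading zeros allowed).
DIGITS = re.compile(r"[0-9]+")

def parse_page(value):
    """Return the page number as an int, or None if value is not a positive integer."""
    # fullmatch requires the whole string to consist of digits.
    if DIGITS.fullmatch(value) is None:
        return None
    number = int(value)
    # The "greater than zero" rule lives in code, not in the regex.
    return number if number > 0 else None
```

With this split, an input such as '03' is accepted and read as 3, while '0', '-1' and 'abc' are rejected.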

yjerem 2008-10-31 05:02:00

Answer 2

A:

// C#; needs "using System.Text.RegularExpressions;" at the top of the file
string testString = @"/page/100";
string pageNumber = Regex.Match(testString, "/page/([1-9][0-9]*)").Groups[1].Value;

If there is no match, pageNumber will be "" (an empty string).

CMS 2008-10-31 05:05:37

Answer 3

A:

i'm not after any code, like perl, c#, etc.

this is the text that will go into any RegEx expression.

i also do not need to check for the word 'page' or the '/' character, etc.

i just wanted to make sure the string/variable 'currentPage' is only a number, using regex.

how do these two differ, please?

#[1-9]+#    vs    /([1-9][0-9]*)/

Pure.Krome 2008-10-31 05:24:31

The first one will NOT match numbers with zeros in them (this includes numbers like '10'). The second one will match any number that doesn't start with a '0'.

yjerem 2008-10-31 05:32:19

[1-9]+ won't match 10 or 100 or 101 or 110. [1-9][0-9]* will.

eyelidlessness 2008-10-31 05:32:38

[1-9]+ means any digit from 1 to 9, one or more times. [1-9][0-9]* would be any digit from 1 to 9 one time, followed by any digit (0 to 9) zero or more times.

Jay 2008-10-31 05:33:03
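
The difference between the two patterns can be checked with a short Python sketch. It tests each pattern against the whole string, which is an assumption about how 'currentPage' would be checked; the helper name `whole_string_matches` is made up for this example.

```python
import re

only_nonzero_digits = re.compile(r"[1-9]+")      # every digit must be 1-9
no_leading_zero = re.compile(r"[1-9][0-9]*")     # first digit 1-9, then any digits

def whole_string_matches(pattern, text):
    """True if the pattern matches the entire string, not just a part of it."""
    return pattern.fullmatch(text) is not None

samples = ["7", "10", "101", "0", "03"]
results = {
    s: (whole_string_matches(only_nonzero_digits, s),
        whole_string_matches(no_leading_zero, s))
    for s in samples
}
```

Here '7' satisfies both patterns, '10' and '101' satisfy only the second, and '0' and '03' satisfy neither.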

cheers guys. i'll roll with the 2nd one and give da points to him/her/it.

Pure.Krome 2008-10-31 05:33:31

Answer 4

+7 A:

/^[1-9][0-9]*$/

Problems with other answers:

/([1-9][0-9]*)/ // Will match -1 and foo1bar
#[1-9]+# // Will not match 10, same problems as the first
[1-9] // Will only match one digit, same problems as first
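
A small Python sketch of why the anchors matter (the helper name `finds` is hypothetical). Without ^ and $, the pattern only has to be found somewhere inside the string.

```python
import re

unanchored = re.compile(r"([1-9][0-9]*)")
anchored = re.compile(r"^[1-9][0-9]*$")

def finds(pattern, text):
    """True if the pattern can be found anywhere in the text."""
    return pattern.search(text) is not None

# The unanchored pattern is satisfied by a digit buried inside other text.
unanchored_hits = [s for s in ["-1", "foo1bar", "10"] if finds(unanchored, s)]
# The anchored pattern requires the whole string to be the number.
anchored_hits = [s for s in ["-1", "foo1bar", "10", "0", "03"] if finds(anchored, s)]
```

One caveat specific to Python: $ also matches just before a trailing newline, so `fullmatch` (or \Z instead of $) is the stricter choice there.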

eyelidlessness 2008-10-31 05:38:52

cheers mate! this answer is full of awesomesauce.

Pure.Krome 2008-10-31 07:27:49

[0-9]+ will actually match 10, but will also match 0, which was unwanted.

Ben Doom 2008-10-31 13:59:11

Answer 5

A:

While Jeremy's regex isn't perfect (it should be tested in context, against leading characters and such), his advice is good: go for a generic, simple regex (e.g. if you must use it in Apache's mod_rewrite), but by all means handle the final redirect in the server's code (if you can) and do a real check of the parameter's validity there.

Otherwise, I would improve Jeremy's expression with bounds: /\b([1-9][0-9]*)$/
Of course, a regex cannot practically check against a max int; at best you can control the number of digits: /\b([1-9][0-9]{0,2})$/ for example.
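
A rough Python sketch of that division of labour: the regex caps the number of digits, and the exact upper bound is checked in code. The function name `trailing_page` and the limit of 500 are made-up values for illustration only.

```python
import re

# The bounded pattern from above: 1 to 3 digits, no leading zero, at the end of the string.
UP_TO_THREE_DIGITS = re.compile(r"\b([1-9][0-9]{0,2})$")

def trailing_page(text, max_page=500):
    """Return the trailing page number as an int, or None if absent or out of range."""
    m = UP_TO_THREE_DIGITS.search(text)
    if m is None:
        return None
    page = int(m.group(1))
    # The exact maximum is enforced here, since the regex only limits digit count.
    return page if page <= max_page else None
```

For example, '/page/42' yields 42, '/page/1000' is rejected by the regex itself, and '/page/999' passes the regex but fails the range check unless max_page is raised.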

PhiLho 2008-10-31 06:17:18

i'm happy that the number provided COULD be larger than an int.max. I really just wanted to check that the string is a number greater than zero and is a number. if it's larger than int.max ... meh. i'm not too worried.

Pure.Krome 2008-10-31 07:26:55

Answer 6

A:

This will match any string such that, if it contains /page/, it must be followed by a number, not consisting of only zeros.

^(?!.*?/page/([0-9]*[^0-9/]|0*/))

(?! ) is a negative look-ahead. It will match an empty string, only if its contained pattern does not match from the current position.
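
To see the look-ahead in action, here is a short Python sketch that applies the pattern unchanged. The helper name `passes` and the sample paths are assumptions made for this example.

```python
import re

# The pattern from above: reject strings where /page/ is followed by
# a non-number or by a number made only of zeros.
PAGE_GUARD = re.compile(r"^(?!.*?/page/([0-9]*[^0-9/]|0*/))")

def passes(text):
    """True if the negative look-ahead lets the string through."""
    # On success the match is an empty string at position 0.
    return PAGE_GUARD.match(text) is not None

accepted = [s for s in ["/home", "/page/12/", "/page/12"] if passes(s)]
rejected = [s for s in ["/page/abc", "/page/00/", "/page/1a"] if not passes(s)]
```

Note that, as written, the pattern still lets '/page/0' through when the zero is the very last character, because both alternatives inside the look-ahead need one more character after the digits.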

MizardX 2008-10-31 14:02:44

cheers .. but i do not want to check for the existence of the word 'page'. i just need to make sure the string i have is a number (using regex).

Pure.Krome 2008-10-31 14:05:20

Answer 7

A:

This one would address your specific problem. This expression

/\/page\/(0*[1-9][0-9]*)/ or "Perl-compatible" /\/page\/(0*[1-9]\d*)/

should capture any non-zero number, even 0-filled. And because it doesn't even look for a sign, - after the slash will not fit the pattern.

The problem that I have with eyelidlessness' expression is that you likely do not already have the number isolated so that ^ and $ would work. You're going to have to do some work to isolate it. A general solution, as below, would not assume that the number is all that the string contains.

/(^|[^0-9-])(0*[1-9][0-9]*)([^0-9]|$)/

And the two tail-end groups, you could replace with word boundary marks (\b), if the RE language had those. Failing that you would put them into non-capturing groups, if the language had them, or even lookarounds if it had those--but it would more likely have word boundaries before lookarounds.

Full Perl-compatible version:

/(?<![\d-])(0*[1-9]\d*)\b/

I chose a negative lookbehind instead of a word boundary, because '-' is not a word-character, and so -1 will have a "word boundary" between the '-' and the '1'. And a negative lookbehind will match the beginning of the string--there just can't be a digit character or '-' in front.

You could say that the zero-width assertion ^ is just one of the cases that satisfies the zero-width assertion (?<![\d-]).
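
The difference between the word boundary and the negative lookbehind can be sketched in Python as follows. The \b...\b variant and the helper name `first_number` are written here only for comparison; they are not part of the answer above.

```python
import re

# Comparison pattern using a word boundary in front of the number.
with_boundary = re.compile(r"\b(0*[1-9]\d*)\b")
# The Perl-compatible pattern from above, using a negative lookbehind.
with_lookbehind = re.compile(r"(?<![\d-])(0*[1-9]\d*)\b")

def first_number(pattern, text):
    """Return the first captured number as a string, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

# '-' is not a word character, so \b sits between '-' and '1' and the digit is captured.
boundary_on_negative = first_number(with_boundary, "-1")
# The lookbehind refuses any position preceded by '-' or by a digit.
lookbehind_on_negative = first_number(with_lookbehind, "-1")
```

With these definitions the word-boundary version pulls '1' out of '-1', while the lookbehind version finds nothing there, yet it still captures '007' from '/page/007' and '42' from a bare '42' at the start of the string.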

Axeman 2008-10-31 14:04:41

"likely you do not already have the number isolated" - except Pure.Krome specifically said that it is available as the variable 'currentPage'. And also said: "i also do not need to check for the word 'page' or the '/' character, etc."

eyelidlessness 2008-10-31 19:48:36

correct eyelidlessness. I'm not sure why people can't understand what i said? :) it's pretty simple (my question) : how do i check if a word is a number greater than zero. that's it. anyways .. question has been answered :)

Pure.Krome 2008-11-01 00:13:57
