Hi guys, I was wondering if you could help me formulate a regular expression to match the following pattern?
Any arbitrary length string of numbers, which may or may not be preceded by 0x.
Could you make the question more specific? How do you want to use the match, and which language or regexp implementation are you using?
A simple one that will work with many languages' regexp implementations is:
(?:0x)?\d+
Something like this:
\b(?:0x)?\d+\b
or this, if you want to exclude the optional "0x" from the match:
(?:(?<=\b0x)|\b)\d+\b
The former is:
- a word boundary
- "0x", optional
- decimal digits, at least one
- a word boundary
the latter would be:
- choose
  - either a position preceded by
    - a word boundary
    - "0x"
  - or a word boundary
- decimal digits, at least one
- a word boundary
The latter matches:
- 123456
- 0x123456
but not:
- 0y123456
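A minimal Python sketch of the two patterns above, run against the same three sample strings. The helper name `find_all` and the sample text are only for illustration, and the sketch assumes an engine such as Python's `re`, which accepts a fixed-width lookbehind like `(?<=\b0x)`:

```python
import re

# The two patterns quoted above, unchanged.
WITH_PREFIX = re.compile(r"\b(?:0x)?\d+\b")
DIGITS_ONLY = re.compile(r"(?:(?<=\b0x)|\b)\d+\b")


def find_all(pattern, text):
    """Return the full text of every non-overlapping match (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(text)]


sample = "123456 0x123456 0y123456"

# The first pattern keeps the optional "0x" in the match.
with_prefix = find_all(WITH_PREFIX, sample)
# The second pattern only accepts "0x" as context and returns the digits alone.
digits_only = find_all(DIGITS_ONLY, sample)

print(with_prefix)
print(digits_only)
```

Neither pattern finds anything in "0y123456", because there is no word boundary between "y" and the digits that follow it.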
To match hex digits (as your "0x" implies), use [0-9A-Fa-f] in place of the "\d".
If you want the whole string to match (nothing else but the numbers):
^(0x)?[0-9]+$
I am using the class [0-9] here to be as portable as possible. You might prefer to use \d wherever implemented.
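It works like this:
- ^ anchors the match at the start of the string
- (0x)? matches an optional "0x"
- [0-9]+ matches at least one decimal digit
- $ anchors the match at the end of the string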
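As a rough illustration of whole-string matching, here is a small Python sketch. The function name `is_number` is made up for this example. It uses `fullmatch` rather than relying on `$` alone, since in Python `$` also matches just before a trailing newline:

```python
import re

# Pattern from the answer above; [0-9] is used instead of \d for portability.
WHOLE_NUMBER = re.compile(r"^(0x)?[0-9]+$")


def is_number(text):
    """True if the entire string is digits, optionally preceded by 0x."""
    # fullmatch requires the match to cover the whole string, so a trailing
    # newline is rejected even though "$" by itself would tolerate it.
    return WHOLE_NUMBER.fullmatch(text) is not None


for candidate in ["123456", "0x123456", "0x", "12a", ""]:
    print(repr(candidate), is_number(candidate))
```

Engines differ in how `$` treats a final newline, so it is worth checking that detail in whichever implementation you end up using.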
It gets harder if a preceding "0x" means a hex number, and an omitted prefix means a decimal number:
\b((0x[0-9a-fA-F]+)|([1-9][0-9]*))\b
This also guards against decimal numbers starting with 0
...
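To show how the hex/decimal distinction might be used once a match is found, here is a hedged Python sketch. The named groups `hex` and `dec` and the function `parse_numbers` are illustrative additions, and the sketch assumes hex digits are limited to 0-9, a-f and A-F:

```python
import re

# Assumption: hex digits are 0-9, a-f, A-F; decimals must not start with 0.
# The group names "hex" and "dec" are hypothetical labels for this sketch.
NUMBER = re.compile(r"\b(?:(?P<hex>0x[0-9a-fA-F]+)|(?P<dec>[1-9][0-9]*))\b")


def parse_numbers(text):
    """Convert every hex or decimal literal found in text to an int."""
    values = []
    for m in NUMBER.finditer(text):
        if m.group("hex") is not None:
            # int() accepts the "0x" prefix when the base is 16.
            values.append(int(m.group("hex"), 16))
        else:
            values.append(int(m.group("dec")))
    return values


print(parse_numbers("10 0x10 007 0xff"))
```

Note that the decimal branch requires a leading 1-9, so a lone "0" is skipped along with "007".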
I always like to provide the very baseline REs so they will work on every RE engine, so:
(0x)?[0-9][0-9]*
With suitable boundary conditions (on old RE engines, that would be [ \t]), that should work everywhere.
However, it looks like you want hex characters, if the 0x is correct, so maybe you're after:
(0x)?[0-9A-Fa-f][0-9A-Fa-f]*
or its equivalent in many of the other excellent suggestions for the more advanced engines.
The 0x you mention suggests you want to capture a hexadecimal number. In that case I suggest:
(?:0x)?[[:xdigit:]]+
where [:xdigit:] is the POSIX character class for hexadecimal digits. It only works in engines that support POSIX bracket expressions.
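Not every engine understands POSIX classes. Python's built-in `re` module, for one, does not support `[[:xdigit:]]`, so an equivalent class has to be spelled out. A small sketch, where the names `HEX_CLASS` and `HEX_NUMBER` are just for illustration:

```python
import re
import string

# string.hexdigits is "0123456789abcdefABCDEF", i.e. what [:xdigit:] covers.
HEX_CLASS = "[" + string.hexdigits + "]"
HEX_NUMBER = re.compile(r"(?:0x)?" + HEX_CLASS + "+")

# With no capturing groups, findall returns the full text of each match.
found = HEX_NUMBER.findall("ff 0x1A zz")
print(found)
```

Writing the class out as [0-9A-Fa-f] by hand works just as well; building it from `string.hexdigits` is only a convenience.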
It all depends on what you mean by a number, and in what context the numbers are allowed. I assume that numbers preceded by 0x are hexadecimal numbers and thus can also contain A-F and a-f.
Given this test string: "a 012 0xa 4_56 num:8 42!"
This regular expression matches "012", "0xa", "4", "56", "8" and "42":
(0x[\dA-Fa-f]+|\d+)
This regular expression matches "012", "0xa", "8" and "42":
\b(0x[\dA-Fa-f]+|\d+)\b
This regular expression matches "0xa", "8" and "42":
\b(0x[\dA-Fa-f]+|[1-9]\d*)\b
This regular expression matches "012" and "0xa":
(?<=\s)(0x[\dA-Fa-f]+|\d+)(?=\s)
This regular expression matches "0xa":
(?<=\s)(0x[\dA-Fa-f]+|[1-9]\d*)(?=\s)
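The five variants above can be checked side by side with a short Python script. This is only a sketch: the helper `matches` and the variable names are illustrative, and other engines may treat `\b`, `\d` or lookbehind slightly differently:

```python
import re

TEST = "a 012 0xa 4_56 num:8 42!"

# The five patterns from the answer above, in the same order.
PATTERNS = [
    r"(0x[\dA-Fa-f]+|\d+)",
    r"\b(0x[\dA-Fa-f]+|\d+)\b",
    r"\b(0x[\dA-Fa-f]+|[1-9]\d*)\b",
    r"(?<=\s)(0x[\dA-Fa-f]+|\d+)(?=\s)",
    r"(?<=\s)(0x[\dA-Fa-f]+|[1-9]\d*)(?=\s)",
]


def matches(pattern, text=TEST):
    """Return the full text of each match of pattern in text (hypothetical helper)."""
    return [m.group(0) for m in re.finditer(pattern, text)]


results = [matches(p) for p in PATTERNS]
for pattern, result in zip(PATTERNS, results):
    print(pattern, "->", result)
```

The underscore counts as a word character, which is why "4_56" drops out as soon as word boundaries are added, and the lookaround versions reject "8" and "42" because they are not surrounded by whitespace on both sides.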