I need to strip alphabetic and non-alphanumeric characters from a string so that it can be converted to an integer.
What is the difference between the regular expression strings
\w
and
\w*
?
\w
matches any single word character: a letter, a digit, or an underscore.
Equivalent to [A-Za-z0-9_].
For example, /\w/ matches 'a' in "apple," '5' in "$5.28," and '3' in "3D."
*
Repeats the previous item zero or more times. Greedy, so as many items as possible are matched first; the engine then backtracks to fewer repetitions of the preceding item only if the rest of the pattern fails to match, down to the point where the preceding item is not matched at all.
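To make the difference concrete, here is a small sketch in Python (the question does not name a language, so Python's re module is an assumption). The re.ASCII flag is passed because Python's \w otherwise also matches non-ASCII letters and digits, which would not be the same as [A-Za-z0-9_]. The sample strings are made up for illustration.

```python
import re

text = "abc_123 def"

# \w consumes exactly one word character.
single = re.match(r"\w", text, flags=re.ASCII).group()

# \w* greedily consumes the whole run of word characters and stops at the space.
run = re.match(r"\w*", text, flags=re.ASCII).group()

# On input that starts with a non-word character, \w* still succeeds
# with an empty match, while \w does not match at that position at all.
empty = re.match(r"\w*", "$5.28", flags=re.ASCII).group()
no_match = re.match(r"\w", "$5.28", flags=re.ASCII)

print(repr(single), repr(run), repr(empty), no_match)
```

The empty match is the practical catch: because \w* can match nothing, it succeeds at every position in the string.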
The \w
code matches a single word character (a letter, a digit, or an underscore), like the set [0-9A-Za-z_]
.
The *
quantifier is the same as the {0,}
quantifier: it repeats the preceding item zero or more times.
Putting a question mark after a quantifier makes it lazy, i.e. it matches as few characters as possible instead of as many as possible.
So, \w*?
matches zero or more word characters, lazily.
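A rough illustration of greedy versus lazy matching, again assuming Python's re module and a made-up input string. The trailing \d is added only so that the lazy quantifier has something to stop at; on its own, a lazy \w*? is satisfied with an empty match.

```python
import re

text = "abc123"

# Greedy: takes every word character it can.
greedy = re.match(r"\w*", text).group()

# Lazy: takes as few as possible, which is none when nothing follows.
lazy = re.match(r"\w*?", text).group()

# With a digit required afterwards, lazy stops at the first digit,
# while greedy backtracks only far enough to leave the last digit for \d.
lazy_then_digit = re.match(r"\w*?\d", text).group()
greedy_then_digit = re.match(r"\w*\d", text).group()

print(repr(greedy), repr(lazy), repr(lazy_then_digit), repr(greedy_then_digit))
```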
If you want to filter out characters that can't be in a number, why not just use a negative set? This will match any character that is not a minus sign or a digit:
[^\-\d]
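As a minimal sketch of that approach, assuming Python: the helper name to_int is hypothetical, and re.ASCII is used so that \d only means 0-9. Every character matched by the negative set is replaced with an empty string, and what remains is passed to int().

```python
import re

def to_int(raw: str) -> int:
    """Hypothetical helper: drop everything except digits and minus signs."""
    cleaned = re.sub(r"[^\-\d]", "", raw, flags=re.ASCII)
    # int() raises ValueError if nothing is left, or if a minus sign
    # ends up anywhere other than the front (e.g. "12-3").
    return int(cleaned)

print(to_int("abc-12x3"))
print(to_int("$5.28"))
```

Note that the decimal point is removed along with everything else, so "$5.28" becomes 528. If a minus sign can appear in the middle of the input, the cleaned string needs further checking before conversion.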