Hi,

In Java regex, how do I distinguish the metacharacter . (dot) from a literal dot as used in an ordinary sentence? And how do I handle the same situation for other metacharacters such as *, +, \d, ...?

+1  A: 

Escape special characters with a backslash: \., \*, \+, \\d, and so on. If you are unsure, you may escape any non-alphabetic character, whether it is special or not. See the Javadoc for java.util.regex.Pattern for further information.

Christoffer Hammarström
Escaping non-special characters needlessly might work in some languages but might fail in others, so it's better to not get into the habit.
Tim Pietzcker
+6  A: 

If you want the dot, or any other character with a special meaning in regexes, to be treated as a literal character, you have to escape it with a backslash. Because a regex in Java is written as an ordinary Java string literal, the backslash itself must also be escaped, so you need two backslashes, e.g. \\.
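As a minimal sketch of the difference (the class and method names here are made up for illustration, not taken from the question), compare an unescaped dot with an escaped one in Java source:

```java
import java.util.regex.Pattern;

public class DotEscapeDemo {
    // Unescaped dot: a metacharacter that matches any single character.
    static final Pattern ANY_CHAR = Pattern.compile("a.c");

    // Escaped dot: the regex is a\.c, which is written "a\\.c" in a Java string literal.
    static final Pattern LITERAL_DOT = Pattern.compile("a\\.c");

    static boolean matchesAnyChar(String s) {
        return ANY_CHAR.matcher(s).matches();
    }

    static boolean matchesLiteralDot(String s) {
        return LITERAL_DOT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAnyChar("abc"));    // true
        System.out.println(matchesAnyChar("a.c"));    // true
        System.out.println(matchesLiteralDot("abc")); // false
        System.out.println(matchesLiteralDot("a.c")); // true
    }
}
```

The regex engine only ever sees a single backslash; the second one exists purely to get it through the Java compiler's string-literal parsing.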

Fabian Steeg
+1 for mentioning the double-backslash issue that Java regexes have.
Thilo
+1  A: 

Perl-style regular expressions (which the Java regex engine is more or less based upon) treat the following characters as special characters:

.^$|*+?()[{\ have special meaning outside of character classes,

]^-\ have special meaning inside of character classes ([...]).

So you need to escape those (and only those) symbols depending on context (or, in the case of character classes, place them in positions where they can't be misinterpreted).

Escaping other characters needlessly may work, but some regex engines treat it as a syntax error; for example, \_ causes an error in .NET.

Other needless escapes silently change the meaning; for example, \< is interpreted as a literal < in Perl, but in egrep it means "word boundary".

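So write -?\d+\.\d+\$ to match 1.50$, -2.00$ etc., and [(){}[\]] for a character class that matches all kinds of brackets/braces/parentheses. (Java additionally treats an unescaped [ inside a character class as the start of a nested class, so in Java write [(){}\[\]] instead.)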
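Here is a rough Java sketch of those two patterns. The class and method names are hypothetical, and the bracket class escapes [ as well, on the assumption that the code targets java.util.regex, where an unescaped [ inside a character class opens a nested class:

```java
import java.util.regex.Pattern;

public class EscapeInContextDemo {
    // Regex: -?\d+\.\d+\$  (every backslash is doubled in the Java string literal)
    static final Pattern PRICE = Pattern.compile("-?\\d+\\.\\d+\\$");

    // Regex: [(){}\[\]]  (inside the class, ( ) { } need no escaping;
    // [ and ] are escaped so Java does not read them as class delimiters)
    static final Pattern BRACKET = Pattern.compile("[(){}\\[\\]]");

    static boolean isPrice(String s) {
        return PRICE.matcher(s).matches();
    }

    static boolean isBracket(String s) {
        return BRACKET.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPrice("1.50$"));  // true
        System.out.println(isPrice("-2.00$")); // true
        System.out.println(isPrice("1x50$"));  // false: the escaped dot is literal
        System.out.println(isBracket("["));    // true
        System.out.println(isBracket("a"));    // false
    }
}
```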

If you need to transform a user input string into a regex-safe form, use java.util.regex.Pattern.quote.
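A small sketch of how that might look in practice; the helper names literal and containsLiteral are invented for this example, while Pattern.quote itself is the standard library method:

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Builds a pattern that matches the user's text literally,
    // with every metacharacter in it neutralised by Pattern.quote.
    static Pattern literal(String userInput) {
        return Pattern.compile(Pattern.quote(userInput));
    }

    // True if the haystack contains the user's text as a literal substring.
    static boolean containsLiteral(String haystack, String userInput) {
        return literal(userInput).matcher(haystack).find();
    }

    public static void main(String[] args) {
        // Quoted: the dot and the dollar sign are taken literally.
        System.out.println(containsLiteral("total: 1.50$ (approx.)", "1.50$")); // true
        System.out.println(containsLiteral("total: 1x50", "1.50"));             // false

        // Unquoted for comparison: the dot matches any character.
        System.out.println(Pattern.compile("1.50").matcher("total: 1x50").find()); // true
    }
}
```

Quoting the whole input this way avoids having to decide, character by character, which symbols are special in which context.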

Further reading: Jan Goyvaerts's blog RegexGuru, on escaping metacharacters

Tim Pietzcker