On matches vs find
The problem is that you used matches when you should've used find. From the API:
- The matches method attempts to match the entire input sequence against the pattern.
- The find method scans the input sequence looking for the next subsequence that matches the pattern.
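To make the difference concrete, here is a minimal sketch. The class name, helper names, and sample text are made up for illustration; only Pattern, Matcher, matches() and find() come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only if the ENTIRE input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches();
    }

    // True if SOME subsequence of the input matches the regex.
    static boolean findsAnywhere(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        String input = "Expiry Date: 2010-05-01"; // made-up sample text
        System.out.println(matchesWhole("\\d{4}", input));  // prints "false"
        System.out.println(findsAnywhere("\\d{4}", input)); // prints "true"
    }
}
```

The same regex gives opposite answers because matches() insists on consuming the whole input, while find() is satisfied by the "2010" in the middle.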
Note that String.matches(String regex) also looks for a full match of the entire string. Unfortunately, String does not provide a partial regex match, but you can always use s.matches(".*pattern.*") instead.
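A rough sketch of that workaround follows. The containsMatch helper is hypothetical, not part of String. It also adds (?s), which is my own assumption about what you would want: by default . does not match line terminators, so a plain ".*pattern.*" fails on multi-line input.

```java
class StringMatchesDemo {
    // Hypothetical helper: emulates a "contains a match" test using only
    // String.matches by padding the regex with .* on both sides.
    // (?s) makes '.' match line terminators as well; (?: ) keeps any
    // alternation inside the caller's regex grouped together.
    static boolean containsMatch(String s, String regex) {
        return s.matches("(?s).*(?:" + regex + ").*");
    }

    public static void main(String[] args) {
        System.out.println("foobar".matches("oba"));         // prints "false"
        System.out.println("foobar".matches(".*oba.*"));     // prints "true"
        System.out.println("foo\nbar".matches(".*ba.*"));    // prints "false"
        System.out.println(containsMatch("foo\nbar", "ba")); // prints "true"
    }
}
```

For anything beyond a quick check, compiling a Pattern once and calling find() on its Matcher is usually the cleaner route.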
On reluctant quantifier
Java understands (.+?) perfectly.
Here's a demonstration: you're given a string s that consists of a string t repeating at least twice. Find t.
System.out.println("hahahaha".replaceAll("^(.+)\\1+$", "($1)"));
// prints "(haha)" -- greedy takes longest possible
System.out.println("hahahaha".replaceAll("^(.+?)\\1+$", "($1)"));
// prints "(ha)" -- reluctant takes shortest possible
On escaping metacharacters
It should also be said that you have injected \ into your regex ("\\" as Java string literal) unnecessarily.
String regexDate = "<b>Expiry Date:<\\/b>(.+?)<\\/td>";
                                    ^^         ^^
Pattern p2 = Pattern.compile("<b>Expiry Date:<\\/b>");
                                              ^^
\ is used to escape regex metacharacters. A / is NOT a regex metacharacter.
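Putting the pieces together, here is a sketch of the extraction with the unnecessary backslashes dropped and find() used in place of matches(). The class name, the extractExpiry helper, and the sample HTML are assumptions for illustration; the actual page markup is not shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExpiryDateDemo {
    // Same pattern as in the question, minus the backslashes before '/'.
    private static final Pattern EXPIRY =
            Pattern.compile("<b>Expiry Date:</b>(.+?)</td>");

    // Returns the text captured by group 1, or null if nothing matches.
    static String extractExpiry(String html) {
        Matcher m = EXPIRY.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical input, invented for this example.
        String html = "<tr><td><b>Expiry Date:</b>2010-05-01</td><td>x</td></tr>";
        System.out.println(extractExpiry(html)); // prints "2010-05-01"
    }
}
```

The reluctant (.+?) stops at the first </td>, so the second table cell is not swallowed into the capture.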
See also