Do you have any useful examples of the boundary matcher "\G"? Please give me some real-world examples. Java source is appreciated. From "Mastering Regular Expressions" by Jeffrey E. F. Friedl I got a useful example that parses HTML, but I am not sure whether it can be translated to Java.
views: 64
answers: 2
A:
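Here's an example:
// With \G, each match must start exactly where the previous match ended.
Pattern p = Pattern.compile("\\Gfoo");
Matcher m = p.matcher("foo foo");
// The second find() fails: a space, not "foo", follows the first match.
String trueFalse = m.find() + ", " + m.find();
System.out.println(trueFalse); // prints: true, false
// Without \G, find() is free to skip ahead, so both calls succeed.
Pattern p1 = Pattern.compile("foo");
Matcher m1 = p1.matcher("foo foo");
String trueTrue = m1.find() + ", " + m1.find();
System.out.println(trueTrue); // prints: true, true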
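One place this behavior tends to be useful is tokenizing, where scanning should stop at the first character that does not begin a valid token instead of silently skipping over it. The following is only a minimal sketch under that assumption; the class name `ContiguousTokens`, the method `tokenize`, and the tiny arithmetic token grammar are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not taken from the answer above.
class ContiguousTokens {
    // \G pins every match to the end of the previous one, so there can be
    // no gap of unrecognized text between two tokens.
    private static final Pattern TOKEN =
            Pattern.compile("\\G\\s*(\\d+|[-+*/()])");

    // Returns the tokens read from the start of the input up to the first
    // character that is neither whitespace, a digit, nor an operator.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group(1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("12 + 34*(5 - 6)")); // [12, +, 34, *, (, 5, -, 6, )]
        System.out.println(tokenize("12 + x 34"));       // [12, +]
    }
}
```

Without the leading `\G`, the second call would skip the `x` and also return `34`, which is usually not what a lexer wants.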
JRL
2010-04-25 16:17:39
Sorry to ask, but where is the "real world" scenario?
Mister M. Bean
2010-04-25 16:22:52
+2
A:
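This is a regex-based solution for inserting thousands separators:
String separateThousands(String s) {
    return s.replaceAll(
        // Match either right after 3 digits that follow the previous match,
        // or after the 1-3 leading digits when the rest of the integer part
        // consists of complete groups of 3 digits.
        "(?<=\\G\\d{3})(?=\\d)" + "|" + "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))",
        ","
    );
}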
This will transform "-1234567890.1234567890" to "-1,234,567,890.1234567890".
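To try the method on a few more inputs, it can be dropped into a small harness like the one below. This is just a sketch: the wrapper class `ThousandsDemo` is a hypothetical name, the method is made `static` for convenience, and the two shorter inputs are extra cases that do not come from the answer.

```java
// Hypothetical wrapper class around the method shown in the answer.
class ThousandsDemo {
    static String separateThousands(String s) {
        return s.replaceAll(
            "(?<=\\G\\d{3})(?=\\d)" + "|" + "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))",
            ","
        );
    }

    public static void main(String[] args) {
        // The example from the answer: only the integer part is grouped.
        System.out.println(separateThousands("-1234567890.1234567890")); // -1,234,567,890.1234567890
        // Extra cases added for illustration.
        System.out.println(separateThousands("1234.5678")); // 1,234.5678
        System.out.println(separateThousands("123"));       // 123
    }
}
```

The fractional digits stay untouched because `\G` only allows a new comma exactly three digits after the previous one, and that chain breaks at the decimal point.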
See also
- codingBat separateThousands using regex (and unit testing how-to)
- Explanation of how it works, and alternative regex that also uses \G.
This one is more abstract, but you can use \G and fixed-length lookbehind to split a long string into fixed-width chunks:
String longline = "abcdefghijklmnopqrstuvwxyz";
for (String line : longline.split("(?<=\\G.{6})")) {
    System.out.println(line);
}
/* prints:
abcdef
ghijkl
mnopqr
stuvwx
yz
*/
You don't need regex for this, but I'm sure there are "real life" scenarios of something that is a variation of this technique.
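One possible variation, offered only as a sketch, is decoding a hex string two characters at a time: `\G` keeps the pairs contiguous, and comparing the end of the last match with the input length reveals stray characters or an odd number of digits. The class `HexPairs` and the method `decode` are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class illustrating fixed-width chunks with \G.
class HexPairs {
    // Each match is exactly two hex digits starting where the last match ended.
    private static final Pattern PAIR = Pattern.compile("\\G[0-9A-Fa-f]{2}");

    static byte[] decode(String hex) {
        byte[] out = new byte[hex.length() / 2];
        Matcher m = PAIR.matcher(hex);
        int end = 0; // end index of the last successful match
        int n = 0;   // number of bytes decoded so far
        while (m.find()) {
            out[n++] = (byte) Integer.parseInt(m.group(), 16);
            end = m.end();
        }
        // If matching stopped early, the input had a bad or dangling character.
        if (end != hex.length()) {
            throw new IllegalArgumentException("invalid hex at index " + end);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decode("DEADBEEF"))); // [-34, -83, -66, -17]
    }
}
```

As with the split example, a plain loop over `substring` would also work; the regex version mainly saves the separate validation pass.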
polygenelubricants
2010-04-26 06:40:56
That's very nice. I hope other authors will bring some more real world examples to the table.
Mister M. Bean
2010-04-26 07:12:48
I don't see the point of `"(?:%s)|(?:%s)"`. That should just be `"%s|%s"`.
Christoffer Hammarström
2010-04-28 13:32:37
@Christoffer: I was just playing it ultra-safe for the general case where "left" or "right" can have its own top-level alternation, but I think you're right: even in that case, it'll still work.
polygenelubricants
2010-04-28 14:19:37