As a personal learning exercise, I wrote this regex to split a unary string into parts whose length is increasing powers of two (see also on ideone.com):
for (String s :
new String(new char[500])
.split("(?=(.*))(?<=(?<=(^.*))\\G.*)(?<=(?=\\2\\2.\\1)^.*)")
) {
System.out.printf("%s ", s.length());
}
// prints "1 2 4 8 16 32 64 128 245 "
This uses a combination of capturing during lookarounds, nested lookarounds, matching on backreferences, and infinite length lookbehind (which isn't officially supported in Java but works anyway). Properties of sums of powers of two and the fact that the string has a unary alphabet is also taken advantage of.
This solution is both unreadable and has a horrible performance.
My question is, how would you "optimize" this regex?
- Can you make the regex more readable (it's okay if it performs worse)
- Can you make the regex performs better (it's okay if it's less readable)