It's unclear from your question exactly what the code is supposed to do. Does it find the first '.' and replace everything up to it with a '*'? Or is there some fancier logic behind it? Maybe everything up to the nth '.' gets replaced by '*'?
If you're trying to find an instance of a particular string, use something like the Boyer-Moore algorithm. It should be able to find the match for you and you can then replace what you want.
Keep in mind that String
in Java is immutable. It might be faster to change the sequence in-place. Check out other CharSequence
implementations to see what you can do, e.g. StringBuffer and CharBuffer. If concurrency is not needed, StringBuilder might be an option.
By using a mutable CharSequence
instead of the methods on String
, you avoid a bunch of object churn. If all you're doing is replacing some slice of the underlying character array with a shorter array (i.e. {'*'}
), this is likely to yield a speedup since such array copies are fairly optimized. You'll still be doing an array copy at the end of the day, but it may be faster than new String
allocations.
UPDATE
All the above is pretty much hogwash. Sure, maybe you can implement your own CharSequence
that gives you better slicing and lazily resizes the array (aka doesn't actually truncate anything until it absolutely must), returning Strings
based on offsets and whatnot. But StringBuffer
and StringBuilder
, at least directly, do not perform as well as the solution poly posted. CharBuffer
is entirely inapplicable; I didn't realize it was an nio class earlier: it's meant for other things entirely.
There are some interesting things about poly's code, which I wonder whether he/she knew before posting it, namely that changing the "*"
on the final lines of the methods to a '*'
results in a significant slowdown.
Nevertheless, here is my benchmark. I found one small optimization: declaring the '.'
and "*"
expressions as constants adds a bit of a speedup as well as using a locally-scoped StringBuilder
instead of the binary infix string concatenation operator.
I know the gc()
is at best advisory and at worst a no-op, but I figured adding it with a bit of sleep time might let the VM do some cleanup after creating 1M String
s. Someone may correct me if this is totally naïve.
Simple Benchmark
import java.util.ArrayList;
import java.util.Arrays;
public class StringSplitters {
private static final String PREFIX = "*";
private static final char DOT = '.';
public static String starOutFirst(String s) {
final int k = s.indexOf(DOT);
return PREFIX + s.substring(k);
}
public static String starOutFirstSb(String s) {
StringBuilder sb = new StringBuilder();
final int k = s.indexOf(DOT);
return sb.append(PREFIX).append(s.substring(k)).toString();
}
public static void main(String[] args) throws InterruptedException {
double[] firstRates = new double[10];
double[] firstSbRates = new double[10];
double firstAvg = 0;
double firstSbAvg = 0;
double firstMin = Double.POSITIVE_INFINITY;
double firstMax = Double.NEGATIVE_INFINITY;
double firstSbMin = Double.POSITIVE_INFINITY;
double firstSbMax = Double.NEGATIVE_INFINITY;
for (int i = 0; i < 10; i++) {
firstRates[i] = testFirst();
firstAvg += firstRates[i];
if (firstRates[i] < firstMin)
firstMin = firstRates[i];
if (firstRates[i] > firstMax)
firstMax = firstRates[i];
Thread.sleep(100);
System.gc();
Thread.sleep(100);
}
firstAvg /= 10.0d;
for (int i = 0; i < 10; i++) {
firstSbRates[i] = testFirstSb();
firstSbAvg += firstSbRates[i];
if (firstSbRates[i] < firstSbMin)
firstSbMin = firstSbRates[i];
if (firstSbRates[i] > firstSbMax)
firstSbMax = firstSbRates[i];
Thread.sleep(100);
System.gc();
Thread.sleep(100);
}
firstSbAvg /= 10.0d;
System.out.printf("First:\n\tMin:\t%07.3f\tMax:\t%07.3f\tAvg:\t%07.3f\n\tRates:\t%s\n\n", firstMin, firstMax,
firstAvg, Arrays.toString(firstRates));
System.out.printf("FirstSb:\n\tMin:\t%07.3f\tMax:\t%07.3f\tAvg:\t%07.3f\n\tRates:\t%s\n\n", firstSbMin,
firstSbMax, firstSbAvg, Arrays.toString(firstSbRates));
}
private static double testFirst() {
ArrayList<String> strings = new ArrayList<String>(1000000);
for (int i = 0; i < 1000000; i++) {
int first = (int) (Math.random() * 128);
int second = (int) (Math.random() * 128);
int third = (int) (Math.random() * 128);
int fourth = (int) (Math.random() * 128);
strings.add(String.format("%d.%d.%d.%d", first, second, third, fourth));
}
long before = System.currentTimeMillis();
for (String s : strings) {
starOutFirst(s);
}
long after = System.currentTimeMillis();
return 1000000000.0d / (after - before);
}
private static double testFirstSb() {
ArrayList<String> strings = new ArrayList<String>(1000000);
for (int i = 0; i < 1000000; i++) {
int first = (int) (Math.random() * 128);
int second = (int) (Math.random() * 128);
int third = (int) (Math.random() * 128);
int fourth = (int) (Math.random() * 128);
strings.add(String.format("%d.%d.%d.%d", first, second, third, fourth));
}
long before = System.currentTimeMillis();
for (String s : strings) {
starOutFirstSb(s);
}
long after = System.currentTimeMillis();
return 1000000000.0d / (after - before);
}
}
Output
First:
Min: 3802281.369 Max: 5434782.609 Avg: 5185796.131
Rates: [3802281.3688212926, 5181347.150259067, 5291005.291005291, 5376344.086021505, 5291005.291005291, 5235602.094240838, 5434782.608695652, 5405405.405405405, 5434782.608695652, 5405405.405405405]
FirstSb:
Min: 4587155.963 Max: 5747126.437 Avg: 5462087.511
Rates: [4587155.963302752, 5747126.436781609, 5617977.528089887, 5208333.333333333, 5681818.181818182, 5586592.17877095, 5586592.17877095, 5524861.878453039, 5524861.878453039, 5555555.555555556]