In a program I am writing I am doing a lot of string manipulation. I am trying to increase performance and am wondering if using char arrays would show a decent performance increase. Any suggestions?
What kind of manipulation are you doing? Can you post a code sample?
You may want to take a look at StringBuilder, which implements CharSequence, to improve performance. I'm not sure you want to roll your own. StringBuilder isn't thread-safe, by the way; if you need thread safety, look at StringBuffer.
String is already implemented as a char array. What are you planning to do differently? Anyway, between that and the fact that GC for ephemeral objects is extremely fast, I would be amazed if you could find a way to increase performance by substituting char arrays.
Michael Borgwardt's advice about small char arrays and using StringBuilder and StringBuffer is very good. But to me the main thing is not to guess about what's slow: make measurements, use a profiler, get some definite facts, because our guesses about performance usually turn out to be wrong.
When you have a very large number of short Strings, using char[] instead can save quite a bit of memory, which also means more speed due to fewer cache misses.
But with large Strings, the main thing to look out for is avoiding unnecessary copying resulting from the immutability of String. If you do a lot of concatenating or replacing, using StringBuilder can make a big difference.
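As a rough illustration of the copying point, here is a minimal sketch comparing the two styles. The class and method names are made up for this example. Both methods return the same text, but the first one creates a new String on every pass through the loop, while the second appends into a single growable buffer.

```java
// Hypothetical demo class; the names are illustrative only.
class ConcatDemo {
    // Builds "0,1,2,...,n-1" with +=. Because String is immutable,
    // each += produces a new String and copies the old contents.
    static String joinWithString(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                s += ",";
            }
            s += i;
        }
        return s;
    }

    // Same result, but all appends go into one StringBuilder.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithString(5));
        System.out.println(joinWithBuilder(5));
    }
}
```

Whether the difference matters in your program depends on how large the strings get and how often the loop runs, so it is still worth measuring.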
Here is an excerpt from the full source of String class from JDK 6.0:
    public final class String implements java.io.Serializable,
            Comparable<String>, CharSequence {
        /** The value is used for character storage. */
        private final char value[];

        /** The offset is the first index of the storage that is used. */
        private final int offset;

        /** The count is the number of characters in the String. */
        private final int count;
As you can see, internally the value is already stored as an array of chars. As a data structure, an array of chars has all the limitations of the String class for most string manipulations: Java arrays do not grow, i.e. every time (OK, maybe not every single time) your string needs to grow, you have to allocate a new array and copy the contents.
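To make the "arrays do not grow" point concrete, here is a small sketch of what rolling your own growable char buffer involves. The class GrowableChars is hypothetical and exists only for this example; the doubling strategy is an assumption chosen for simplicity, not a claim about how any JDK class is implemented.

```java
import java.util.Arrays;

// Hypothetical hand-rolled buffer, for illustration only.
class GrowableChars {
    private char[] data = new char[4]; // small initial capacity (arbitrary)
    private int length = 0;

    void append(char c) {
        if (length == data.length) {
            // The array is full: allocate a bigger one and copy everything over.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[length++] = c;
    }

    int capacity() {
        return data.length;
    }

    @Override
    public String toString() {
        // Only the first `length` chars are in use.
        return new String(data, 0, length);
    }

    public static void main(String[] args) {
        GrowableChars buf = new GrowableChars();
        for (char ch : "hello".toCharArray()) {
            buf.append(ch);
        }
        System.out.println(buf + " (capacity " + buf.capacity() + ")");
    }
}
```

This is the kind of bookkeeping you would have to write and maintain yourself if you moved to raw char arrays.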
As suggested earlier it makes sense to use StringBuilder or StringBuffer for most string manipulations.
In fact the following code:
    String a = "a";
    a = a + "b";
    a = a + "c";
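is automatically converted at compile time to use StringBuilder, at least by the compilers of that era (since Java 9, javac compiles string concatenation to an invokedynamic call instead). This is easy to check with javap -c.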
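Written out by hand, the conversion looks roughly like the sketch below. This is an approximation of what an older (pre-Java 9) javac produces for the snippet above, not a dump of real bytecode, and the class name is made up for the example.

```java
// Hypothetical demo class; the names are illustrative only.
class ConcatEquivalent {
    // The original source form, using the + operator.
    static String viaPlus() {
        String a = "a";
        a = a + "b";
        a = a + "c";
        return a;
    }

    // Roughly what the compiler generates: one StringBuilder per statement.
    static String viaBuilder() {
        String a = "a";
        a = new StringBuilder().append(a).append("b").toString();
        a = new StringBuilder().append(a).append("c").toString();
        return a;
    }

    public static void main(String[] args) {
        System.out.println(viaPlus());
        System.out.println(viaBuilder());
    }
}
```

Note that each statement gets its own StringBuilder, so inside a loop an explicit StringBuilder declared outside the loop still avoids repeated copying.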
As a rule of thumb, it's rarely advisable to spend time trying to improve on the performance of the core Java classes unless you're a world-class expert on the matter, simply because this code was written by world-class experts in the first place.
Have you profiled your application? Do you know where the bottlenecks are? That is the first step if the performance is subpar. Well, that and defining what acceptable performance metrics are.
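If you don't have a profiler at hand, even a crude timing loop gives you some numbers to work with. The sketch below is only that: a rough stopwatch built on System.nanoTime, with a made-up class name and an arbitrary workload. It is not a substitute for a real profiler or a proper benchmarking harness, because JIT compilation and GC can easily distort such measurements.

```java
// Hypothetical helper, for illustration only.
class CrudeTimer {
    // Runs the task `runs` times as warm-up, then `runs` times while timing,
    // and returns the average elapsed nanoseconds per timed run.
    static long averageNanos(Runnable task, int runs) {
        for (int i = 0; i < runs; i++) {
            task.run(); // warm-up, so the JIT has a chance to compile the code
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        long nanos = averageNanos(() -> {
            // Arbitrary example workload: append 1000 numbers to a builder.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unexpected empty result");
            }
        }, 1_000);
        System.out.println("average ns per run: " + nanos);
    }
}
```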
Once you have profiled doing some tasks, you will have percentages of time spent doing things. If you are spending a lot of time manipulating Strings, maybe you can start to cache some of those manipulations? Are you doing some of them repeatedly when doing them only once would suffice (and then use that result again later when it is needed)? Are you copying Strings when you don't need to? Remember, java.lang.String is immutable - so it cannot be changed directly.
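One possible shape for the caching idea is sketched below. The class, the normalize step and the counter are all hypothetical, chosen only to show that a repeated manipulation can be done once and looked up afterwards. It assumes single-threaded use; a plain HashMap is not safe to share between threads.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical cache around an example string manipulation.
class NormalizeCache {
    private final Map<String, String> cache = new HashMap<>();
    private int computations = 0; // counts how often the real work runs

    // Stand-in for some expensive manipulation in your own code.
    private String normalize(String raw) {
        computations++;
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the cached result if present, otherwise computes and stores it.
    String normalized(String raw) {
        return cache.computeIfAbsent(raw, this::normalize);
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        NormalizeCache c = new NormalizeCache();
        System.out.println(c.normalized("  Hello "));
        System.out.println(c.normalized("  Hello "));
        System.out.println("computed " + c.computations() + " time(s)");
    }
}
```

A cache like this trades memory for time, so it only pays off if the profiler shows the same inputs really are processed repeatedly.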
Several times, while optimizing and performance-tuning systems I work on, I have found that I do not instinctively know where the slowness comes from. I have seen others (and, shamefully, myself) spend days optimizing something that shows no gain, because it was not the real bottleneck and in fact accounted for less than 1% of the time spent.
Hope this helps point you in the right direction.