I noticed in the Java 6 source code for String that hashCode only caches values other than 0. The difference in performance is exhibited by the following snippet:
public class Main{
static void test(String s) {
long start = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
s.hashCode();
}
System.out.format("Took %d ms.%n", System.currentTimeMillis() - start);
}
public static void main(String[] args) {
String z = "Allocator redistricts; strict allocator redistricts strictly.";
test(z);
test(z.toUpperCase());
}
}
Running this in ideone.com gives the following output:
Took 1470 ms.
Took 58 ms.
So my questions are:
- Why doesn't String's hashCode() cache 0?
- What is the probability that a Java string hashes to 0?
- What's the best way to avoid the performance penalty of recomputing the hash value every time for strings that hash to 0?
- Is this the best-practice way of caching values? (i.e. cache all except one?)
For your amusement, each line here is a string that hash to 0:
pollinating sandboxes
amusement & hemophilias
schoolworks = perversive
electrolysissweeteners.net
constitutionalunstableness.net
grinnerslaphappier.org
BLEACHINGFEMININELY.NET
WWW.BUMRACEGOERS.ORG
WWW.RACCOONPRUDENTIALS.NET
Microcomputers: the unredeemed lollipop...
Incentively, my dear, I don't tessellate a derangement.
A person who never yodelled an apology, never preened vocalizing transsexuals.