If I have an immutable string is the hashing algorithm run every time I call hash, or does it remember the value (given that the string can't change)?
+8
A:
It is recomputed.
-[NSString hash] is in fact a call to -[NSCFString hash] (due to toll free bridging).
If you create a program that calls -[NSString hash] on the same string and you break in between the calls and alter the memory backing up the string, you get a re-computed hash value. This tells me there's no caching involved.
(gdb) b -[NSCFString hash]
Breakpoint 1 at 0x3b02fa3
(gdb) r
Breakpoint 1, 0x93652fa3 in -[NSCFString hash] ()
(gdb) c
Continuing.
2009-05-13 14:23:39.003 a.out[1754:813] Hash: -327163326
Note the hash value.
Breakpoint 1, 0x93652fa3 in -[NSCFString hash] ()
(gdb) bt
#0 0x93652fa3 in -[NSCFString hash] ()
#1 0x00001f73 in main () at test.m:10
(gdb) fra 1
#1 0x00001f73 in main () at test.m:10
10 NSLog(@"Hash: %d", [m hash]);
(gdb) info locals
pool = (NSAutoreleasePool *) 0x109760
m = (NSString *) 0x2030
(gdb) x/20x 0x2030
0x2030 <dyld__mach_header+32>: 0xa06f54a0 0x000007c8 0x00001fa2 0x00000012
0xa06f54a0 is the "isa" pointer, 0x00001fa2 is a pointer to the "XXXXXX" string.
(gdb) set {int}0x1fa2 = 0x59595959
alter the "XXXXXX" string to "YYYYXXXX", then continue to the second hash call
(gdb) c
Continuing.
2009-05-13 14:24:35.884 a.out[1754:813] Hash: -246144954
Note the hash value that is different on the as far as ObjC knows immutable string.
The program I've (de)bugged is:
#import <Cocoa/Cocoa.h>
int main()
{
NSAutoreleasePool * pool = [NSAutoreleasePool new];
NSString * m = [NSString stringWithString:@"XXXXXXXXXXXXXXXXXX"];
NSLog(@"Hash: %d", [m hash]);
NSLog(@"Hash: %d", [m hash]);
[pool release];
}
diciu
2009-05-13 11:32:39
awesome answer. Thanks
Ian1971
2009-05-13 12:35:10
Very nice answer
Perspx
2009-05-13 16:25:16
Alternatively you can look at the source code for CFStringRef here: http://opensource.apple.com/source/CF/CF-476.17/CFString.cSearch for 'CFHashCode __CFStringHash(CFTypeRef cf)' for the function that gets called. Note that it (a) does things differently for 8-bit and Unicode string buffers, and (b) only hashes up to a certain number of characters (currently 96). Search for '/* String hashing:' to see the details of the hashing algorithm(s).
Jim Dovey
2009-05-13 18:43:22
Good tip Jim, I wasn't aware the source was available online.
Ian1971
2009-05-14 07:59:17