Another solution that could be even better depending on your use-case is interned strings. This is how symbols work e.g. in Lisp.
An interned string is a string object whose value is the address of the actual string bytes. So you create an interned string object by checking in a global table: if the string is in there, you initialize the interned string to the address of that string. If not, you insert it, and then initialize your interned string.
This means that two interned strings built from the same string will have the same value, which is an address. So if N is the number of interned strings in your system, the characteristics are:
- Slow construction (needs lookup and possibly memory allocation)
- Requires global data and synchronization in the case of concurrent threads
- Compare is O(1), because you're comparing addresses, not actual string bytes (this means sorting works well, but it won't be an alphabetic sort).
Cheers,
Carl