Hi -

I need to hash some strings so I can pass them into some libraries. This is straightforward using the String.hashCode call.

However, once everything is processed I'd like to convert the integer generated by hashCode back into the String value. I could obviously track the string and hash code values somewhere else and do the conversion there, but I'm wondering whether there is anything in Java that will do this automatically.

+3  A: 

That is not possible in general. hashCode is what one would call a one-way function.

Besides, there are more strings than integers, so there is a one-to-many mapping from integers to strings. The strings "0-42L" and "0-43-" for instance, have the same hash-code. (Demonstration on ideone.com.)
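For anyone who wants to check the collision locally rather than on ideone, here is a minimal sketch. The class name `HashCollisionDemo` and the helper `collide` are made up for illustration; only `String.hashCode` and `String.equals` come from the JDK.

```java
class HashCollisionDemo {
    // Hypothetical helper: true when two different strings share a hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = "0-42L";
        String b = "0-43-";
        System.out.println(a.equals(b));                  // prints false
        System.out.println(a.hashCode() == b.hashCode()); // prints true
    }
}
```

The two strings differ only in their last two characters, and 50*31 + 76 equals 51*31 + 45, which is why the hash codes agree.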

What you could do, however, (as a workaround) is store the strings you pass into the API and remember their hash codes like this:

import java.util.*;

public class Main {
    public static void main(String[] args) {

        // Keep track of the corresponding strings
        Map<Integer, String> hashedStrings = new HashMap<Integer, String>();

        String str1 = "hello";
        String str2 = "world";

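        // Compute each hash code and remember which string gave rise to it.
        // Note: a string with a colliding hash code would overwrite the earlier entry.
        int hc = str1.hashCode();
        hashedStrings.put(hc, str1);
        hashedStrings.put(str2.hashCode(), str2);

        apiMethod(hc);

        // Get back the string that corresponded to the hc hash code.
        String str = hashedStrings.get(hc);
        System.out.println(str);
    }

    // Placeholder standing in for the library call that takes the hash code.
    private static void apiMethod(int hashCode) {
    }
}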
aioobe
An addition to this approach could be to use BidiMap (http://commons.apache.org/collections/api-3.1/org/apache/commons/collections/BidiMap.html) if you need to do reverse lookups. Just beware of hash collisions. :)
posdef
It would be best to use a Map<Integer, List<String>>. That way you can map each hash to all of its corresponding strings, as there can be more than one.
Carra
That's a good point. But the asker would still have to guess which of the strings in the `List<String>` corresponded to a given hash code.
aioobe
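A rough sketch of the Map<Integer, List<String>> idea from the comments above, in case it helps. The class `HashRegistry` and its methods `register` and `candidates` are hypothetical names, not part of any library. As noted, a lookup can return several candidates and nothing in the hash code says which one was meant.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HashRegistry {
    // Each hash code maps to every registered string that produced it.
    private final Map<Integer, List<String>> byHash = new HashMap<>();

    // Remember the string and return the hash code to hand to the library.
    int register(String s) {
        List<String> bucket = byHash.computeIfAbsent(s.hashCode(), k -> new ArrayList<>());
        if (!bucket.contains(s)) {
            bucket.add(s);
        }
        return s.hashCode();
    }

    // All registered strings with this hash code; more than one means a collision.
    List<String> candidates(int hash) {
        List<String> bucket = byHash.get(hash);
        return bucket == null ? Collections.<String>emptyList() : bucket;
    }
}
```

Registering both "0-42L" and "0-43-" leaves two candidates under one hash code, which is exactly the ambiguity described in the last comment.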
+1  A: 

It is not possible to convert the hashCode() output back to the original form. It's a one-way process.

You can use a Base64 encoding scheme instead: encode the data, use it wherever you want, and then decode it back to the original form.
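A minimal sketch of that round trip, assuming Java 8 or later (for `java.util.Base64`) and assuming the library in question can accept a String rather than an int, since the encoded form is text. The class name `ReversibleEncoding` is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class ReversibleEncoding {
    // Encode the string's UTF-8 bytes as Base64 text.
    static String encode(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    // Decode Base64 text back into the original string.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("hello");
        System.out.println(encoded);         // prints aGVsbG8=
        System.out.println(decode(encoded)); // prints hello
    }
}
```

Unlike a hash, the encoded value grows with the input and is not a fixed-size integer, so this only helps if the receiving API is happy with strings.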

zengr
+10  A: 

I think you misunderstand the concept of a hash. A hash is a one way function. Worse, two strings might generate the same hash.

So no, it's not possible.

Carra
Besides, that's kind of the point of a hash in the first place.
Joeri Hendrickx
+1  A: 

hashCode() is not generally going to be a bijection, because it's not generally going to be an injective map.

hashCode() has ints as its range. There are only 2^32 distinct int values, so for any class that can have more than 2^32 different objects (e.g., think about Long), you are guaranteed (by the pigeonhole principle) that at least two distinct objects will have the same hash code.
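To make the Long case concrete, here is a small sketch. The class `LongHashDemo` and the helper `sameHash` are hypothetical names. It relies on the documented behaviour of `Long.hashCode()`, which XORs the upper and lower 32 bits of the value.

```java
class LongHashDemo {
    // Hypothetical helper: true when two different long values share a hash code.
    static boolean sameHash(long a, long b) {
        return a != b && Long.valueOf(a).hashCode() == Long.valueOf(b).hashCode();
    }

    public static void main(String[] args) {
        System.out.println(Long.valueOf(0L).hashCode());  // prints 0
        System.out.println(Long.valueOf(-1L).hashCode()); // prints 0
        System.out.println(sameHash(0L, -1L));            // prints true
    }
}
```

For -1L both 32-bit halves are all ones, so the XOR cancels to zero, the same hash code as 0L.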

The only guarantee that hashCode() gives you is that if a.equals(b), then a.hashCode() == b.hashCode(). Even an implementation in which every object has the same hash code is consistent with this.

You can use hashCode() to uniquely identify objects in some very limited circumstances: you must have a particular class for which there are no more than 2^32 possible different instances (i.e., there are at most 2^32 objects of your class which pairwise are such that !a.equals(b)). In that case, so long as you ensure that whenever !a.equals(b) and both a and b are objects of your class, a.hashCode() != b.hashCode(), you will have a bijection between (equivalence classes of) objects and hash codes. (It could be done like this for the Integer class, for example.)
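The Integer case can be sketched as follows. The class `IntegerHashDemo` and the helper `fromHash` are made-up names. The sketch relies on `Integer.hashCode()` being documented to return the wrapped int value itself, so the inverse is trivial.

```java
class IntegerHashDemo {
    // Hypothetical inverse of Integer.hashCode(): the hash code is the value itself.
    static Integer fromHash(int hash) {
        return Integer.valueOf(hash);
    }

    public static void main(String[] args) {
        Integer original = 42;
        int hash = original.hashCode();
        System.out.println(hash);                            // prints 42
        System.out.println(fromHash(hash).equals(original)); // prints true
    }
}
```

This works only because Integer has exactly 2^32 distinct values and its hash codes never collide; nothing similar is available for String.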

However, unless you're in this very special case, you should create a unique id some other way.
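One possible way to create such an id, sketched under the assumption that the library only needs some int per string and does not care that it is the hash code. The class `StringIds` and its methods `idFor` and `stringFor` are hypothetical. Ids are handed out sequentially, so two different strings never share one and the reverse lookup is exact.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StringIds {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> strings = new ArrayList<>();

    // Return the id already assigned to s, or assign the next free one.
    int idFor(String s) {
        Integer existing = ids.get(s);
        if (existing != null) {
            return existing;
        }
        int id = strings.size();
        ids.put(s, id);
        strings.add(s);
        return id;
    }

    // Recover the string for an id previously returned by idFor.
    String stringFor(int id) {
        return strings.get(id);
    }
}
```

With this, "0-42L" and "0-43-" get different ids even though their hash codes are equal.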

uckelman