tags:

views:

139

answers:

7

We have to lookup some data based on three input data fields. The lookup has to be fast. There are only about 20 possible lookup combinations. We've implemented this using a static HashMap instance where we create a key by concatinating the three data fields. Is there a better way to do this or is this the way to go? Code is below.

Update: I'm not implying that this code is slow. Just curious if there is a better way to do this. I thought there might be a more elegant solution but I'm happy to keep this in place if there are no compelling alternatives!


Create class level static HashMap instance:

private static HashMap map = new HashMap();

How we load data into memory:

private void load(Iterator iterator) {        
    while (iterator.next()) {  
      Object o = it.next();
      key = o.getField1() + "-" + o.getField2() + "-" o.getField3();
      map.put(key, o.getData());
    }
}

And how we look up the data based on the three fields:

private Stirng getData(String f1, String f2, String f3) {
   String key = f1 + "-" + f2 + "-" f3;
   return map.get(key);
}
+6  A: 

Well, the question to ask yourself is of course "is it fast enough?" Because unless your application needs to be speedier and this is the bottleneck, it really doesn't matter. What you've got is already reasonably efficient.

That being said, if you want to squeeze every bit of speed possible out of this routine (without rewriting it in assembly language ;-) you might consider using an array instead of a HashMap, since there are only a small, limited number of keys. You'd have to develop some sort of hash function that hashes each object to a unique number between 0 and 19 (or however many elements you actually have). You may also be able to optimize the implementation of that hash function, although I couldn't tell you how exactly to do that without knowing the details of the objects you're working with.

David Zaslavsky
+2  A: 

You could create a special key object having three String fields to avoid building up the key string:

class MapKey {
  public final String k1;
  public final String k2;
  public final String k3;

  public MapKey(String k1, String k2, String k3) {
    this.k1 = k1; this.k2 = k2; this.k3 = k3;
  }

  public MapKey(Object o) {
    this.k1 = o.getField1(); this.k2 = o.getField2(); this.k3 = o.getField3();
  }

  public int hashCode() {
    return k1.hashCode();  // if k1 is likely to be the same, also add hashes from k2 and k3
  }
}
Zed
You would also need an equals() method, yes? Also, in our case, k1 is likely to be the same, what is the best way to "add hashes from k2 and k3" as you mention?
Marcus
A: 

Another way to get this done is to create an Object to handle your key, with which you can override equals() (and hashCode()) to do a test against an incomming key, testing field1, field2 and field3 in turn.

EDIT (in response to comment):

As the value returned from hashCode() is used by your Map to put your keys into buckets, (from which it then will test equals), the value could theoretically be the same for all keys. I wouldn't suggest doing that, however, as you would not reap the benefits of HashMaps performance. You would essentially be iterating over all of your items in a bucket and testing equals().

One approach you could take would be to delegate the call to hashCode() to one of the values in your key container. You could always return the hashCode from field3, for example. In this case, you will distribute your keys to potentially as many buckets as there are distinct values for field3. Once your HashMap finds the bucket, it will still need to iterate over the items in the bucket to test the result of equals() until it finds a match.

You could create would be the sum of the values returned by hashCode() on all of your fields. As just discussed, this value does not need to be unique. Further, the potential for collision, and therefore larger buckets, is much smaller. With that in mind, your lookups on the HashMap should be quicker.

EDIT2:

the question of a good hash code for this key has been answered in a separate question here

akf
What would you use for the value of hashCode() in this case? (Assuming you wouldn't use the "-" concatenator).
Marcus
This is definitely overengineering in my opinion. Implementing equals() and hashCode() correctly is tricky and should be avoided whenever possible.
Adriaan Koster
+1  A: 

In your case I would keep using the implementation you outlined. For a large list of constant keys mapping to constant data, you could use Minimal Perfect Hashing. As it is not trivial to code this, and I am not sure about existing libraries, you have to consider the implementation cost before using this.

Yuval F
+1  A: 

I think your approach is pretty fast. Any gains by implementing your own hashing algorithm would be very small, especially compared to the effort required.

One remark about your key format. You better make sure that your separator cannot occur in the field toString() values, otherwise you might get key collisions:

field1="a-", field2="b-", field3="c" -> key="a--b--c"
field1="a", field2="-b", field3="-c" -> key="a--b--c"
Adriaan Koster
Good point about separator. It won't happen in our case. But I might refactor to clean it up.
Marcus
+1  A: 

Concatenating strings is a bad idea for creating a key. My main object is that it is unclear. But in practice a significant proportion of implementations have bugs, notably that the separator can actually occur in the strings. In terms of performance, I have seen a program speed up ten percent simply by changing the key for a string hack to a meaningful key object. (If you really must be lazy about code, you can use Arrays.asList to make the key - see List.equals API doc.)

Tom Hawtin - tackline
Assuming you use an object and store that as a key, what would use as the hashCode() value? I'm assuming the use of List.equals you mention would be implemented in the equals() method.
Marcus
The contract of `Object` requires that `hashCode` be changed is `equals` is changed from the default. The `List` API docs define the result of `List.hashCode`.
Tom Hawtin - tackline
Implementing a key Object with correct equals() and hashCode() mehods is much more error-prone than simple String concatenation. I also don't agree that String concatenation is unclear in this use case and cannot believe a 10% speedup by replacing it with a key Object. Do you have any evidence to back up that claim (I am interested if you do)?
Adriaan Koster
Implementing `equals` and `hashCode` is entirely routine. Any junior Java programmer should write the code straight out. I did not lie about the 10% speed increase. Why would you think it not true?
Tom Hawtin - tackline
A: 

Since you only have 20 combinations it might be feasible to handcraft a "give me the index 1..20 of this combination" based on knowing the characteristics of each combination.

Are you in a position to list the exact list of combinations?

Thorbjørn Ravn Andersen