Something happened that I'm not sure should be possible. Obviously it is, because I've seen it, but I need to find the root cause & I was hoping you all could help.

We have a system that looks up the latitude & longitude for a zip code. Rather than hit the lookup every time, we cache the results in a cheap in-memory hash-table cache, since the lat & long of a zip code tend to change less often than we release.

Anyway, the hash is wrapped in a class with a "get" and an "add" method, both synchronized. We access this class as a singleton.

I'm not claiming this is the best setup, but it's where we're at. (I plan to change to wrap the Map in a Collections.synchronizedMap() call ASAP.)

We use this cache in a multi-threaded environment, where we thread 2 calls for 2 zips (so we can calculate the distance between the two). These sometimes happen at very nearly the same time, so it's very possible that both calls access the map at the same time.

Just recently we had an incident where two different zip codes returned the same value. Assuming that the initial values were actually different, is there any way that writing the values into the Map would cause the same value to be written for two different keys? Or, is there any way that 2 "gets" could cross wires and accidentally return the same value?

The only other explanation I have is that the initial data was corrupt (wrong values), but it seems very unlikely.

Any ideas would be appreciated. Thanks, Peter

(PS: Let me know if you need more info, code, etc.)

public class InMemoryGeocodingCache implements GeocodingCache
{
    private Map cache = new HashMap();
    private static GeocodingCache instance = new InMemoryGeocodingCache();

    public static GeocodingCache getInstance()
    {
        return instance;
    }

    public synchronized LatLongPair get(String zip)
    {
        return (LatLongPair) cache.get(zip);
    }

    public synchronized boolean has(String zip)
    {
        return cache.containsKey(zip);
    }

    public synchronized void add(String zip, double lat, double lon)
    {
        cache.put(zip, new LatLongPair(lat, lon));
    }
}


public class LatLongPair {
    double lat;
    double lon;

    LatLongPair(double lat, double lon)
    {
        this.lat = lat;
        this.lon = lon;
    }

    public double getLatitude()
    {
        return this.lat;
    }

    public double getLongitude()
    {
        return this.lon;
    }
}
A: 

what are you using for your map keys?

james
+5  A: 

Why it's happening is hard to tell. More code could help.

You should probably just be using a ConcurrentHashMap anyway. This will, in general, be more efficient than a synchronized Map. You don't synchronize access to it; it handles that internally (more efficiently than you could).
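A minimal sketch of that approach (this is not the poster's class; the nested LatLongPair here is a stand-in, and putIfAbsent makes the check-and-insert a single atomic step):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentGeocodingCache {
    // Minimal immutable value type standing in for the poster's LatLongPair.
    public static final class LatLongPair {
        final double lat;
        final double lon;
        LatLongPair(double lat, double lon) { this.lat = lat; this.lon = lon; }
    }

    private final ConcurrentMap<String, LatLongPair> cache =
            new ConcurrentHashMap<String, LatLongPair>();

    public LatLongPair get(String zip) {
        return cache.get(zip);  // no external locking needed
    }

    public void add(String zip, double lat, double lon) {
        // Check and insert as one atomic operation; the first writer wins.
        cache.putIfAbsent(zip, new LatLongPair(lat, lon));
    }
}
```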

David M. Karr
+4  A: 

One thing to look out for is if the key or the value might be changing, for instance if instead of making a new object for each insertion, you're just changing the values of an existing object and re-inserting it.

You also want to make sure that the key class defines both hashCode and equals in a way that doesn't violate the HashMap contract (i.e., if equals returns true, the hashCodes must be equal, though not necessarily vice versa).
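A small demonstration of what breaking that contract looks like (BrokenZipKey is a hypothetical class, not anything from the poster's code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class: equals is overridden but hashCode is not, so two
// "equal" keys keep their distinct identity hash codes and usually land in
// different buckets.
class BrokenZipKey {
    private final String zip;
    BrokenZipKey(String zip) { this.zip = zip; }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenZipKey && ((BrokenZipKey) o).zip.equals(zip);
    }
    // Missing: @Override public int hashCode() { return zip.hashCode(); }
}

public class ContractDemo {
    public static void main(String[] args) {
        Map<BrokenZipKey, String> m = new HashMap<BrokenZipKey, String>();
        m.put(new BrokenZipKey("10001"), "value");
        // Lookup with an equal-but-distinct key almost always misses,
        // because the two keys hash to different buckets:
        System.out.println(m.get(new BrokenZipKey("10001")));
    }
}
```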

Paul Tomblin
He's using String as the key - there is no need to worry about hash codes
oxbow_lakes
This wasn't originally obvious, as I added the code after the comment was made.
Risser
A: 

Although not the most efficient solution, it does look thread-safe. So it appears that you may be getting a hash collision with your strings. You can easily test this by printing out the hash codes of the two offending zip-code strings and seeing whether they are the same.

Edit: Oh, if they are, you will have to use something that guarantees unique hash values. In your case you could do this by wrapping the zip-code string, inside your singleton, in a simple object whose hashCode is the integer value of the zip code.
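That check can be sketched in a few lines (the two zip codes below are placeholders; substitute the pair from the incident). Colliding strings do exist in Java: "Aa" and "BB" famously share a hash code.

```java
public class HashCheck {
    public static void main(String[] args) {
        // Placeholders: substitute the two zip codes from the incident.
        String zipA = "12345";
        String zipB = "67890";
        System.out.println(zipA + " -> " + zipA.hashCode());
        System.out.println(zipB + " -> " + zipB.hashCode());
        System.out.println("collide? " + (zipA.hashCode() == zipB.hashCode()));

        // String hash collisions are real; these two hash identically:
        System.out.println("\"Aa\" vs \"BB\" collide? "
                + ("Aa".hashCode() == "BB".hashCode())); // prints true
    }
}
```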

Robin
I would consider that a possibility, except that running locally, the cache builds and hashes the same 2 zip codes properly. It seems to be a one-time unreproducible (ARGH!) event.
Risser
A hash collision would not cause the problem even if the two zip codes actually had the same hash: the hash structure keeps a linked list at each "bucket" for exactly this case.
oxbow_lakes
+3  A: 

Is it possible the LatLongPair is being modified? I'd suggest making the lat and lon fields final so that they can't be accidentally modified elsewhere in the code.

Note: you should also make your singleton "instance" and the map reference "cache" final.

james
+8  A: 

The code looks correct.

The only concern is that lat and lon are package-visible, so the following is possible from code in the same package:

LatLongPair llp = InMemoryGeocodingCache.getInstance().get(ZIP1);
llp.lat = x;
llp.lon = y;

which will obviously modify the in-cache object.

So make lat and lon final too.
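An immutable version of the pair class could look like this (a sketch, not the poster's code; the fields stay package-visible, but final alone is enough to make the mutation above a compile error):

```java
public class LatLongPair {
    final double lat;   // final: any "llp.lat = x" outside the
    final double lon;   // constructor now fails to compile

    LatLongPair(double lat, double lon) {
        this.lat = lat;
        this.lon = lon;
    }

    public double getLatitude()  { return lat; }
    public double getLongitude() { return lon; }
}
```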

P.S. Since your key (the zip code) is unique and small, there is no need to compute a hash on every operation. You could just as easily use a TreeMap (wrapped in Collections.synchronizedMap()).

P.P.S. Practical approach: write a test with two threads doing put/get operations in a never-ending loop, validating the result on every get. You would need a multi-CPU machine for that, though.
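A minimal sketch of such a stress test (assumptions: a synchronizedMap-wrapped cache standing in for the real one, and values constructed so each key's value is self-consistent, letting every get be validated):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CacheStressTest {
    // Stand-in for the real cache: a synchronized map from zip to {lat, lon}.
    static final Map<String, double[]> cache =
            Collections.synchronizedMap(new HashMap<String, double[]>());

    public static void main(String[] args) throws InterruptedException {
        final int THREADS = 4;
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        final CountDownLatch done = new CountDownLatch(THREADS);
        final ConcurrentLinkedQueue<String> errors =
                new ConcurrentLinkedQueue<String>();

        for (int t = 0; t < THREADS; t++) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        for (int i = 0; i < 100000; i++) {
                            int n = i % 100;
                            String zip = String.valueOf(10000 + n);
                            // Every writer stores {n, -n} under zip "100nn",
                            // so any get must see a value matching its key.
                            cache.put(zip, new double[] { n, -n });
                            double[] v = cache.get(zip);
                            if (v != null && v[0] != n) {
                                errors.add(zip); // value didn't match its key
                            }
                        }
                    } finally {
                        done.countDown();
                    }
                }
            });
        }
        done.await();
        pool.shutdown();
        System.out.println(errors.isEmpty()
                ? "no corruption observed" : "corruption: " + errors);
    }
}
```

A clean run doesn't prove correctness, but a single mismatch proves a real race.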

Vladimir Dyuzhev
Just make them private!
Egwor
private final :)))
Vladimir Dyuzhev
why make them private? The values are intended to be immutable. Writing a get() method just to make the member variable private is a matter of preference, I suppose - but perhaps it's OK to consider that getters aren't always the right thing?
Kevin Day
You're right, Kevin, private is not a dogma, and is not required in this case if the fields are final. It's just a common practice.
Vladimir Dyuzhev
I agree with the opinion that those fields should be final.
laz
+2  A: 

James is correct. Since you are handing back an object, its internals can be modified, and anything holding a reference to that object (including the Map) will reflect the change. Making the fields final is a good answer.

Javamann
A: 

I don't really see anything in the code you posted that would cause the problem you described. My guess is that the problem lies in the client of your geocoding cache.

Other things to consider (some of these are pretty obvious, but I figured I'd point them out anyway):

  1. Which two zip codes were you having problems with? Are you sure they don't have identical geocodes in the source system?
  2. Are you sure you aren't accidentally comparing two identical zip codes?
Jack Leow
A: 

The presence of the has(String zip) method implies that you have something like the following in your code:

GeocodingCache cache = InMemoryGeocodingCache.getInstance();

if (!cache.has(ZIP)) {
    cache.add(ZIP, x, y);
}

Unfortunately this opens you up to a race between has() returning false and add() inserting, which could result in the kind of issue you described.

A better solution would be to move the check inside the add method, so that the check and the update are covered by the same lock:

public synchronized void add(String zip, double lat, double lon) {
    if (cache.containsKey(zip)) return;
    cache.put(zip, new LatLongPair(lat, lon));
}

The other thing I should mention: if you are using getInstance() as a singleton accessor, you should give the class a private constructor to prevent additional caches from being created via new InMemoryGeocodingCache().
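Both points together look something like this sketch (hypothetical class name, and double[] standing in for LatLongPair so the example is self-contained):

```java
import java.util.HashMap;
import java.util.Map;

public class SingletonGeocodingCache {
    private static final SingletonGeocodingCache INSTANCE =
            new SingletonGeocodingCache();

    private final Map<String, double[]> cache = new HashMap<String, double[]>();

    // Private constructor: "new SingletonGeocodingCache()" no longer compiles
    // outside this class, so getInstance() is the only way to obtain one.
    private SingletonGeocodingCache() { }

    public static SingletonGeocodingCache getInstance() { return INSTANCE; }

    // Check and insert under one lock: a concurrent has()/add() pair
    // can no longer interleave between the check and the put.
    public synchronized void add(String zip, double lat, double lon) {
        if (cache.containsKey(zip)) return;
        cache.put(zip, new double[] { lat, lon });
    }

    public synchronized double[] get(String zip) { return cache.get(zip); }
}
```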

Michael Rutherfurd