views:

502

answers:

4

I've been reading Java Concurrency in Practice lately – great book. If you think you know how concurrency works, but then most of the time you face the real issues, it feels like SWAG is the most you can do, then this book will certainly shed some light on the topic. It's sort of scary how many things can actually go wrong when you try to share data between threads. I guess that made me probably a bit crazy about thread-safety. Now my concern is that, with a bit too much synchronization, I may run into some liveness issues. Here's a piece of code to illustrate:

   private final Hashtable<String, AtomicInteger> userSessions =
new Hashtable<String, AtomicInteger>();

   public void registerUser(String userLogin) {
       synchronized(userSessions) {
           AtomicInteger sessionCount = userSessions.get(userLogin);
           if (sessionCount != null) {
               sessionCount.incrementAndGet();
           } else {
               userSessions.put(userLogin, new AtomicInteger(1));
           }
       }
   }

   public void unregisterUser(String userLogin) {
       synchronized(userSessions) {
           AtomicInteger sessionCount = userSessions.get(userLogin);
           if (sessionCount != null) {
               sessionCount.decrementAndGet();
           }
       }
   }

   public boolean isUserRegistered(String userLogin) {
       synchronized(userSessions) {
           AtomicInteger sessionCount = userSessions.get(userLogin);
           if (sessionCount == null) {
               return false;
           }
           return sessionCount.intValue() > 0;
       }
   }

I tried getting it all right: synchronized collection constructed in static section and stored in a static final reference for safe publication, locking on the collection (instead of this - so that I don't block the whole class the code lives in) and using atomic wrapper classes for primitives. The book mentions overdoing this might also cause problems, but it seems I need some more time to fully wrap my head around it. How would you make this code thread-safe and make sure it doesn't suffer from liveness and also performance issues?

EDIT: Turned it into instance methods and variables, originally everything was declared as static - bad, bad design. Also made userSessions private (somehow I left it public before).

+1  A: 

Seen from your code, your synchronisation on _userSessions should suffice because you do not expose the AtomicInteger objects.

The added safety offered by AtomicInteger is not needed in this case, so in essence you use it here as a mutable Integer. You could place a nested static class containing the session count as only attribute in the map if you are worried about the extra overhead in AtomicInteger (or a bit uglier: add int[1]'s to the map as long as they aren't exposed outside of this class.)

rsp
@rsp: I did expect AtomicInteger to be nonessential here, but, just as you said, I thought it would make a convenient mutable Integer. I guess it might convey a wrong message though here - like the real reason of its usage. Do you think it would be then better (not even considering possible overhead) to replace it with a static nested class of my own?
lukem00
Because the nested class can be trivially small I would probably go that way, and as Hardcoded mentioned I'ld use HashMap in stead of Hashtable. (Code like this is often more elegant and performant just doing it the simple way.)
rsp
+8  A: 

Use a ConcurrentHashMap so that you can use putIfAbsent. You don't need to AtomicInteger code to be synchronised.

   public final ConcurrentMap<String, AtomicInteger> userSessions =
       new ConcurrentHashMap<String, AtomicInteger>();

   public void registerUser(String userLogin) {
       AtomicInteger newCount = new AtomicInteger(1);
       AtomicInteger oldCount = userSessions.putIfAbsent(userLogin, newCount);
       if (oldCount != null) {
           oldCount.incrementAndGet();
       }
   }

   public void unregisterUser(String userLogin) {
       AtomicInteger sessionCount = userSessions.get(userLogin);
       if (sessionCount != null) {
           sessionCount.decrementAndGet();
       }
   }

   public boolean isUserRegistered(String userLogin) {
       AtomicInteger sessionCount = userSessions.get(userLogin);
       return sessionCount != null && sessionCount.intValue() > 0;
   }

Note, this leaks...

Attempt at a non-leaky version:

   public final ConcurrentMap<String, Integer> userSessions =
       new ConcurrentHashMap<String, Integer>();

   public void registerUser(String userLogin) {
       for (;;) {
           Integer old = userSessions.get(userLogin);
           if (userSessions.replace(userLogin, old, old==null ? 1 : (old+1)) {
                break;
           }
       }
   }
   public void unregisterUser(String userLogin) {
       for (;;) {
           Integer old = userSessions.get(userLogin);
           if (old == null) {
               // Wasn't registered - nothing to do.
               break;
           } else if (old == 1) {
               // Last one - attempt removal.
               if (userSessions.remove(userLogin, old)) {
                   break;
               }
           } else {
               // Many - attempt decrement.
               if (userSessions.replace(userLogin, old, old-1) {
                   break;
               } 
           }
       }
   }
   public boolean isUserRegistered(String userLogin) {serLogin);
       return userSessions.containsKey(userLogin);
   }
Tom Hawtin - tackline
It might be me, but I don't like creating objects like `newCount` above when it might not be used in the majority of situations.
rsp
rsp: It really does not matter. Get over it! You can write code that attempts not to do that, but it might not be any faster (although it will be more bloated).
Tom Hawtin - tackline
Would you feel better if we inlined the newCount variable?
Adriaan Koster
@Adriaan, no I would not feel better (this was a trick question, right?) @Tom the code pattern the OP showed doesn't look bloated to me. With respect to clean code; there's nothing to get over.
rsp
The original code uses locks, so it's performance under contention wont be great anyway.
Tom Hawtin - tackline
@Tom Hawtin - tacklin: In what way is it better to run the instructions in a loop until it works compared to using locking mechanism? Does it have to do with the fact it's not very likely for another thread to change things up in the middle of a method? Does it perform better? Is it really worth trading readability? Do you think you could add some more explanation to your answer? I'm sure those less familiar with concurrency could use your comments.
lukem00
@lukem00: looping in this manner is a standard methodology for lock-free algorithms
Jason S
lukem00: In my first example `registerUser` has two independent atomic operations. First attempt to insert an entry into a map. If there was no mapping for that key beforehand, then we are done. If there was a mapping, then we haven't changed the map and all that is left to be done is update the counter atomically. There is a leak, because when `unregisterUser` drops the counter to zero, the map entry still exists.
Tom Hawtin - tackline
lukem00: Acquiring a lock does in fact require a loop. The lock-free algorithm only needs to loop a maximum of once per other thread *that makes progress* (technically this could be as bad as O(n^2) for n threads). Lock-free algorithms /tend/ to perform better than locking algorithms (for quick operations). Whether it is worth it depends upon the situation.
Tom Hawtin - tackline
@Tom Hawtin - tacklin: I got to admit, when I saw your answer for the first time, it totally puzzled me. It was like one of those "I know that I know nothing" moments. So I took some time to analyse your snippets and do some background reading, yet I'm afraid I'm still far from enlightenment. I do understand I can use ConcurrentHashMap's putIfAbsent method and that AtomicInteger requires no extra synchronization, yet registerUser still needs to be atomic to work, right? If so, then your 1st version of it seems to be broken. Is that what you meant when you said it leaks?
lukem00
Oh, I see your point with the leak now. Well, to be perfectly honest, I guess I was just being lazy about that part, cause I thought it won't really matter that much with a very limited number of users I have. I totally shouldn't have left it like that! Shame on me. Thanks for pointing it out.
lukem00
Btw. thanks for extra info. Now that you wrote it, it does seem obvious your implementation of registerUser with independent atomic operations works just fine. I dunno why I thought something could go wrong with it. I guess I just don't feel comfortable about other threads doing anything with my data in the middle of an operation, even though they don't necessarily break anything.
lukem00
+1  A: 

Good book, I recently read it myself.

In the code above, the only note I have is that AtomicInteger isn't needed within the synchronized block, but I doubt the performance would be noticeable.

The best way to track performance is to test it. Set up an automated integrated load test around key areas of your framework and track performance. The load test, if it contains wide enough timing windows and a rich use of the work flow, may also catch any deadlocks you've created.

While deadlocks may seem easy to avoid, they can easily appear in a fairly simple workflow pattern.

Class A locks resources then calls B (may as simple as a get/set) which also locks resource. Another thread calls B which locks resources and then calls A causing a deadlock.

When working with a rich framework it is useful to map out workflow to see how classes interact. You may be able to spot this type of problem. However, with really large frameworks, they can slip by. The best defense I've found is to isolate locks to the smallest area possible and be very conscious of a call outside of a class while within a synchronized block. Create a significant number of load tests helps as well.

Jim Rush
+3  A: 

First of all: Don't use the Hashtable! It's old, and very slow.
Additional: Synchronisation on the lower level isn't needed if you synchronize on a higher level already (This is true for the AtomicInteger-thing, as well).

I see different approaches here, according to what use case is needed here.

The read/write-approach

Assuming you call the isUserRegistered method very often and the other methods only now and then, a good way is a read-write-lock: It is allowed to have multiple reads at the same time, but only one write-lock to rule them all (can only be gained if no other lock is accquired).

private static final Map<String, Integer> _userSessions =
  new HashMap<String, Integer>();

private ReadWriteLock rwLock =
  new ReentrantReadWriteLock(false); //true for fair locks

public static void registerUser(String userLogin) {
  Lock write = rwLock.writeLock();
  write.lock();
  try {
       Integer sessionCount = _userSessions.get(userLogin);
       if (sessionCount != null) {
           sessionCount = Integer.valueOf(sessionCount.inValue()+1);
       } else {
           sessionCount = Integer.valueOf(1)
       }
       _userSessions.put(userLogin, sessionCount);
   } finally {
     write.unlock();
   }
}

public static void unregisterUser(String userLogin) {
  Lock write = rwLock.writeLock();
  write.lock();
  try {
       Integer sessionCount = _userSessions.get(userLogin);
       if (sessionCount != null) {
           sessionCount = Integer.valueOf(sessionCount.inValue()-1);
       } else {
           sessionCount = Integer.valueOf(0)
       }
       _userSessions.put(userLogin, sessionCount);
   } finally {
     write.unlock();
   }
}

public static boolean isUserRegistered(String userLogin) {
  boolean result;

  Lock read = rwLock.readLock();
  read.lock();
  try {
       Integer sessionCount = _userSessions.get(userLogin);
       if (sessionCount != null) {
           result = sessionCount.intValue()>0
       } else {
           result = false;
       }
   } finally {
     read.unlock();
   }

   return false;
}

Pro: simple to understand
Con: Won't scale if the write methods are called frequently

The small atomic-operation approach

The Idea is to do small steps, that are all atomic. This will lead to a very good performance anyway, but there are many hidden traps here.

public final ConcurrentMap<String, AtomicInteger> userSessions =
   new ConcurrentHashMap<String, AtomicInteger>();
//There are other concurrent Maps for different use cases

public void registerUser(String userLogin) {
  AtomicInteger count;
  if (!userSession.containsKey(userLogin)){
    AtomicInteger newCount = new AtomicInteger(0);
    count = userSessions.putIfAbsent(userLogin, newCount);
    if (count == null){
      count=newCount;
    }
    //We need ifAbsent here, because another thread could have added it in the meantime
  } else {
    count = userSessions.get(userLogin);
  }
  count.incrementAndGet();
}

public void unregisterUser(String userLogin) {
  AtomicInteger sessionCount = userSessions.get(userLogin);
  if (sessionCount != null) {
    sessionCount.decrementAndGet();
  }
}

public boolean isUserRegistered(String userLogin) {
  AtomicInteger sessionCount = userSessions.get(userLogin);
  return sessionCount != null && sessionCount.intValue() > 0;
}

Pro: scales very well
Con: Not intuitive, is going to be complex quickly, not always possible, many hidden traps

The lock per user approach

This will create locks for different users, assuming that there are a lot of different users. You can create locks or monitors with some small atomic operations and lock on these instead of the full list.
It would be overkill for this small example, but for very complex structures it can be an elegant solution.

Hardcoded
The second `registedUser` has a bug.
Tom Hawtin - tackline
(And that r/w lock is probably inappropriate.)
Tom Hawtin - tackline
(And the map variable needs to be declared as `ConcurrentMap` for `putIfAbsent`.)
Tom Hawtin - tackline
You are right. Fixed it (hopefully). ;-)Thanks for the review.
Hardcoded
That's what I meant with hidden traps: a big sync-block is easy to maintain, even if it doesn't scale very well. Dealing with a bunch of atomic operations is always tricky.
Hardcoded