I often need to take a list of objects and group them into a Map based on a value contained in each object, e.g. take a list of Users and group them by Country.

My code for this usually looks like:

Map<String, List<User>> usersByCountry = new HashMap<String, List<User>>();
for(User user : listOfUsers) {
    if(usersByCountry.containsKey(user.getCountry())) {
        //Add to existing list
        usersByCountry.get(user.getCountry()).add(user);

    } else {
        //Create new list
        List<User> users = new ArrayList<User>(1);
        users.add(user);
        usersByCountry.put(user.getCountry(), users);
    }
}

However, I can't help thinking that this is awkward and that some guru has a better approach. The closest I can see so far is the Multimap from Google Collections.

Are there any standard approaches?

Thanks!

+6  A: 

None comes to mind so far. I'd just optimize it as follows:

Map<String, List<User>> usersByCountry = new HashMap<String, List<User>>();
for(User user : listOfUsers) {
    List<User> users = usersByCountry.get(user.getCountry());
    if (users == null) {
        users = new ArrayList<User>();
        usersByCountry.put(user.getCountry(), users);
    }
    users.add(user);
}

Commons Collections has a LazyMap, but it's not parameterized. Google Collections doesn't have anything like a LazyMap or LazyList, but I recall that you could probably use a Function for this. Update: polygenelubricants has given a good example of that.

BalusC
It's senseless to use `MapMaker` just to simulate what `Multimap` and `Multimaps.index()` in the very same library do. I know BalusC must have just not known about these, but it's depressing that this became the top-ranked and accepted answer anyway.
Kevin Bourrillion
@Kevin: The OP was apparently actually after the optimization. I'll remove the `MapMaker` suggestion. Thanks for the wakeup, I didn't see the other answers until now :)
BalusC
+1  A: 

When I have to deal with a collection-valued map, I just about always wind up writing a little putIntoListMap() static utility method in the class. If I find myself needing it in multiple classes, I throw that method into a utility class. Static method calls like that are a bit ugly, but they're much cleaner than typing the code out every time. Unless multi-maps play a pretty central role in your app, IMHO it's probably not worth it to pull in another dependency.
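A minimal sketch of what such a helper might look like, using only java.util. The class name ListMaps and the exact signature are assumptions made for illustration; only the method name putIntoListMap comes from the description above. It uses the single-lookup form from BalusC's answer:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical utility class; the name is illustrative only.
final class ListMaps {
    private ListMaps() {}

    // Appends value to the list stored under key, creating the list on first use.
    static <K, V> void putIntoListMap(Map<K, List<V>> map, K key, V value) {
        List<V> values = map.get(key); // one lookup instead of containsKey + get
        if (values == null) {
            values = new ArrayList<V>();
            map.put(key, values);
        }
        values.add(value);
    }

    public static void main(String[] args) {
        // Strings stand in for User objects to keep the sketch self-contained.
        Map<String, List<String>> usersByCountry = new HashMap<String, List<String>>();
        putIntoListMap(usersByCountry, "IE", "Damo");
        putIntoListMap(usersByCountry, "IE", "Luke");
        putIntoListMap(usersByCountry, "FR", "Carl");
        System.out.println(usersByCountry);
    }
}
```

With a helper like this, the loop body in the question shrinks to a single call such as putIntoListMap(usersByCountry, user.getCountry(), user).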

Luke Maurer
Also, BalusC's optimization is a good one to know.
Luke Maurer
+1  A: 

It looks like your exact needs are met by LinkedHashMultimap in the GC library. If you can live with the dependencies, all your code becomes:

SetMultimap<String,User> countryToUserMap = LinkedHashMultimap.create();
// .. other stuff, then whenever you need it:
countryToUserMap.put(user.getCountry(), user);

Insertion order is maintained (about all it looks like you were doing with your list) and duplicates are precluded; you can of course switch to a plain hash-based set or a tree set as needs dictate (or a list, though that doesn't seem to be what you need). Empty collections are returned if you ask for a country with no users, everyone gets ponies, etc. What I mean is, check out the API. It'll do a lot for you, so the dependency might be worth it.

Carl
+1 Thanks, that's good to know, but BalusC's optimization was what I was after.
Damo
A: 

A clean and readable way to add an element is the following:

String country = user.getCountry();
Set<User> users;
if (usersByCountry.containsKey(country))
{
    // Reuse the existing set for this country
    users = usersByCountry.get(country);
}
else
{
    // First user for this country: create and register a new set
    users = new HashSet<User>();
    usersByCountry.put(country, users);
}
users.add(user);

Note that calling containsKey and get is not slower than just calling get and testing the result for null.

starblue
The call on its own is indeed not slower, but the lookup now occurs twice instead of once.
BalusC
I've clarified it.
starblue
+3  A: 

Guava's Multimap really is the most appropriate data structure for this, and in fact there is a Multimaps.index(Iterable<V>, Function<? super V,K>) utility method that does exactly what you want: it takes an Iterable<V> (which a List<V> is) and applies the Function<? super V, K> to each element to get the keys for the Multimap<K,V>.

Here's an example from the documentation:

For example,

  List<String> badGuys
      = Arrays.asList("Inky", "Blinky", "Pinky", "Pinky", "Clyde");
  Function<String, Integer> stringLengthFunction = ...;
  Multimap<Integer, String> index
      = Multimaps.index(badGuys, stringLengthFunction);
  System.out.println(index);

prints

 {4=[Inky], 5=[Pinky, Pinky, Clyde], 6=[Blinky]}

In your case you'd write a Function<User,String> userCountryFunction = ....

polygenelubricants
+1 It frustrates me that answers involving writing a lot more code than this are ranked higher, just because they were the fastest to come in. :(
Kevin Bourrillion
@Kevin: I was hoping you'd stop by eventually =) By the way, I plan to eventually write Q/A articles on stackoverflow on various Guava classes to demonstrate its capabilities.
polygenelubricants
I stop by only once or twice a day, thus guaranteeing that I never have a chance to get my answers upvoted. <sigh> I think your idea is a great one. I assume you mean posting a question and answering it yourself. You'll get a few people telling you there's something immoral about this, but it is explicitly sanctioned by the broader SO community, since their goal is for SO to have great content.
Kevin Bourrillion
+1 for the example. I'd be curious to see the articles. The Guava libraries deserve more attention.
BalusC
+1  A: 

Using lambdaj, you can obtain that result with just one line of code, as follows:

Group<User> usersByCountry = group(listOfUsers, by(on(User.class).getCountry()));

Lambdaj also offers lots of other features for manipulating collections with a very readable domain-specific language.

Mario Fusco
+1 that is nice. Looks very usable.
Damo