In Java is there an object like a "Set" that can contain only unique string values, but also contain a count on the number of occurrences of the string value?

The idea is simple

Given a data set like this:

A B B C C C

I'd like to add each line of text to a Set-like object. Each time a value that is already present gets added again, I'd like a count associated with that value to go up, so I can display how many times it was added. Run on the data set above, the output would be something like:

A : 1 B : 2 C : 3

Any ideas?

+1  A: 

Yes. There's nothing like that directly in the core library, but it can be built easily with a Map.

Here's a naive implementation:

import java.util.Map;
import java.util.HashMap;

public class SetLike {
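    // Maps each distinct string to the number of times it has been added.
    private final Map<String, Integer> map = new HashMap<String,Integer>();

    public void add( String s ) {
        // First occurrence: start the count at zero so the increment below yields 1.
        if( !map.containsKey( s ) ){
            map.put( s, 0 );
        }
        map.put( s, map.get( s ) + 1 );
    }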

    public void printValuesAndCounts() {
        System.out.println( map );
    }

    public static void main( String [] args ){
        String [] data = {"A","B","B","C","C","C"};

        SetLike holder = new SetLike();

        for( String value : data ) {
            holder.add( value );
        }

        holder.printValuesAndCounts();
    }
}

Test it:

$ javac SetLike.java 
$ java SetLike
{A=1, C=3, B=2}

Of course, you can improve it a lot. You could implement the Set, List, or Collection interface, add iterators, implement Iterable, and so on. It depends on what you want and what you need.
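As a rough sketch of the "implement Iterable" suggestion, here is a generic variant that can be used in a for-each loop. The class name `CountingSet` and its methods are made up for illustration, and the choice of `LinkedHashMap` (to keep first-insertion order) is an assumption rather than part of the code above.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class; the name and API are illustrative only.
class CountingSet<T> implements Iterable<Map.Entry<T, Integer>> {
    // LinkedHashMap keeps first-insertion order, so iteration is predictable.
    private final Map<T, Integer> counts = new LinkedHashMap<>();

    /** Records one more occurrence of the given value. */
    public void add(T value) {
        Integer old = counts.get(value);
        counts.put(value, old == null ? 1 : old + 1);
    }

    /** Returns how many times the value was added, or 0 if it never was. */
    public int count(T value) {
        Integer c = counts.get(value);
        return c == null ? 0 : c;
    }

    /** Returns the number of distinct values. */
    public int size() {
        return counts.size();
    }

    @Override
    public Iterator<Map.Entry<T, Integer>> iterator() {
        // Iterate over a read-only view so callers cannot change the counts.
        return Collections.unmodifiableMap(counts).entrySet().iterator();
    }

    public static void main(String[] args) {
        CountingSet<String> set = new CountingSet<>();
        for (String s : new String[] {"A", "B", "B", "C", "C", "C"}) {
            set.add(s);
        }
        // Prints "A : 1", "B : 2", "C : 3", one per line.
        for (Map.Entry<String, Integer> e : set) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```

Implementing the full Set or Collection interface would take more methods (remove, contains, and so on); this only covers adding, counting, and iterating.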

OscarRyz
How would I use a map to get the results above?
rockit
@inSqlHell: I have posted the code. I didn't want to post the idea I've got until I see it running ;)
OscarRyz
+8  A: 

A Map<String, Integer> would be your best bet. Put into words, what you want to do is map each string to its number of occurrences. Basically, have something like this:

public void add(String s) {
    if (map.containsKey(s)) {
        map.put(s, map.get(s) + 1);
    } else {
        map.put(s, 1);
    }
}
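For a self-contained version of the same idea, the sketch below assumes Java 8 or later, where `Map.merge` can replace the containsKey/put branches. The class name `MergeCount` and the helper `countAll` are made up for illustration, and the `TreeMap` is only there to get sorted output.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class; the name and the helper method are illustrative only.
class MergeCount {
    /** Counts the occurrences of each string; the TreeMap keeps keys sorted. */
    static Map<String, Integer> countAll(Iterable<String> values) {
        Map<String, Integer> map = new TreeMap<>();
        for (String s : values) {
            // Stores 1 if the key is absent, otherwise the old count plus 1.
            map.merge(s, 1, Integer::sum);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countAll(Arrays.asList("A", "B", "B", "C", "C", "C"));
        // Prints "A : 1", "B : 2", "C : 3", one per line.
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```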
Esko
Agreed. I've used Maps for this sort of thing countless times. I think it's the best solution--it doesn't involve much work on your part. The only issue is that retrieving things from the map and dealing with Java's clunky Map.Entry is annoying.
Ellie P.
Thanks. I also wound up changing it to a HashMap.
rockit
I wouldn't call Map.Entry clunky, however admittedly it has its oddities such as the capability of getting garbage collected almost immediately which may result in Heisenbugs in single-threaded programs.
Esko
If you use this approach frequently, consider using Google's Multiset. Or write your own Bag<T> wrapper class with an API similar to Multiset's and implemented using a Map<T, Integer>.
Jim Ferrans
+16  A: 

You want a "Bag", like the Bag in Apache Commons Collections or the Multiset in Google Collections. You can add the same value to it multiple times, and it'll record the counts of each value. You can then interrogate the counts.

You'd do something like this with Apache Commons' Bag:

Bag myBag = new HashBag();
myBag.add("Orange");
myBag.add("Apple", 4);
myBag.add("Apple");
myBag.remove("Apple", 2);
int apples = myBag.getCount("Apple");  // Should be 3.
int kumquats = myBag.getCount("Kumquat"); // Should be 0.

And something like this with Google Collections' Multiset:

Multiset<String> myMultiset = HashMultiset.create();
myMultiset.add("Orange");
myMultiset.add("Apple", 4);
myMultiset.add("Apple");
myMultiset.remove("Apple", 2);
int apples = myMultiset.count("Apple");  // 3
int kumquats = myMultiset.count("Kumquats");  // 0

The problem with Apache Commons Collections in general is that it isn't very actively maintained and doesn't yet support Java generics. To fill this gap, Google has written its own collections library, which is extremely powerful. Be sure to evaluate Google Collections first.

Update: Google Collections also offers Multimap, a "collection similar to a Map, but which may associate multiple values with a single key".

Jim Ferrans
Which fully implements the Map<X, Integer> approach described in the other answers.
Mark
Why even mention `Bag` and Apache Collections? Google Collections in general and its `Multiset` in particular is better in every respect.
erickson
Completely agree! I looked at Google first for Bag, forgetting that it was called a Multiset.
Jim Ferrans
Okay! If you aren't really a Commons fanatic then you get +1 from me.
erickson
@sylvarking: I've updated the entry to show both and recommend Google first. (StackOverflow rewards quick incomplete answers over more thoughtful ones, so it seems best to answer quickly and then make sure it's right/complete.)
Jim Ferrans
Someone who can edit -- in place of `Multisets.newHashMultiset` it should be `HashMultiset.create`.
Kevin Bourrillion
@Kevin: Thanks, I corrected the example, which was adapted from an article that must have been using an earlier version.
Jim Ferrans