This question is a continuation of this thread:

In short: To solve my problem, I want to use Map<Set<String>, String>.

However, after I sorted my data entries in Excel and removed the unnecessary parameters, the following came out:

flow content ==> content content 
flow content ==> content depth distance 
flow content ==> content depth within 
flow content ==> content depth within distance 
flow content ==> content within 
flow content ==> content within distance 

If that is the case, the same key (flow content) appears more than once, each time with a different value, so a plain HashMap cannot hold all of them. How do I get around this? Does anyone have an idea?

I was thinking of using Map<Set<String>, List<String>> so that I can store something like:

Set <flow content>, List <'content content','content depth distance','content depth within ', ..., 'content within distance'>

But because I am parsing the entries line by line, I can't figure out how to collect the values of a repeated key (flow content) into the same list and add that list to the map.

Does anyone have a rough idea of how this can be done in Java?

Thanks in advance.

--EDIT:

I am trying Multimap, but I have run into a slight problem:

public static void main(String[] args) {

    File file = new File("apriori.txt");
    Multimap<Set <String>, String> mm = HashMultimap.create();
    Set<String> s = null;
    List l = null;

    BufferedReader br = null;

    try {
            br = new BufferedReader(new FileReader(file));
            String line = "";

            while ((line = br.readLine()) != null) {
                //Regex delete only tokenize

                String[] string = line.split(";");
                System.out.println(string[0] + " " + string[1]);

                StringTokenizer st = new StringTokenizer(string[0].trim());
                while (st.hasMoreTokens()) {
                    //System.out.println(st.nextToken());
                    s = new HashSet<String>();
                    s.add(st.nextToken());
                }
                mm.put(s,string[1]);
            }

        // dispose all the resources after using them.
        br.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

    Set<String> t = new HashSet<String>();
    t.add("content");
    t.add("by");

    String str = mm.get(t).toString();
    System.out.println(str);

    for (Map.Entry<Set <String>, String> e : mm.entries()) {
        System.out.println(e);
    }
}

The contents of apriori.txt:

byte_jump ; msg 
byte_jump ; msg by 
content ; msg 
content by ; flow 
content by ; msg 
content by ; msg flow 
content by byte_jump ; msg 
content byte_jump ; by 
content byte_jump ; msg 
content byte_jump ; msg by

The output of the for loop is:

[content]= msg 
[by]= flow 
[by]= msg 
[by]= msg flow 
[byte_jump]= msg 
[byte_jump]= by 
[byte_jump]= msg by 

instead of [content by]= msg flow

Why is that so? I tried a plain String key and that works, but I need a Set as the key so that the strings compare equal regardless of their position. What can I do?

A: 

A multimap allows multiple values for a specific key.

One option is the set of Multimap implementations provided as part of Google Collections.

Rather than hand-coding a way to correctly store data into a Map<String, List<String>>, it would probably be a better choice to use the data structure designed for the job.

coobird
Hi coobird, so I need to download the Google Collections Library and include it in my import? I'm new to using external frameworks.
Alex Cheng
@Alex Cheng: Bingo! The exact way will probably depend on which IDE (if any) is being used. In Eclipse one would add the JAR file to the "build path", but specifically in Java-speak, one would have to add the JAR to the "classpath" in order for the Java compiler to know where to find classes from external libraries.
coobird
@coobird: Wow I checked on the GCL and it seems that it is pretty awesome. I am using Netbeans... probably somewhere in the setting that I must tinker. I'll try. Thanks. Hopefully I can get it to work.
Alex Cheng
@Alex Cheng: It looks like right-clicking on the "Libraries" folder of the Netbeans project gives the option to add JAR files to the project.
coobird
@coobird: Yea thanks. I figured that one out. But one question though, how do I display the values of the Collections associated with a key? I'm using `Multimap<Set <String>, String>` and wanted to use the get() but don't know what to assign it to.
Alex Cheng
+2  A: 

The logic is essentially:

  • map to a list, as you suggest
  • to put something in the map, retrieve the list that corresponds to that key
  • if the list is null, create a new one and map the key to that new list
  • add the item to the list
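
The steps above can be sketched roughly as follows. This is only a minimal illustration using the standard library; the class name `ListMapSketch` and the helper `addToList` are hypothetical names made up for this example, and the sample values are taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class name, used only for this sketch.
class ListMapSketch {

    // Adds value to the list stored under key, creating the list on first use.
    static void addToList(Map<Set<String>, List<String>> map, Set<String> key, String value) {
        List<String> list = map.get(key);   // retrieve the list for this key
        if (list == null) {                 // first time this key is seen
            list = new ArrayList<String>();
            map.put(key, list);             // map the key to the new list
        }
        list.add(value);                    // add the item to the list
    }

    public static void main(String[] args) {
        Map<Set<String>, List<String>> map = new HashMap<Set<String>, List<String>>();
        Set<String> key = new HashSet<String>(Arrays.asList("flow", "content"));

        addToList(map, key, "content content");
        addToList(map, key, "content depth distance");
        addToList(map, key, "content within");

        // prints [content content, content depth distance, content within]
        System.out.println(map.get(key));
    }
}
```

On Java 8 or later, the body of `addToList` can be shortened to `map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);`, which does the same get-or-create step in one call.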

As another poster has mentioned, you could consider a standard multi-map library class such as that provided in Google Collections. (I personally would just implement it myself because it's really simple and doesn't really warrant a whole additional library in my view, but mileage varies.)

Neil Coffey
@Neil Coffey: Oh crap I didn't think of mapping a list into the map first and retrieve it later so that I can add items. Argh. I'm not that good in implementing it myself (if you're talking about Multimap). I might consider either GCL or continue from where I was. Thanks.
Alex Cheng
Well, I mean just implement the scheme I mention-- it's really a common idiom and a couple of lines of code. My point is just that relying on a library for this one simple piece of functionality really is using a sledgehammer to crack a nut.
Neil Coffey
Using a library for a specific job the library was designed, implemented and tested (!) for is exactly what libraries are meant for. Why are people always against using libraries and everybody is trying to reinvent wheels?
Willi
A: 
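import com.google.common.collect.HashMultimap;
import com.google.common.collect.Multimap;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;

public static void main(String[] args) throws IOException {

    final File file = new File("apriori.txt");
    final Multimap<String, String> map = HashMultimap.create();

    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = reader.readLine()) != null) {
            // each line has the form "key tokens ; value tokens"
            final String[] parts = line.split(" ; ");
            map.put(parts[0].trim(), parts[1].trim());
        }
    }

    // print every key/value pair stored in the multimap
    for (Map.Entry<String, String> e : map.entries()) {
        System.out.println(e);
    }
}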

Should do the trick. (I didn't compile it, no guarantees though.)
Make sure you use Multimap<String, String>, no need to use a single element set as a key there.

Willi
Hi, mine works with `Multimap<String, String>` too. Is it possible to use it with `Multimap<Set<String>, String>`? I can't get the latter to work.
Alex Cheng
Why do you want to use `Set<String>` instead of `String` as key?
Willi
+1  A: 

Regarding your code with Multimap: the only thing you're doing wrong is creating a new set for every token instead of putting all the tokens of a line into the same set. That's also why you're missing tokens: only the last token of each line ends up in the key. This works:

s = new HashSet<String>();
while (st.hasMoreTokens()) {
    //System.out.println(st.nextToken());
    s.add(st.nextToken());
}
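
To show the effect of building one set per line, here is a rough, self-contained sketch. It is not the code from the question: it uses a plain `HashMap<Set<String>, Set<String>>` as a stand-in for the `HashMultimap` so that it runs without Google Collections, and the class name `SetKeySketch` and the method `parse` are hypothetical. The input lines are a subset of the apriori.txt sample.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;

// Hypothetical class name, used only for this sketch.
class SetKeySketch {

    // Builds one key set per line from the tokens before ';' and
    // collects the text after ';' under that key.
    static Map<Set<String>, Set<String>> parse(List<String> lines) {
        Map<Set<String>, Set<String>> result = new HashMap<Set<String>, Set<String>>();
        for (String line : lines) {
            String[] parts = line.split(";");
            if (parts.length < 2) {
                continue; // skip lines without a ';' separator
            }
            Set<String> key = new HashSet<String>(); // one set per line, not per token
            StringTokenizer st = new StringTokenizer(parts[0].trim());
            while (st.hasMoreTokens()) {
                key.add(st.nextToken());
            }
            Set<String> values = result.get(key);
            if (values == null) {
                values = new HashSet<String>();
                result.put(key, values);
            }
            values.add(parts[1].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "content ; msg ",
                "content by ; flow ",
                "content by ; msg ",
                "content by ; msg flow ",
                "content by byte_jump ; msg ");
        Map<Set<String>, Set<String>> map = parse(lines);

        // Two HashSets with the same elements are equal and have the same
        // hash code, so token order in the lookup key does not matter.
        Set<String> lookup = new HashSet<String>(Arrays.asList("by", "content"));
        System.out.println(map.get(lookup));
    }
}
```

The lookup with {by, content} finds the values stored under "content by" because set equality ignores element order. The printed order of the values is not guaranteed, since they are held in a HashSet.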
Michael Borgwardt
@Michael: Oh my god Michael you're a lifesaver! Thank you very much!
Alex Cheng