Hi all. I'm new here.

Problem -- I have something like the following entries, 1000 of them:

args1=msg args2=flow args3=content args4=depth args6=within ==> args5=content
args1=msg args2=flow args3=content args4=depth args6=within args7=distance ==> args5=content
args1=msg args2=flow args3=content args6=within ==> args5=content
args1=msg args2=flow args3=content args6=within args7=distance ==> args5=content
args1=msg args2=flow args3=flow ==> args4=flowbits
args1=msg args2=flow args3=flow args5=content ==> args4=flowbits
args1=msg args2=flow args3=flow args6=depth ==> args4=flowbits
args1=msg args2=flow args3=flow args6=depth ==> args5=content
args1=msg args2=flow args4=depth ==> args3=content
args1=msg args2=flow args4=depth args5=content ==> args3=content
args1=msg args2=flow args4=depth args5=content args6=within ==> args3=content
args1=msg args2=flow args4=depth args5=content args6=within args7=distance ==> args3=content

I'm building a kind of suggestion feature. Take this entry, for example: args1=msg args2=flow args3=flow ==> args4=flowbits

If the sentence contains msg, flow, and another flow, then I should return the suggestion of flowbits.

How can I go about doing it? I know I should scan a list or array for a match (whenever a character is typed in the textarea) and return the result, but with 1000 entries, how should I implement it?

I'm thinking of HashMap, but can I do something like this?

<"msg,flow,flow","flowbits">

Also, in a sentence the arguments might not be in order, so assuming that it's flow,flow,msg then I can't match anything in the HashMap as the key is "msg,flow,flow".

What should I do in this case? Please help. Thanks a million!

A: 

Yes, you can use <"msg,flow,flow","flowbits"> in a HashMap. Whether it is the best solution, I don't know.

Istao
+3  A: 

A Map's key can be another Map or a Set. Looks like all you need is something like a Map<Set<String>, String> or perhaps a Map<Map<String, String>, Map.Entry<String, String>> - not sure where these "args1","args2" are relevant.

Michael Borgwardt
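A minimal sketch of the Map<Set<String>, String> idea from this answer, assuming the keywords have already been extracted from the text. The class and method names (SetKeyLookup, addRule, suggest) are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SetKeyLookup {
    // Each rule maps a set of keywords to one suggestion.
    private final Map<Set<String>, String> rules = new HashMap<Set<String>, String>();

    void addRule(String suggestion, String... keywords) {
        rules.put(new HashSet<String>(Arrays.asList(keywords)), suggestion);
    }

    // Returns the suggestion, or null when no rule has exactly these keywords.
    String suggest(String... keywords) {
        return rules.get(new HashSet<String>(Arrays.asList(keywords)));
    }

    public static void main(String[] args) {
        SetKeyLookup lookup = new SetKeyLookup();
        lookup.addRule("content", "msg", "flow", "depth");
        // The probe uses a different order than the rule.
        System.out.println(lookup.suggest("depth", "msg", "flow"));
    }
}
```

One caveat: a Set drops duplicates, so "msg, flow, flow" and "msg, flow" produce the same key. If the number of repeated keywords matters, a sorted list or sorted string key may be a better fit.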
Hi. I'll read up on the Set<String> as I did not use it before. What about the "arguments might not be in order" issue? Do you have any idea on how to solve it in case I do implement map? Thanks!
Alex Cheng
Please ignore the args1 and args2, it's there as I pulled the entries out from a generated apriori result.
Alex Cheng
@Alex: From what you wrote it looks to me like you want to ignore the order when finding matches. Using a Set as key will do exactly that.
Michael Borgwardt
@Michael -- good suggestion, I was not aware that "Set" existed -- who snuck that one in? @Alex -- this is probably better than my lame suggestion about alphabetization :-)
Jay Elston
@Michael -- I'm trying the Map with a Set as the key. Hopefully I can get it to work. @Jay -- It is still a good suggestion. =D
Alex Cheng
I think I got it working. So whenever I search, I should always create a new Set<String>, then I'll be able to get my value? Nice. Thanks a lot Michael! Assuming *if* later I want to do it in order, should I use Kelly French's approach?
Alex Cheng
@Alex: yes, encapsulating multiple distinct values in a class is a good practice. Note that you'll have to correctly implement hashCode() and equals() in that class for it to work.
Michael Borgwardt
@Michael -- Do you mean for the Arg class? Can you explain why both methods are important? I'm still searching around for articles on it.
Alex Cheng
@Alex: Yes, those methods are necessary for an object to be used as a key in a HashMap (they're defined in java.lang.Object but have to be overridden to allow objects with the same content to be considered "the same"). Probably the best explanation is in Joshua Bloch's book "Effective Java", but this article is pretty good as well: http://www.ibm.com/developerworks/java/library/j-jtp05273.html
Michael Borgwardt
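To illustrate the point about hashCode() and equals(), here is a minimal sketch of a key class that overrides both. The class name ArgKey and its fields are hypothetical (they only mirror the number/value pair discussed in this thread), and the sketch assumes value is never null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class; assumes "value" is never null.
final class ArgKey {
    private final int number;
    private final String value;

    ArgKey(int number, String value) {
        this.number = number;
        this.value = value;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ArgKey)) {
            return false;
        }
        ArgKey that = (ArgKey) other;
        return number == that.number && value.equals(that.value);
    }

    @Override
    public int hashCode() {
        // Objects that are equal must return the same hash code.
        return 31 * number + value.hashCode();
    }

    public static void main(String[] args) {
        Map<ArgKey, String> map = new HashMap<ArgKey, String>();
        map.put(new ArgKey(4, "depth"), "content");
        // A separate instance with the same content finds the entry.
        System.out.println(map.get(new ArgKey(4, "depth")));
    }
}
```

Without the two overrides, the second ArgKey instance would be a different key as far as HashMap is concerned, and the lookup would return null.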
@Michael -- Thanks again!
Alex Cheng
A: 

At first glance, this looks like a good use for a parser and formal grammar rather than a collection. ANTLR is a popular parser generator for Java.

Where a parser won't solve your problem is if the arguments can appear in any order. In that situation I would use some sort of Case object that combines rules and actions, and use a simple Map<String,List<Case>> to find the instances that might apply to a given text (you'd extract individual words from the text to probe the map, and could combine the lists returned from each probe).

I don't have the time to give a complete example, but the Case object would probably look something like this:

public interface Case {
    boolean evaluate(String text);
    String result();
}
Anon
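One way the Map<String,List<Case>> lookup described in this answer might look is sketched below. The Case interface is repeated so the sketch compiles on its own; AllWordsCase, CaseIndex, add and suggest are hypothetical names, and splitting the text on whitespace is an assumption about the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CaseIndex {
    // Same shape as the interface in the answer, repeated for self-containment.
    interface Case {
        boolean evaluate(String text);
        String result();
    }

    // Hypothetical Case: matches when the text contains every required word.
    // It checks presence only and does not count repeated words.
    static final class AllWordsCase implements Case {
        private final List<String> required;
        private final String result;

        AllWordsCase(String result, String... required) {
            this.required = Arrays.asList(required);
            this.result = result;
        }

        public boolean evaluate(String text) {
            return Arrays.asList(text.trim().split("\\s+")).containsAll(required);
        }

        public String result() {
            return result;
        }
    }

    private final Map<String, List<Case>> byWord = new HashMap<String, List<Case>>();

    // Registers the case under each word that could trigger it.
    void add(Case c, String... words) {
        for (String w : words) {
            List<Case> list = byWord.get(w);
            if (list == null) {
                list = new ArrayList<Case>();
                byWord.put(w, list);
            }
            list.add(c);
        }
    }

    // Probes the map once per word, then keeps the candidates that really match.
    List<String> suggest(String text) {
        Set<Case> candidates = new LinkedHashSet<Case>();
        for (String w : text.trim().split("\\s+")) {
            List<Case> list = byWord.get(w);
            if (list != null) {
                candidates.addAll(list);
            }
        }
        List<String> results = new ArrayList<String>();
        for (Case c : candidates) {
            if (c.evaluate(text)) {
                results.add(c.result());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        CaseIndex index = new CaseIndex();
        index.add(new AllWordsCase("content", "msg", "flow", "depth"), "msg", "flow", "depth");
        System.out.println(index.suggest("depth flow msg"));
    }
}
```

The map only narrows down which cases are worth evaluating; each Case still decides for itself whether the full text matches.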
Hi, thanks for replying. I am still a bit puzzled by what you have suggested. If possible, though, I would prefer not to use external tools for it. I'll still look into ANTLR, thanks for the link!
Alex Cheng
A: 

You could create an object and encapsulate the logic within it for each line. If the input is exactly as described, you should be able to extract the args with a simple regular expression and capturing groups, then call setters on your object for each arg. Then your data structure would just be a list of these objects.

JRL
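The regular-expression approach from this answer could be sketched roughly as follows, assuming every line has the exact "argsN=keyword ... ==> argsN=keyword" shape shown in the question. RuleParser and Rule are hypothetical names, and plain fields stand in for the setters mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RuleParser {
    // Group 1 is the argument number, group 2 the keyword.
    private static final Pattern ARG = Pattern.compile("args(\\d+)=(\\w+)");

    // Hypothetical holder for one parsed line.
    static final class Rule {
        final List<String> conditions = new ArrayList<String>();
        String suggestion;
    }

    static Rule parse(String line) {
        String[] sides = line.split("==>");
        if (sides.length != 2) {
            throw new IllegalArgumentException("Expected 'conditions ==> suggestion': " + line);
        }
        Rule rule = new Rule();
        // Left-hand side: collect every keyword in order of appearance.
        Matcher left = ARG.matcher(sides[0]);
        while (left.find()) {
            rule.conditions.add(left.group(2));
        }
        // Right-hand side: the first keyword is the suggestion.
        Matcher right = ARG.matcher(sides[1]);
        if (right.find()) {
            rule.suggestion = right.group(2);
        }
        return rule;
    }

    public static void main(String[] args) {
        Rule rule = parse("args1=msg args2=flow args3=flow ==> args4=flowbits");
        System.out.println(rule.conditions + " -> " + rule.suggestion);
    }
}
```

The list of parsed Rule objects can then be searched directly or loaded into whichever map structure is chosen.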
A: 

Since the order in which the strings appear is not important, you can alphabetize them when you create the key. Suppose you want the same suggestion for msg,flow,flow and flow,msg,flow and flow,flow,msg -- alphabetized, they are all "flow,flow,msg", so that is what you use as the key.

Jay Elston
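The alphabetized key can be built with a small helper along these lines. This is only a sketch; SortedKeyLookup and canonicalKey are illustrative names, and it assumes the keywords contain no commas.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class SortedKeyLookup {
    // Builds an order-independent key: sort a copy of the keywords, then join with commas.
    static String canonicalKey(String... keywords) {
        String[] sorted = keywords.clone();
        Arrays.sort(sorted);
        StringBuilder key = new StringBuilder();
        for (String k : sorted) {
            if (key.length() > 0) {
                key.append(',');
            }
            key.append(k);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        Map<String, String> rules = new HashMap<String, String>();
        rules.put(canonicalKey("msg", "flow", "flow"), "flowbits");
        // A different input order produces the same key.
        System.out.println(rules.get(canonicalKey("flow", "msg", "flow")));
    }
}
```

Unlike a Set used as the key, the sorted string keeps duplicates, so "flow,flow,msg" and "flow,msg" remain different keys.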
Hi. Nice suggestion. I will try it. Thanks.
Alex Cheng
+1  A: 

In other words, try not to do all your logic in your parser. Split up the logic so the parser is simply detecting the structure and then build objects to help you enforce the rules. A parser can easily detect the arguments and use them to create a list.

If you create a class to contain your arguments like so:

public class Arg {
    public int number;
    public String value;

    public Arg(int num, String val) {
        this.number = num;
        this.value = val;
    }

    @Override
    public String toString() {
        return "[Arg num=" + number + ", value=" + value + "]";
    }

}

then you can put those in a simple list.

List<Arg> argList = new ArrayList<Arg>();

Then you can do the logic using maybe a counter and contains() or indexOf() etc.

Having the Arg class makes sorting easy too. If you need the list sorted by the argument position, you create a Comparator for that.

import java.util.Comparator;

public class ArgNumComparator implements Comparator<Arg> {
    public int compare(Arg o1, Arg o2) {
        if (o1.number == o2.number) {
            return 0;
        }
        return o1.number < o2.number ? -1 : 1;
    }
}

Sorting by the argument value is even easier since you can reuse the compareTo() of String.

import java.util.Comparator;

public class ArgValComparator implements Comparator<Arg> {
    public int compare(Arg o1, Arg o2) {
        return o1.value.compareTo(o2.value);
    }
}

Then, to do the sorting use the Collections.sort() like so:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ArgList{
    public static void main(String[] args)  {
        //args1=msg args2=flow args3=content args4=depth args6=within ==> args5=content
        List<Arg> l = new ArrayList<Arg>();  
        // hard-coded example instead of more likely parsing
        l.add(new Arg(1, "msg"));
        l.add(new Arg(2, "flow"));
        l.add(new Arg(3, "content"));
        l.add(new Arg(4, "depth"));
        l.add(new Arg(5, "flow"));
        l.add(new Arg(6, "within"));

        Collections.sort(l, new ArgValComparator()); // take your pick of comparators

        System.out.println(l); // uses the toString() of Arg.
    }
}

EDIT: added a toString() method to Arg and changed the list in the example to have two "flow" args.

Running with the new toString code puts the following to the console:

[[Arg num=3, value=content], [Arg num=4, value=depth], [Arg num=2, value=flow], [Arg num=5, value=flow], [Arg num=1, value=msg], [Arg num=6, value=within]]

As you can see, the two args with value="flow" are now next to each other. Detecting multiple args where value="flow" can be done like this:

boolean flowFound = false;
for (Arg arg : l) {
    if (arg.value.equalsIgnoreCase("flow")) {
        if (flowFound) {  // already found one?
            // action when second "flow" exists
            System.out.println("2nd flow found");
        } else {
            flowFound = true;  // found the first "flow"
        }
    }
}
Kelly French
Hi Kelly. I'm still trying to digest what you had posted. Thanks.
Alex Cheng
After trying, I think I get it. I printed the list after the Collections.sort and got back msg, flow, content, depth, content, within. What should I do if I want to search for "msg, flow, content, depth, within" in the sorted list?
Alex Cheng