views:

189

answers:

1

I'm working on a predictive text solution and have all the words being retrieved from a Trie based on input for a certain string of characters, i.e. "at" will give all words formed with "at" as a prefix. The problem that I have now, is that we are also supposed to return all other possibilities from pressing these 2 buttons, Button 2 and button 8 on the mobile phone, which would also give words formed with, "au, av, bt, bu, bv, ct, cu, cv" (most of which won't have any actual words.

Can anyone suggest a solution and how I would go about doing this for calculating the different permutations? (at the moment, I'm prompting the user to enter the prefix (not using a GUI right now)

+3  A: 

Welcome to concepts like recursivity and combinatorial-explosion :)

Due to combinatorial explosion, you have to be "smart" about it: if the user wants to enter a legitimate 20 letters word, it is unacceptable for your solution to "hang" trying stupidly tens of millions of possibilities.

So you should only recurse when the trie has at least one entry for your prefix.

Here's a way to generate all prefixes and only recurse when there's a match.

In this example, I faked a trie always saying there's an entry. I made this in five minutes so it can surely be beautified/simplified.

The advantage of such a solution is that it works if the user presses on one, two, three, four or 'n' keys, without needing to change your code.

Note that you probably do not want to add all the words starting with 'x' letters when there are too many. It's up to you to find the strategy that matches best your need (wait for more keypresses to reduce candidates or add most common matches as candidates etc.).

private void append( final String s, final char[][] chars, final Set<String> candidates ) {
        if ( s.length() >= 2 && doesTrieContainAnyWordStartingWith( s ) ) {
            candidates.add( s + "..." ); // TODO: here add all words starting with 's' instead of adding 's'
        }
        if ( doesTrieContainAnyWordStartingWith( s ) && chars.length > 0 ) {
            final char[][] next = new char[chars.length-1][];
            for (int i = 1; i < chars.length; i++) {
                next[i-1] = chars[i];
            }
            // our three recursive calls, one for each possible letter
            // (you'll want to adapt for a 'real' keyboard, where some keys may only correspond to two letters)
            append( s + chars[0][0], next, candidates );
            append( s + chars[0][1], next, candidates );
            append( s + chars[0][2], next, candidates );
        } else {
            // we do nothing, it's our recursive termination condition and
            // we are sure to hit it seen that we're decreasing our 'chars'
            // length at every pass  
        }
    }

    private boolean doesTrieContainAnyWordStartingWith( final String s ) {
        // You obviously have to change this
        return true;
    }

Note the recursive call (only when there's a matching prefix).

Here's how you could call it: I faked the user pressing '1', then '2' and then '3' (I faked this in the chars char[][] array I created):

    public void testFindWords() {
        // Imagine the user pressed 1 then 2 then 3
        final char[][] chars = {
                {'a','b','c'},
                {'d','e','f'},
                {'g','h','i'},
        };
        final Set<String> set = new HashSet<String>();
        append( "", chars, set ); // We enter our recursive method
        for (final String s : set ) {
            System.out.println( "" + s );
        }
        System.out.println( "Set size: " + set.size() );
}

That example shall create a set containing 36 matches because I "faked" that every prefix is legit and that every prefix leads to exactly one word (and I only added the "word" when it's made of at least two letters). Hence 3*3*3 + 3*3, which gives 36.

You can try the code, it's fully working but you'll have to adapt it of course.

In my fake example (user pressing 1,2 then 3), it creates this:

cdh...
afi...
adi...
beg...
cf...
adh...
cd...
afg...
adg...
bei...
ceg...
bfi...
cdg...
beh...
aeg...
ce...
aeh...
afh...
bdg...
bdi...
cfh...
ad...
cdi...
ceh...
bfh...
aei...
cfi...
be...
af...
bdh...
bf...
cfg...
bfg...
cei...
ae...
bd...
Set size: 36

Welcome to real coding :)

Webinator
@adam08: now of course if you're not familiar with recursion you may go with a "simpler" (actually much longer, but "simpler" if you're not at ease with recursion) approach where you'll only deal with "up to n letters", with n being, say, 8 and then go on to write 8 nested loops.
Webinator
Thanks a million WizardOfOdds....I'll stick it in to my code and see if I can get it working. Then go about trying to understand it :-)I'm still coming to terms that you wrote it in five mins
@adam08: it just takes practice (and I type fast)... There are a lot of really wonderful algorithms and techniques and I just happen to be familiar with recursion. I used to compete in division 1 at "TopCoder" (and even solved the three hard problems once, in 1h 15 minutes, now *that* was hard :)
Webinator
WizardOfOdds, I've integrated your code and it works perfectly hardcoded as 3 (characters for example). Where I'm struggling is trying to do it dynamically based on the nymber of characters entered by the user. I guess what I'm asking is, is it possible to have "s.length() >= 3" the number of characters entered here to be passed to this instead of hardcoded? and based on that, this to be also dynamic, in other words, not have to change them manually to run? Otherwise, it does exactly what I want
This is what I referred to when I referred to "this to be also dynamic" append( s + chars[0][0], next, candidates );
Disregard my last two comments - I got it working - thanks again
@adam08: good to know that you got it working: yup basically in the example I copied here I faked an 'hardcoded' 3 key presses 1,2,3 by putting a char[3][] but if you have only two keypresses make it char[2][] or 4 char[4][] etc. Glad to have helped :)
Webinator
WizardOfOdds, I'm wondering if you can help me out with understanding how the append method is working. Can't for the life of me figure out the recursion. I'm trying to comment it put it's just not happening.
@adam08: recursion isn't easy at first... Basically you're carrying with you a mutable collections, the *Set<String>*, to which you add candidates. The *append* method sometimes calls itself, adding more candidates. Recursion here is just one easy way to generate all the permutations without knowing in advance (when writing the source code I mean) how many keypresses will be made. Note that I just realized that you should only add candidates when you have "all your characters". That is, in my example, I added 36 cases but I should only have added 27 of them. It's easy to fix.
Webinator
@adam08: maybe you could try to play a little bit with "easier" recursion, like playing a bit with the recursive Fibonacci to see how recursion works? I fear that sadly SO is good to "answer questions" but that it is not a very good platform to "discuss" / "give help" due to its question/answer/comment limitations.
Webinator
@adam08: you could just add *System.out.println( s )* at the beginning of the happen method, then you should call the testFindWords() method and you should see what's going on behind the scene. Hope it helps :-/
Webinator