views:

1034

answers:

4

I had been using Ternary Search Tree for a while, as the data structure to implement a auto complete drop down combo box. Which means, when user type "fo", the drop down combo box will display

foo food football

The problem is, my current used of Ternary Search Tree is case sensitive. My implementation is as follow. It had been used by real world for around 1++ yeas. Hence, I consider it as quite reliable.

My Ternary Search Tree code

However, I am looking for a case insensitive Ternary Search Tree, which means, when I type "fo", the drop down combo box will show me

foO Food fooTBall

Here are some key interface for TST, where I hope the new case insentive TST may have similar interface too.

    /** 
 * Stores value in the TernarySearchTree. The value may be retrieved using key.
 * @param key A string that indexes the object to be stored.
 * @param value The object to be stored in the tree.
 */    
public void put(String key, E value) {
    getOrCreateNode(key).data = value;
}

/**
 * Retrieve the object indexed by key.
 * @param key A String index.
 * @return Object The object retrieved from the TernarySearchTree.
 */
public E get(String key) {
    TSTNode<E> node = getNode(key);
    if(node==null) return null;
    return node.data;
}

An example of usage is as follow. TSTSearchEngine is using TernarySearchTree as the core backbone.

Example usage of Ternary Search Tree

// There is stock named microsoft and MICROChip inside stocks ArrayList.
TSTSearchEngine<Stock> engine = TSTSearchEngine<Stock>(stocks);
// I wish it would return microsoft and MICROCHIP. Currently, it just return microsoft.
List<Stock> results = engine.searchAll("micro");
+2  A: 

I haven't used a TST before, but isn't this as simple as lower or uppercasing your keys, both during storage and during lookup? From your code snippet it looks like that should work.

MrWiggles
No. It cannot be do it that way. Imagine you have original dataset ABC and aBc. If you store it by either convert it "ALL" to upper case, you will only have chance to retrieve ABC. aBc will lose in the space. What my wish is, I provide abc, it returns me ABC and aBc
Yan Cheng CHEOK
But isn't ABC and aBc the value and not the key?
MrWiggles
Yes. ABC and aBc are the value. Please TSTSearchEngine, on how TernarySearchTree is being used.
Yan Cheng CHEOK
I've had alook at your code and it just seems like you need to do the case insentive search when calling getNode(), which means you'll need to provide a case insensitive version of compareCharsAlphabetically
MrWiggles
If you provide a case insensitive version of compareCharsAlphabetically, here is what you get. You pass in key String "abc" for object 0. You pass in key String "ABC" for object 1. object 0 will be overriden by object 1. You may search object 1 using "abc", but object 0 just dissapear in space.
Yan Cheng CHEOK
I understand that, which is why I said you need it for getNode(). You only want the case insensitive search when looking up the node, not inserting, so perhaps have a flag on compareCharsAlphabetically() to do case insensitive or not
MrWiggles
+1  A: 

One of the key factor which make my current Ternary Search Tree difficult to support case insensitive search is that, my underlying data structure is one-to-one mapping. Please look at the following test code :

public void testPut() {
    System.out.println("put");
    Name name0 = new Name("abc");
    Name name1 = new Name("abc");
    TernarySearchTree<Name> instance = new TernarySearchTree<Name>();
    instance.put(name0.toString(), name0);
    instance.put(name1.toString(), name1);
    assertEquals(2, instance.matchPrefix("a").size()); // fail here. Result is 1
}

What my current short-term solution is that, I am using TSTSearchEngine to wrap up the whole TernarySearchTree. TSTSearchEngine is comprised of

(1) A TernarySearchTree, providing UPPER-CASE key to map.

(2) A String-To-ArrayList map.

Here is what happen when I perform :

TSTSearchEngine<Name> engine = TSTSearchEngine<Name>();
engine.put(name0); // name0 is new Name("Abc");
engine.put(name1); // name0 is new Name("aBc");

(1) name0.toString() will be converted to UPPER-CASE ("ABC"). It will be inserted to TernarySearchTree. "ABC" will be both key and value for TernarySearchTree.

(2) "ABC" will use as the key for map, to insert name0 into an array list.

(3) name1.toString() will be converted to UPPER-CASE ("ABC"). It will be inserted to TernarySearchTree. S1 will be both key and value for TernarySearchTree.

(4) "ABC" will use as the key for map, to insert name1 into an array list.

When I try to

engine.searchAll("a");

(1) TernarySearchTree will return "ABC".

(2) "ABC" will be used as the key to access map. Map will return an array list, which is containing name0 and name1.

This solution works. The sample code can be referred to Sample Code for New TSTSearchEngine

However, this may not be an effective solution, as it requires two pass of search. I find out there is an implementation in C++ C++ Implementation of Case Insensitive Ternary Search Tree. Hence, there is an opportunity that C++ code can be ported over to Java.

Yan Cheng CHEOK
A: 

did you get a solution for the same. I am facing same problem.

Ajay
A: 

I've implemented a Ternary search tree, you can have a look at http://kunalekawde.blogspot.com/2010/05/radix-patricia-and-ternary.html As of case insensative, either you map small and caps to one hex/dec value or while getting prefix, check the value.

Kunal Ekawde