views:

324

answers:

5

I have a java properties file containing a key/value pair of country names and codes. I will load the contents of this file into a Collection like List or HashMap.

Then, I want users to be able to search for a country, e.g if they type 'Aus' in a textbox and click submit, then I want to search through the collection I have, containing a key/value pair of country codes/names (e.g AUS=>Australia), and return those countries which are found matching.

Is there any more efficient way of doing this, other than looping through the elements of the collection and using charAt()?

+1  A: 

Short of indexing the collection via something like Lucene, then you'd have to manually check by looping through all of the elements. You could use startsWith as opposed to looping over the string:

String userText = ...
for (Map.Entry<String, String> entry : map) {
    boolean entryMatches = entry.getKey().startsWith(userText);
    ...

Or alternatively use regular expressions:

Pattern pattern = Pattern.compile(userText);

for (Map.Entry<String, String> entry : map) {
    boolean entryMatches = pattern.matcher(entry.getKey()).find();
    ...
oxbow_lakes
+1  A: 

Looping with String.contains() is the way unless you want to move in some heavy artillery like Lucene.

Zed
thanks for reminding me of 'contains'
Click Upvote
Gah! Got there first
oxbow_lakes
Of course `"Australia".contains("AUS")` will return false.
oxbow_lakes
Sure, that was not the complete source code. You can go like myString.toLowerCase().equals(value.toLowerCase());, or Pattern.compile(Pattern.quote(myString), Pattern.CASE_INSENSITIVE).matcher(value).find();
Zed
Yea, **obviously** I was going to lowercase it before passing it to contains.
Click Upvote
A: 

Since the list is small enough to load into memory, sort it and then do a binary search, using the static method java.util.Collections.binarySearch(). This returns an index, and works regardless of whether the exact string is in the list or not (although if it's not it returns a negative number, so be sure to check that). Then, starting from that index, just iterative forward to find all the strings with that prefix. As a nice side-effect, the resulting output will be in alphabetical order.

To make the whole thing case insensitive, remember to convert to lowercase when loading the list and of course convert the prefix to lowercase before searching.

Todd Owen
This is just nonsense! You cannot do a `binarySearch` on "AUS" and find "Australia" in any java collection
oxbow_lakes
Obviously, that's assuming the list is loaded and sorted only once, not on every request!
Todd Owen
@oxbox_lakes: I explained how to make it case insensitive. I'm searching for "aus" to find "australia".
Todd Owen
@Todd - it's still nonsense. You can't do a binary search where the *exact* string is not in the collection as you say. Please modify your answer to include a working example if you think I'm wrong. Just try `Collections.binarySearch(singletonList("australia"), "aus")` and you will see that I am right
oxbow_lakes
@oxbow_lakes: If anyone votes me down, I will provide a full example. But first I refer you to the API documentation, which says that this method returns "the index of the search key, if it is contained in the list; otherwise, (-(*insertion point*) - 1). The *insertion point* is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key". So if the return is *x* and *x* < 0, one needs to start reading list items from index -(x+1).
Todd Owen
+3  A: 

If performance is important, you can use a TreeSet or TreeMap to hold the country names, and do the following can be used to identify countries that start with a given string.

NavigableMap<String, String> countries = new TreeMap<String, String>();
countries.put("australia", "Australia");
...

String userText = ...
String tmp = userText.toLower();
List<String> hits = new ArrayList<String>();
Map.Entry<String, String> entry = countries.ceilingEntry(tmp);
while (entry != null && entry.getKey().startsWith(tmp)) {
    hits.add(entry.getValue());
    entry = map.higherEntry(entry.getKey());
}
// hits now contains all country names starting with the value of `userText`, 
// ignoring differences in letter case.

This is O(logN) where N is the number of countries. By contrast a linear search of a collection is O(N)

Stephen C
+1 Very complete. But of course you're assuming that map.higherEntry() is O(1)...is it?
Todd Owen
It could be O(1) or O(logn). Either of these means that calling higherEntry() doesn't change the complexity. If you try to factor in the number of times that higherEntry() is called it gets complicated. It is a function of the length of the prefix string as well as N.
Stephen C
A: 

Well, this is alright. I am trying to look for a string pattern in an ArrayList. For example ArrayList contains {"Abcde", "ghldedp", "klmnop"} and i my search string is "Abc" how can i use Collections.binarysearch() to do this for me?

Let me know if someone has an idea.

Mahesh