views:

124

answers:

5

I am using Java and I am looking for String Collections (Sets and Lists) that are optimized in space and are fast. My strings are of fixed size: either 3 or 5 chars long.

Please suggest to me if there are any collection libraries available that can be best suited to me. I was thinking of some dictionary based collections.

Thanks.

A: 

Assuming you are talking about C or C++, because I can't imagine any other language where someone would be looking for a string library, I'd advise using bstring by Paul Hsieh.

Though I've never used it myself, because it just didn't work in my case, I've adapted it to my own usage back in 2007 taking its concepts as base. It's very well documented and at very least you can learn a big deal about strings just going to those links and reading on Paul's material.

Cawas
+1  A: 

If I wanted speed I would use C++ and the STL and a custom string class fixed to 8 bytes. 8 bytes is nicely aligned and is 64 bits so can be compared in a single machine instruction.

Using the STL you can choose to use a std::set, a std::map, an unordered_set, a std::list, or any other STL compatible structure.

Zan Lynx
Hello I am looking to optimize java code. The application uses lots of string collections and my strings are of fixed sizes
niraj
@niraj: You did not say that in your question. I will edit your question for you but you need to state what languages and platforms you are asking about in your questions.
Zan Lynx
+2  A: 

'dictionary based collections'? HashMap is default choice. It is as fast as O(1). And it has nothing with size of element fixed or not.

卢声远 Shengyuan Lu
+1  A: 

If you mean a Collection of Strings, I'd go with Java's default HashSet. If you need something even faster (in terms of lookup time), you can use a Trie. Tries give very fast lookup (O(length of string)) irrespective of the number of strings in the data structure, and can be very compact.

But, please test your code with HashSet first. With up to several million small sized strings, I don't imagine it would be very slow.

MAK
A: 

You can't really have a "fast collection" in general, because each datastructures have their own strength and weakness.

If you want fast addition and iteration, ArrayLists are good. If you do quite a lot of removal, you might want to use LinkedLists. If you want fast look ups, HashSets are good, etc.

If you have concurrent access, there are other potentially better suited datastructures, too. Sometimes combining more than one datastructure could help, too.

In short, you need to tell us for what you are gonna use your datastructure.

Enno Shioji