Java tokenizer, C++ tokenizer and strtok | ansaurus

tags:

algorithm

Q:

Java tokenizer, C++ tokenizer and strtok

I want to know the algorithmic complexity of the following: 1. Java's StringTokenizer, 2. a C++ STL-based tokenizer, 3. strtok.

Is there any algorithm faster than the rudimentary strtok for tokenizing a string on a custom delimiter?

A:

Regarding Java, there are three main techniques for tokenizing: String.split(), StringTokenizer and StreamTokenizer. If you mean the java.util.StringTokenizer class (which tokenizes by breaking the input string S on every occurrence of a character from a given delimiter string D), then the complexity is O(|S|*|D|), since each character of S may be compared against every character of D. In other words, with a single delimiter character the running time is linear in |S|.

Note that the other tokenizers are more powerful. String.split(), for example, can split around any pattern matching a given regex.
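To make the difference between the two most common techniques concrete, here is a minimal sketch. The class name `TokenizerComparison`, its helper methods and the sample input are made up for illustration; only `StringTokenizer` and `String.split()` come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of any library.
class TokenizerComparison {

    // StringTokenizer: every character of `delims` acts as a delimiter,
    // and empty tokens between adjacent delimiters are skipped.
    static List<String> withStringTokenizer(String s, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // String.split(): the argument is a regular expression, and empty
    // strings between adjacent matches are kept (trailing ones are dropped).
    static List<String> withSplit(String s, String regex) {
        return Arrays.asList(s.split(regex));
    }

    public static void main(String[] args) {
        String input = "a,b;;c";
        System.out.println(withStringTokenizer(input, ",;")); // [a, b, c]
        System.out.println(withSplit(input, "[,;]"));         // [a, b, , c]
    }
}
```

The two calls are not interchangeable: on the same input, StringTokenizer drops the empty token between the two adjacent delimiters while split() keeps it.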

Eyal Schneider 2010-10-05 10:14:47

Is there any other algorithm that is better than O(|S|*|D|)?

Avinash 2010-10-05 10:31:44

@Avinash: Theoretically, yes. A simple implementation can store the delimiters in a hash table, which gives O(|S|) time on average. I suppose this is not really useful in practice, since the set of delimiters is usually very small.
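A rough sketch of that idea, assuming the same skip-empty-tokens behaviour as StringTokenizer. The class `HashSetTokenizer` and its `tokenize` method are hypothetical names, not an existing API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating the hash-table approach.
class HashSetTokenizer {

    // Splits `s` on any character contained in `delims`, skipping empty tokens.
    static List<String> tokenize(String s, String delims) {
        // Build the delimiter set once: O(|D|).
        Set<Character> delimSet = new HashSet<>();
        for (char c : delims.toCharArray()) {
            delimSet.add(c);
        }

        List<String> tokens = new ArrayList<>();
        int start = -1; // index where the current token began, -1 if none is open
        for (int i = 0; i < s.length(); i++) {
            // Expected O(1) membership test instead of scanning all of D.
            if (delimSet.contains(s.charAt(i))) {
                if (start >= 0) {
                    tokens.add(s.substring(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        if (start >= 0) {
            tokens.add(s.substring(start)); // last token runs to the end of the input
        }
        return tokens;
    }
}
```

Each character of S costs one expected constant-time lookup, so the total is O(|S| + |D|) on average. For a handful of delimiters, the hashing and boxing overhead may well outweigh the savings over a plain scan of D.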

Eyal Schneider 2010-10-05 10:38:16
