Token Suffix Tree Tutorial | ansaurus

Q:

Token Suffix Tree Tutorial

Can someone please point me to tutorials on "token suffix trees"?

Google only gives links to research papers that already use them! :(

Thanks in advance.

A:

From googling that same phrase and scanning the first couple of results, my guess is that they are talking about a suffix tree in which the "letters" (or "characters", or "elements") are not individual ASCII or Unicode characters, as we are accustomed to, but rather the lexical tokens of some computer language.

So for C, for example, you would have a "letter" called int, another letter called (, and so on. I'm not sure exactly how tokens that are prefixes of other tokens (e.g. + is a prefix of ++) would be handled, but my guess is that they are handled the same way the lexer handles them, which (for C at least) is to always greedily build the longest possible token. So the 5 input characters +++++ would be lexed as ++, ++, +.
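To make that guess concrete, here is a minimal Python sketch, not taken from any of the papers. It assumes a tiny, made-up subset of C tokens, a hypothetical `lex` function that always takes the longest token, and a naive quadratic suffix trie (`build_token_suffix_trie`) whose edges are labelled with whole tokens rather than characters. A real token suffix tree would presumably use a linear-time construction such as Ukkonen's algorithm; the names `lex`, `build_token_suffix_trie`, `contains` and `END` are all illustrative.

```python
import re

# Hypothetical token set: just a few C tokens, enough for the example.
# Alternatives are listed longest-first, so "++" is tried before "+".
TOKEN_RE = re.compile(r"\s*(\+\+|\+|\(|\)|;|[A-Za-z_]\w*|\d+)")

END = object()  # assumed end marker, playing the role of the usual "$"


def lex(source):
    """Split source into tokens, always taking the longest token available."""
    source = source.rstrip()
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def build_token_suffix_trie(tokens):
    """Naive O(n^2) suffix trie; each edge is labelled with one whole token."""
    root = {}
    seq = list(tokens) + [END]
    for start in range(len(seq)):
        node = root
        for tok in seq[start:]:
            node = node.setdefault(tok, {})
    return root


def contains(trie, pattern):
    """True if the token sequence `pattern` occurs contiguously in the input."""
    node = trie
    for tok in pattern:
        if tok not in node:
            return False
        node = node[tok]
    return True


print(lex("+++++"))  # ['++', '++', '+']

trie = build_token_suffix_trie(lex("f(a++ + b);"))
print(contains(trie, ["++", "+"]))  # True: the token ++ is followed by the token +
print(contains(trie, ["+", "+"]))   # False: two separate + tokens never occur in a row
```

The last two lookups show the difference from a character-level suffix tree: matching happens on whole tokens, so ++ and + are distinct "letters" even though one is a prefix of the other.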

j_random_hacker 2009-11-18 12:04:48

Yes, you are right: the "letters" are HTML tokens for the project I am looking at. Thanks for the effort, though. :)

Bart J 2009-11-18 18:00:49

A:

Not sure if it is what you are looking for, but your question reminds me of what I know as 'suffix trees on words', e.g. http://www.larsson.dogma.net/words-alg.pdf

Fabian Steeg 2010-06-06 02:59:22
