Hello folks!

I am looking for a Java implementation of a generalized suffix tree (GST) that offers the following functionality:

After building the GST from, let's say, 1000 strings, I want to find out how many of these 1000 strings contain a given other string.

This has to be quite fast: in my use case the suffix tree would be built from 1000 strings, each of length 1000, and afterwards I have to determine, for each of about 100,000 candidate strings of length about 10, how many of the 1000 strings contain it.
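To pin down the query I mean, here is a minimal sketch of the behaviour, not a real suffix tree: it is an uncompressed suffix trie cut off at a maximum depth (assuming candidates never exceed that length), where each node stores how many input strings contain its path. The class name `SubstringDocCounter` and its methods `add` and `countContaining` are made up for illustration and do not come from any library. A proper GST built with Ukkonen's algorithm would use far less memory.

```java
import java.util.HashMap;
import java.util.Map;

/** Depth-bounded generalized suffix trie with a per-node count of containing strings. */
class SubstringDocCounter {

    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        int docCount = 0;  // number of distinct input strings containing this path
        int lastDoc = -1;  // id of the most recent input string that reached this node
    }

    private final Node root = new Node();
    private final int maxDepth;  // assumed upper bound on candidate length
    private int nextDocId = 0;   // also equals the number of strings added so far

    SubstringDocCounter(int maxDepth) {
        this.maxDepth = maxDepth;
    }

    /** Inserts every suffix of text, truncated to maxDepth characters. */
    void add(String text) {
        int docId = nextDocId++;
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            int end = Math.min(text.length(), start + maxDepth);
            for (int i = start; i < end; i++) {
                node = node.children.computeIfAbsent(text.charAt(i), c -> new Node());
                // Count each input string at most once per node.
                if (node.lastDoc != docId) {
                    node.lastDoc = docId;
                    node.docCount++;
                }
            }
        }
    }

    /** Returns how many of the added strings contain pattern as a substring. */
    int countContaining(String pattern) {
        if (pattern.length() > maxDepth) {
            throw new IllegalArgumentException("pattern longer than maxDepth");
        }
        Node node = root;
        for (int i = 0; i < pattern.length(); i++) {
            node = node.children.get(pattern.charAt(i));
            if (node == null) {
                return 0;
            }
        }
        // The empty pattern is contained in every added string.
        return node == root ? nextDocId : node.docCount;
    }

    public static void main(String[] args) {
        SubstringDocCounter counter = new SubstringDocCounter(10);
        counter.add("banana");
        counter.add("bandana");
        counter.add("cabana");
        System.out.println(counter.countContaining("ana")); // 3
        System.out.println(counter.countContaining("nd"));  // 1
        System.out.println(counter.countContaining("xyz")); // 0
    }
}
```

Each lookup walks at most as many nodes as the candidate has characters, so 100,000 candidates of length 10 would be cheap to query. The cost is in construction and memory, which is why I am looking for a real suffix tree implementation.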

I hope someone knows of something. Sorry for my bad English!

Cheers, Christoph

+3  A: 

Try the Semantic Discovery Toolkit. It has an implementation under text/src/java/org/sd/text/radixtree.

Marcelo Morales
Er, do you know of any implementations (or even tutorials!) for Token Suffix Trees?
Bart J
+1  A: 

There is a Java implementation of Suffix Trees available at: http://illya-keeplearning.blogspot.com/2009/04/suffix-trees-java-ukkonens-algorithm.html

xamde