views:

464

answers:

1

I am new to the field of approximate string matching.

I am exploring uses for the Bitap algorithm, but so far its limited pattern length has me troubled. I am working with Flash, and I dispose of 32 bit unsigned integers and a IEEE-754 double-precision floating-point Number type, which can devote up to 53 bites for integers. Still, I would rather have a fuzzy matching algorithm which can handle longer patterns than 50 chars.

The Wikipedia page of the Bitap algorithm mentions libbitap, which supposedly demonstrates an unlimited pattern length implementation of the algorithm, but I have trouble getting the idea from its sources.

Have you got any suggestions about how to generalise Bitap for patterns of unlimited length, or about another algorithm that can perform fuzzy string matching of a needle near a suggested location in the haystack?

A: 

There's a pretty crear implementation of this algorithm available at google code. Try it. Though I can't understand how to get an exact location (the beginning and ending point in text) of fuzzy match. If you have any idea how to get both beginning and ending points, please share.

StuffHappens
That's where I started from. They work around the limited length problem by performing several searches instead of using an unlimited length implementation.
hristo