views:

55

answers:

1

There are Python code available for existing algorithms for normal string searching e.g. Boyer-Moore Algorithm. I am looking to use this on Chinese characters and it doesn't seem like the same implementation would work. What would I go about doing in order to make the algorithm work on Chinese characters? I am referring to this:

http://en.literateprograms.org/Boyer-Moore_string_search_algorithm_(Python)#References

+3  A: 

As long as all your text is in unicodes it should work just fine. The algorithm looks sequence-independent, provided each "element" is one sequence-unit in length.

Ignacio Vazquez-Abrams