I'm looking for a small C library to handle UTF-8 strings.
Specifically, I need to split strings on Unicode delimiters for use with stemming algorithms.
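To make the requirement concrete, here is a rough sketch in plain C of the kind of operation I mean. The names `utf8_decode` and `utf8_next_token` are hypothetical (they are not taken from any existing library), and the delimiter set is assumed to be supplied by the caller as an array of code points. It does no normalisation and simply treats malformed bytes as U+FFFD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one code point from s (n > 0 bytes available).
 * Stores the code point in *cp and returns the number of bytes consumed.
 * Malformed, overlong, or surrogate sequences yield U+FFFD and consume
 * exactly one byte. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    assert(s != NULL && cp != NULL && n > 0);
    c = s[0];
    if (c < 0x80) { *cp = c; return 1; }
    else if ((c & 0xE0) == 0xC0) { len = 2; c &= 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { len = 3; c &= 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { len = 4; c &= 0x07; min = 0x10000; }
    else { *cp = 0xFFFD; return 1; }      /* stray continuation or invalid lead byte */

    if (len > n) { *cp = 0xFFFD; return 1; }   /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        *cp = 0xFFFD;
        return 1;
    }
    *cp = c;
    return len;
}

/* Linear search of the caller-supplied delimiter set. */
static int is_delim(uint32_t cp, const uint32_t *delims, size_t ndelims)
{
    size_t i;
    for (i = 0; i < ndelims; i++)
        if (delims[i] == cp) return 1;
    return 0;
}

/* Find the next token in s[0..n) starting at byte offset *pos.
 * On success returns 1, sets *tok_start / *tok_len (byte offsets into s)
 * and advances *pos past the token. Returns 0 when no tokens remain.
 * Nothing is copied or allocated. */
int utf8_next_token(const char *s, size_t n, size_t *pos,
                    const uint32_t *delims, size_t ndelims,
                    size_t *tok_start, size_t *tok_len)
{
    const unsigned char *u = (const unsigned char *)s;
    uint32_t cp;
    size_t i = *pos, k;

    while (i < n) {                       /* skip leading delimiters */
        k = utf8_decode(u + i, n - i, &cp);
        if (!is_delim(cp, delims, ndelims)) break;
        i += k;
    }
    if (i >= n) { *pos = n; return 0; }

    *tok_start = i;
    while (i < n) {                       /* consume until the next delimiter */
        k = utf8_decode(u + i, n - i, &cp);
        if (is_delim(cp, delims, ndelims)) break;
        i += k;
    }
    *tok_len = i - *tok_start;
    *pos = i;
    return 1;
}
```

The caller would start with `pos = 0` and call `utf8_next_token` in a loop, passing each (start, length) slice to the stemmer. Something along these lines, but maintained and tested, is what I am after.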
Related posts have suggested:
ICU: http://www.icu-project.org/ (I found it too bulky for my purposes on embedded devices)
UTF8-CPP: http://utfcpp.sourceforge.net/ (Excellent, but C++ not C)
Has anyone found a platform-independent library with a small codebase for handling Unicode strings? (It doesn't need to do normalisation.)
Any advice would be appreciated.