I'm building a website in three languages: English, Russian and Chinese. I hope that if I use UTF-8 in both the application and the database, there won't be any problems with input and output (or will there?).
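To make the UTF-8 part concrete, here is a minimal Python sketch of the round trip I have in mind. The sample words and the `utf8_round_trip` helper are my own illustrations, not part of any framework. It only shows that all three scripts survive encoding and decoding, and that character count and byte count differ, which matters for column sizes and string functions.

```python
# Sample strings are illustrative only; any text in the three languages would do.
samples = {
    "english": "search",
    "russian": "поиск",
    "chinese": "搜索",
}


def utf8_round_trip(text: str) -> bytes:
    """Encode to UTF-8 and check that decoding gives back the same text."""
    raw = text.encode("utf-8")
    assert raw.decode("utf-8") == text
    return raw


for lang, text in samples.items():
    raw = utf8_round_trip(text)
    # Latin letters take 1 byte each, Cyrillic 2, common Chinese characters 3.
    print(lang, len(text), "characters,", len(raw), "bytes")
```

If every layer (form input, application, database connection, table storage, output headers) agrees on UTF-8, the same round trip should hold end to end. A mismatch in any single layer is the usual source of garbled text.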
But the most frightening part is the search. It has to be good: full-text, indexed, and so on. I hope it will understand morphology, use stemming, etc.
First, I looked at Zend_Search_Lucene, but as I gathered from http://framework.zend.com/issues/browse/ZF/component/10021, it has problems with Chinese. :(
Now I'm thinking about Sphinx. It supports both English and Russian stemming. I'm not sure how good it is with Chinese, and I have no idea how hard it would be for me to add support for it. http://www.sphinxsearch.com/forum/view.html?id=1554 gives some hope, but as an inexperienced Sphinx user, I don't think I understand what is said there.
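As far as I can tell, the usual workaround for Chinese, which has no spaces between words, is to index every character as its own token (a unigram) and treat a query as a phrase of adjacent characters. I believe this is the idea behind Sphinx's `ngram_len` and `ngram_chars` index options, but the Python below is only my own sketch of the technique, not Sphinx code. The names `tokenize`, `build_index` and `phrase_search` are hypothetical, the character range covers only the basic CJK Unified Ideographs block, and the Chinese strings are common phrases chosen purely for illustration.

```python
import re
from collections import defaultdict

# Basic CJK Unified Ideographs block only (an assumption for this sketch;
# a real indexer would cover a wider range of code points).
CJK = r"\u4e00-\u9fff"
# One token per CJK character, or one token per run of other word characters.
TOKEN_RE = re.compile(rf"[{CJK}]|[^\W{CJK}]+")


def tokenize(text):
    """Split text into lowercase words and single CJK characters."""
    return [m.group(0).lower() for m in TOKEN_RE.finditer(text)]


def build_index(docs):
    """Map token -> doc id -> list of token positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, tok in enumerate(tokenize(text)):
            index[tok][doc_id].append(pos)
    return index


def phrase_search(index, query):
    """Return sorted ids of docs containing the query tokens side by side."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = []
    for doc_id, starts in index.get(tokens[0], {}).items():
        for start in starts:
            # Every following token must sit at the next position in this doc.
            if all(start + i in index.get(tok, {}).get(doc_id, [])
                   for i, tok in enumerate(tokens)):
                hits.append(doc_id)
                break
    return sorted(hits)


docs = {
    1: "我爱北京天安门",
    2: "北京大学",
    3: "Sphinx full-text search",
}
index = build_index(docs)
print(tokenize("Sphinx 搜索引擎 поиск"))
print(phrase_search(index, "北京"))
print(phrase_search(index, "天安门"))
```

This does no word segmentation and no stemming, so it only finds exact character sequences. Whether that is good enough for real Chinese queries is exactly what I can't judge myself.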
So,
does anyone have any experience with this kind of 'language-agnostic' search, and could you share it with me, please?
And could you give me something to test the search with? As a native Russian speaker with some basic knowledge of English, I can test both the Russian and English searches myself, but I can't even tell which parts of the Chinese characters are words. Please give me some Chinese strings to put into the index and some queries with expected results!