I'm using Sphider as a search engine for my website. It's really easy to work with, but I'm having some major issues with localized characters.
All of my HTML/PHP pages have the charset defined as UTF-8, while the search and results page from Sphider had charset=ISO-8859-1. When I first used the Sphider "spider" to crawl my website, it turned all of my localized characters into an encoding I don't recognize:
"ç" became "Ã§", and so on with "ã", "á", etc.
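A symptom like this is typical of UTF-8 bytes being read as ISO-8859-1. The following is only a minimal Python sketch of that mismatch, not Sphider's actual code; the function names are made up for illustration, and it assumes the spider or the DB connection is treating my UTF-8 pages as ISO-8859-1, which I have not confirmed:

```python
# Hypothetical illustration: UTF-8 bytes misread as ISO-8859-1 (Latin-1).
# Assumption: this is the kind of mismatch happening during the crawl.

def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those same bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled: str) -> str:
    """Reverse the misreading: re-encode as ISO-8859-1, decode as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


if __name__ == "__main__":
    for ch in ["ç", "ã", "á"]:
        garbled = misread_utf8_as_latin1(ch)
        # Each accented letter is two bytes in UTF-8, so it shows up
        # as two Latin-1 characters after the misreading.
        print(ch, "->", garbled, "->", repair(garbled))
```

If the strings in the DB match what this prints, the stored data is probably UTF-8 that was interpreted as ISO-8859-1 somewhere between the crawl and the insert.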
When I created the DB in MySQL I gave it the utf8_general_ci collation. My settings for the DB are: MySQL charset: UTF-8 Unicode (utf8); MySQL connection collation: utf8_unicode_ci.
This is a real problem because the search won't work properly. If I search for "diferença", for instance, the URL shows "?query=diferença&search=1", which is correct, but it produces no results. In the "suggested search" it appears as "diferen�a" (in case it's not visible, the "ç" has become a black square with a white question mark on it).
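The black square with the question mark looks like the Unicode replacement character (U+FFFD), which appears when a byte sequence is not valid UTF-8. A small Python sketch of how that could happen, assuming the query reaches the page as ISO-8859-1 bytes and is then displayed as UTF-8 (an assumption on my part; the helper name is hypothetical):

```python
from urllib.parse import quote

# Hypothetical illustration: ISO-8859-1 bytes displayed as UTF-8.
# Assumption: the search form submits the query in ISO-8859-1.

def misread_latin1_as_utf8(text: str) -> str:
    """Encode text as ISO-8859-1, then decode as UTF-8, replacing bad bytes."""
    return text.encode("iso-8859-1").decode("utf-8", errors="replace")


if __name__ == "__main__":
    word = "diferença"
    # The lone byte 0xE7 is not valid UTF-8, so it becomes U+FFFD.
    print(misread_latin1_as_utf8(word))

    # The same word is percent-encoded differently depending on the
    # charset of the page that submits the form.
    print(quote(word, encoding="utf-8"))
    print(quote(word, encoding="iso-8859-1"))
```

The address bar usually shows the decoded "ç" either way, so the URL can look correct even when the bytes actually sent do not match what is stored in the index.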
I believe the spider might be working in a different charset, but I can't figure out where that would be set, if that is the case. Also, since it was developed primarily for English, it's understandable that it has some hiccups along the way.
Does anyone have any experience with it, or any suggestions on what I should try to solve this?
What's really bugging me is not understanding why I get strange symbols in the DB.