views:

1277

answers:

2

When using MySQL full text search in boolean mode there are certain characters like + and - that are used as operators. If I do a search for something like "C++" it interprets the + as an operator. What is the best practice for dealing with these special characters?

The current method I am using is to convert all + characters in the data to _plus. It also converts &,@,/ and # characters to a textual representation.

+3  A: 

There's no way to do this in nicely using MySQL's full text search. What you're doing (substituting special characters with a pre-defined string) is the only way to do it.

You may wish to consider using Sphinx Search instead. It apparently supports escaping special characters, and by all reports is significantly faster than the default full text search.

Gary Pendergast
+2  A: 

MySQL is fairly brutal in what tokens it ignores when building its full text indexes. I'd say that where it encountered the term "C++" it would probably strip out the plus characters, leaving only C, and then ignore that because it's too short. You could probably configure MySQL to include single-letter words, but it's not optimised for that, and I doubt you could get it to treat the plus characters how you want.

If you have a need for a good internal search engine where you can configure things like this, check out Lucene which has been ported to various languages including PHP (in the Zend framework).

Or if you need this more for 'tagging' than text search then something else may be more appropriate.

thomasrutter