I'm working on a simple MySQL full-text search feature on a CakePHP site, and noticed that MySQL strips short words (3 characters or fewer) out of the query. Some of the items on the site have 3-character titles, however, and I'd like to include them in the results. (I've ruled out using more robust search appliances like Solr due to budget constraints.)

So I want to find any 3 character words in the query string, and do a quick lookup just on the title field. The easiest way I can think to do this is to explode() the string and iterate over the resulting array with strlen() to find words of 3 characters. Then I'll take those words and do a LIKE search on the title field, just to make sure nothing that should obviously be in the results was missed.

Is there a better / easier way to approach this?

UPDATE: Yes, I know about the ft_min_word_len setting in MySQL. I don't think I want to change it.

+1  A: 

There is a system option named “ft_min_word_len” that defines the minimum length of words to be indexed. You can set it to a lower value (e.g. 2) under the [mysqld] section of your MySQL configuration file. This file is typically found under “/etc/mysql” or “/etc”. On Windows, look in the Windows directory or the MySQL home folder. The change takes effect only after the server is restarted and the existing FULLTEXT indexes are rebuilt (for example with REPAIR TABLE tbl_name QUICK).

    [mysqld]
    ft_min_word_len=2

Mark Baker
Thanks, I had actually come across this setting before posting the question (probably should have mentioned that). I'm really not sure I want to change the setting for all full-text searching, but I don't want to miss entries with 3 characters in the title. I may play around with this option though.
handsofaten
A: 

I'm going with my original idea for now, unless someone has a better approach not involving ft_min_word_len. (If I could use this on a per-database level, I might consider it -- but otherwise it is too far-reaching.)

I have a function like this:

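    // Function name is illustrative; only the body was shown originally.
    function findShortTerms($query){
        // Keep only letters, digits and whitespace, so punctuation (including
        // quotes) can never end up inside the quoted SQL list built below.
        $query = preg_replace('/[^\p{L}\p{N}\s]+/u', '', $query);
        // Split on any run of whitespace and skip empty pieces.
        $terms = preg_split('/\s+/', $query, -1, PREG_SPLIT_NO_EMPTY);
        $short = array();

        foreach($terms as $term){
            // 3-character words are the ones the full-text search drops.
            if(mb_strlen($term, 'UTF-8') == 3){
                $short[] = '"'.$term.'"';
            }
        }

        // Comma-separated quoted list, or '' when nothing qualified.
        return implode(', ', $short);
    }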

And then I use the returned string to search the title column: WHERE title IN ($short), to supplement a full-text search. I arbitrarily assign a score of 3.5, so that the returned records can be sorted along with the other full-text search hits (I chose a relatively high score, since it is an exact match for the title of the record).
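
For illustration, here is one way the supplementary title lookup and the full-text query could be combined into a single statement with UNION. This is only a sketch under assumptions: the table name `items`, the `id` and `body` columns, a FULLTEXT index on (title, body), and the helper name `buildSearchSql` are all made up for the example. It also takes the short words as a plain array and binds them as placeholders rather than pasting a quoted list into the SQL.

```php
<?php
// Hypothetical helper: table `items` and columns `id`, `title`, `body`
// are assumed names, not taken from the setup described above.
// Returns array($sql, $params) for use with a prepared statement.
function buildSearchSql($query, array $shortTerms)
{
    // Regular full-text hits, scored by MySQL's relevance value.
    $parts = array(
        'SELECT id, title, MATCH(title, body) AGAINST (?) AS score'
        . ' FROM items WHERE MATCH(title, body) AGAINST (?)'
    );
    $params = array($query, $query);

    if (count($shortTerms) > 0) {
        // One placeholder per short word, so no user input lands in the SQL.
        $marks = implode(', ', array_fill(0, count($shortTerms), '?'));
        // Exact title matches get the fixed, arbitrary score of 3.5.
        $parts[] = 'SELECT id, title, 3.5 AS score FROM items'
                 . ' WHERE title IN (' . $marks . ')';
        $params = array_merge($params, array_values($shortTerms));
    }

    // Parenthesize each SELECT when there is a UNION, so that the
    // trailing ORDER BY sorts the combined result.
    $sql = count($parts) > 1
        ? '(' . implode(') UNION (', $parts) . ')'
        : $parts[0];

    return array($sql . ' ORDER BY score DESC', $params);
}
```

The returned pair can be handed to a prepared statement, for example `$stmt = $pdo->prepare($sql); $stmt->execute($params);` with PDO. Note that a row matched by both branches would appear twice with different scores, so the caller may still want to de-duplicate by id.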

This doesn't feel very elegant to me, but it resolves the problem.

handsofaten