views:

349

answers:

2

I'm using MySQL FULLTEXT search (in Natural Language mode). Example:

SELECT Mem_id FROM Members WHERE MATCH (job_title) AGAINST ('".mysql_real_escape_string($keywordsWanted)."')

I noticed that searching for "web developer" or "PHP coder" didn't work too well and so I added ft_min_word_len=3 to the MySQL config file (/etc/mysql/my.cnf). However, two character search terms are still an issue. e.g. "IT contractor", "PR manager", "3D design", etc. These are some strategies I was thinking of trying:

  • Use ft_min_word_len=2 - Won't this make the index MASSIVE and slow the DB down?
  • Alter the query to use LIKE in cases where the keyword is less than 3 characters?
  • Auto-translate common acronyms into full words. e.g. IT->"Information Technology"
  • Change the UI so a Javascript popup warns the user that 2 character words aren't counted so they should rephrase the question
  • Move to use Zend Lucene or another search technology

What do you advise / How have you tackled this problem?

+1  A: 

How about searching with mysql's pattern matching LIKE/REGEXP.

For example:

SELECT * FROM Members WHERE job_title LIKE '".$job_title[0]."%';

Or you can use regular expressions:

SELECT * FROM pet WHERE name REGEXP '^.....$';

You can also store the job title in a separate field or table.

Leon
A: 

For reference, what I ended up doing was changing "ft_min_word_len=2" in the MySQL config file (/etc/mysql/my.cnf). Performance didn't seem to be noticeably worse. Using FTS over LIKE/REGEXP has the advantage of automatic results ranking and automatic plural matching.

By the way, I also had a problem with some of the MYSQL stopwords. In particular "it" and "us" (because they mean "Information Technology" and "United States" in my problem domain). You can find a list of stop words to use here: http://dev.mysql.com/tech-resources/articles/full-text-revealed.html#stopwords. I saved this list to a file called */etc/mysql/stopword_list.txt* and then added "ft_stopword_file='/etc/mysql/stopword_file.txt'" to the MySQL config file. Then I restarted MYSQL (sudo /etc/init.d/mysqld restart) and dropped and recreated my FTS index's.

Tom