views:

54

answers:

3

Hi

I am planning to implement database search through a website - I know there is full-text search offered by mysql, but turns out that it is not supported for innodb engine (which I need for transaction support). Other options are using sphinx or similar indexing applications. However they require some re factoring of the database structure and may take more time to implement than I have.

So what I decided on was to take each table and concatenate all its relevant columns into a newly added QUERY column. This query column should also recruit from column of other relevant tables.

This accomplished, I will use the 'like' clause on query column of the table to be searched to search to return results of specific domains (group of related tables).

Since my database is not expected to be too huge (< 1mn rows in the biggest table), I am expecting reasonable query times.

Does any one agree with this method or have a better idea?

+4  A: 

You will not be happy with the solution of using LIKE with wildcards. It performs hundreds or thousands of times slower than using a fulltext search technology.

See my presentation Practical Full-Text Search in MySQL.

Instead of copying the values into a QUERY column, I would recommend copying the values into a MyISAM table where you have a FULLTEXT index defined. You could use triggers to do this.

You don't need to concatenate the values together, you just need the primary key column and each of your searchable text columns.

CREATE TABLE OriginalTable (
  original_id SERIAL PRIMARY KEY,
  author_id INT,
  author_date DATETIME,
  summary TEXT,
  body TEXT
) ENGINE=InnoDB;

CREATE TABLE SearchTable (
  original_id BIGINT UNSIGNED PRIMARY KEY, -- not auto-increment
  -- author_id INT,
  -- author_date DATETIME,
  summary TEXT,
  body TEXT,
  FULLTEXT KEY (summary, body)
) ENGINE=MyISAM;
Bill Karwin
A: 

No, it is not optimal since it force to read all the row. But, if you table is small (i don't know what is the meaning of <1mn) then it could be acceptable in some extend.

Also, you can limit the search feature. For example, some sites limit to use the search feature no more that one request x minute while other force you to enter a captcha.

magallanes
+1  A: 

You'll want to add an index to your query column. If there is a wildcard at the beginning of the search expression, MySQL cannot use the index.

If you do any search other than "equals" (LIKE 'test') or "begins with" (LIKE 'test%'), MySQL will have to scan every row. For example, a "contains" search (LIKE '%test%') is unable to use the index.

You could allow an "ends with" ('LIKE %test), but you'd have to build a reversed column to index on so you could actually do LIKE 'test%' in order to use the index.

Any full scan is going to be slow, and the more rows, the slower it will be. The larger the field, the slower it will be.

You can see the limitation of using LIKE. Therefore, you might create a table called Tags, where you link individual key words to each entry rather than using the entire text, but I would still stick to "equals" and "begins with", even with tags.

Using LIKE without the aid of an index should be limited to the rare ad-hoc query or very small data sets.

Marcus Adams