views:

56

answers:

2

I have a table Books in my MySQL database which has the columns Title (varchar(255)) and Edition varchar(20)). Example values for these are "Introduction to Microeconomics" and "4".

I want to let users search for Books based on Title and Edition. So, for example they could enter "Microeconomics 4" and it would get the proper result. My question is how I should set this up on the database side.

I've been told that FULLTEXT search is generally a good way to do things like this. However, because the edition is sometimes just a single character ("4"), full text search would have to be setup to look at individual characters (ft_min_word_len = 1).. This, I've heard, is very inefficient.

So, how should I setup searches of this database? I'm aware the CONCAT/LIKE could be used here.. My question is whether it would be the best way, or whether it would be too slow compared to the alternatives. My Books database has hundreds of thousands of books and a lot of users are going to be searching it..

A: 

You could set a full text index on the title column, and only match numerics on the edition column with standard operators (>, <, =). Though you have to extract the info from the search query in your business layer.

Björn
the problem with this is that I can't think of a rule for discriminating editions. Example Edition values:-"4"-"4th"-"25th anniversary edition"-"special collector edition"
babonk
+1  A: 

If you are looking for accurate Edition searching, bypassing the ft_min_word_len I would recommend to have two fields in your user search form, one for Title, one for Edition, so that you don't have to guess if/where there is an edition mentioned in the search phrase (and Edition can also be optional). In this case the search for Edition in the edition column would be with a "LIKE".

Another solution - since your char sizes are pretty small - is to create a new column dedicated to FullText index/search where you concatenate title and edition. In order not to worry about ft_min_word_len and Edition, concatenate first a word reserved by yourself that is unlikely to be searched by users, like "EDNB", and Edition when it is a number (or small).

In this case, when you create the new column, parse the Edition (this could be done for Title as well):

  • Change all only-digits words n to "EDNB"+n (eg "4" => "EDNB4")
  • Then NewColumn = Title + " " + NewEdition

When the user searches something,

  • identifies all her numbers in the search phrase and replaces them with "EDNB"+n.

You may want to change ft_min_word_len to 3 to catch "1st" or "3rd" (or you could EDNB them as well).

Note that having a new independent search column may be convenient: not for case insensitivity (since Fulltext does that already), but to search for words with accents that are misspelled. You would change all accents to no-accent letter.

For instance, if title is "Recette de la crème brûlée" and edition is "France 4" you would end up with a search column of "recette de la creme brulee france EDNB4". If the user searches for "crëme 4", "ë" is changed to "e", "4" to "EDNB4" and her search phrase becomes "creme EDNB4".

ring0