views:

161

answers:

3

Hi! I'm trying to create a small search for my site. I've tried using full-text index search, but I could never get it to work. Here is what I've come up with:

if(isset($_GET['search'])) {

$search = str_replace('-', ' ', $_GET['search']);
$result = array();

$titles = mysql_query("SELECT title FROM Entries WHERE title LIKE '%$search%'");
while($row = mysql_fetch_assoc($titles)) {
    $result[] = $row['title'];
}

$tags = mysql_query("SELECT title FROM Entries WHERE tags LIKE '%$search%'");
while($row = mysql_fetch_assoc($tags)) {
    $result[] = $row['title'];
}

$text = mysql_query("SELECT title FROM Entries WHERE entry LIKE '%$search%'");
while($row = mysql_fetch_assoc($text)) {
    $result[] = $row['title'];
}

$result = array_unique($result);
}

So basically, it searches through all the titles, body-text, and tags of all the entries in the DB. This works decently well, but I'm just wondering how efficient would it be? This would only be for a small blog, too. Either way I'm just wondering if this could be made any more efficient.

+1  A: 

MySQL Full-text search works -- I would look into it and debug it rather than trying to do this. Doing 3 separate MySQL queries will not be anywhere near as efficient.

If you want to try to make that much efficient you could separate the LIKE statements in one query with OR between them.

Kerry
+1  A: 

Using LIKE is NOT fulltext.

You need to use ... WHERE MATCH(column) AGAINST('the query') in order to access a fulltext search.

webdestroya
I realize that, I said I tried fulltext and could never get it to work, so I made my own code instead.
WillyG
@iMaster - Oh. Just so you know, your code is very inefficient compared to using the Fulltext... perhaps you should attempt to get it working and solve that problem?
webdestroya
+7  A: 

There's no way to make LIKE '%pattern%' queries efficient. Once you get a nontrivial amount of data, using those wildcard queries performs hundreds or thousands of times slower than using a fulltext indexing solution.

You should look at the presentation I did for MySQL University: http://www.slideshare.net/billkarwin/practical-full-text-search-with-my-sql

Here's how to get it to work:

  1. First make sure your table uses the MyISAM storage engine. MySQL FULLTEXT indexes support only MyISAM tables.

    ALTER TABLE Entries ENGINE=MyISAM;
    
  2. Create a fulltext index.

    CREATE FULLTEXT INDEX searchindex ON Entries(title, tags, entry);
    
  3. Search it!

    $search = mysql_real_escape_string($search);
    $titles = mysql_query("SELECT title FROM Entries 
        WHERE MATCH(title, tags, entry) AGAINST('$search')");
    while($row = mysql_fetch_assoc($titles)) {
        $result[] = $row['title'];
    }
    

    Note that the columns you name in the MATCH clause must be the same columns in the same order as those you declared in the fulltext index definition. Otherwise it won't work.


I've tried using full-text index search, but I could never get it to work... I'm just wondering if this could be made any more efficient.

This is exactly like saying, "I couldn't figure out how to use this chainsaw, so I decided to cut down this redwood tree with a pocketknife. How can I make that work as well as the chainsaw?"


Regarding your comment about searching for words that match more than 50% of the rows.

The MySQL manual says this:

Users who need to bypass the 50% limitation can use the boolean search mode; see Section 11.8.2, “Boolean Full-Text Searches”.

And this:

The 50% threshold for natural language searches is determined by the particular weighting scheme chosen. To disable it, look for the following line in storage/myisam/ftdefs.h:

#define GWS_IN_USE GWS_PROB

Change that line to this:

#define GWS_IN_USE GWS_FREQ

Then recompile MySQL. There is no need to rebuild the indexes in this case.

Also, you might be searching for stopwords. These are words that are ignored by the fulltext search because they're too common. Words like "the" and so on. See http://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html

Bill Karwin
Followed the directions precisely and still no go. I typed in the title of one of the entries and it wasn't returned. Does it matter that I'm testing it locally with MAMP?
WillyG
Ok, so after further testing I've come up with some weird results:It NEVER returns a certain post (as in this one specific post is never returned, not matter what).On the other posts, it doesn't work on searching for text inside the entry, but it works when typing the title and tags.Does that make sense?
WillyG
Aha! I've discovered the problem: it doesn't return anything when it matches EVERY (or more than half) the items in the DB. Any way around this? I guess that's actually more efficient and logical, but I just didn't expect it. Thanks for your help!EDIT: I would like it to match entries when you type part of titles, is that possible? (e.g. search for 'new' gives 'new post' as a result)
WillyG
(Also note the `mysql_real_escape_string` in Bill's example. Without it, your current code is vulnerable to SQL-injection attacks.)
bobince
Oh! Thanks for that list! In that case, ignore the edit part of my previous comment.
WillyG
@iMaster: use `IN BOOLEAN MODE` search to avoid the 50% threshold—though with this kind of search you lose the ability to search by MySQL's guessed ‘relevance’. You can escape the 50% threshold for non-boolean searches by recompiling, though it rarely makes any sense to do so. See http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html for details on this and changing stopword and word minimum length settings, which can be important for making FULLTEXT work sensibly.
bobince
+1 nice answer, Bill
Andrew Heath