views:

19

answers:

1

I have a table of URLs, and I need to find any URLs that contain a keyword, such as "fogbugz" or "amazon". If the keyword appears anywhere in the URL, I'd like it to be a result.

Right now, I'm trying:

SELECT url FROM url WHERE url LIKE '%keyword%'

Which is predictably slow, even when url is indexed.

I'm open to alternative methods for indexing this quickly, including scripting methods. I was thinking about going through each URL, splitting it up by "/" then "_ or -" characters, and storing those in a separate url2keyword table. With this method, a URL like: "http://www.foo.com/directory/article-name" would have "foo" "directory" "article" "name" as its keywords. Just an idea.

Of course, if there's a simpler way to do this in MySQL, or in general, I'd prefer that.

I'd like your suggestions on this matter! Thanks.

+1  A: 

You can try full text search (http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html). The disadvantage is that you will get approximate results as well, so you will probably have to run another pass over the results to pick out the ones you do and don't want.

Jasper
I just created a MyISAM table that has url's in it, and the like '%whatever%' search takes 1/20th the amount of time that it does in InnoDB... what gives?
Sambo
@Sambo: INNODB carries overhead - transactions, relational integrity, etc.
OMG Ponies
+1: Beat me to it.
OMG Ponies
It appears to be working perfectly. Thanks.
Sambo