I'm trying to match band names in a DB while ignoring a leading 'The'.

So a search for 'The Beatles' or 'Beatles' would both succeed.

This is too slow: SELECT * FROM artists WHERE artist_name LIKE '%beatles';

Better ways to do this? I'd like to avoid having an extra sorting/matching column with 'the' stripped out.

Thanks!

+2  A: 

Text searching should be handled using Full Text Search (FTS), either with native FTS or a 3rd-party engine (e.g. Sphinx).
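As a rough sketch of the native option in MySQL, assuming the artists table and artist_name column from the question (the index name ft_artist_name is made up for illustration, and older MySQL versions only support FULLTEXT on MyISAM tables):

```sql
-- Assumes the table and column from the question: artists(artist_name).
-- The index name ft_artist_name is hypothetical.
ALTER TABLE artists ADD FULLTEXT INDEX ft_artist_name (artist_name);

-- Looks up rows whose artist_name contains the word 'beatles',
-- whether or not the stored name starts with 'The'.
SELECT *
FROM artists
WHERE MATCH(artist_name) AGAINST('beatles' IN NATURAL LANGUAGE MODE);
```

With the default stopword list, 'the' is normally ignored, so a search for 'The Beatles' should behave like a search for 'Beatles'. The same default would also make a name consisting only of stopwords hard to find, so it is worth checking the stopword and minimum word length settings on your server.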

OMG Ponies
Is full text indexing overkill for just one column? I've always thought of full text as being used over multiple columns.
bandhunt
@bandhunt: If you need the performance, then no - it's not overkill.
OMG Ponies
Yeah, I just set up the MySQL fulltext index and it's looking great. Thanks!
bandhunt
+1  A: 
  • Try a fulltext index on the artist column.
  • Use an external indexing tool like Sphinx. This adds another tool and another index to maintain, but it is capable of really good and fast searching.
Sjoerd
+4  A: 

See my presentation Practical Full-Text Search in MySQL that I did for the MySQL University webinar series.

I compare several solutions, including:

  • MySQL FULLTEXT indexing
  • Apache Lucene (though I would recommend checking out Solr)
  • Sphinx Search
  • Inverted indexing
  • Google Custom Search Engine (CSE) and similar search services
Bill Karwin
You don't think full text indexing is overkill for just wanting to strip 'the' out of only one column? I've always thought of full text as being used over multiple columns. Viewing your presentation now, looks cool so far.
bandhunt
There's no way for a conventional index to know that 'The ' doesn't belong to the string. And what would you do with a band named "The The"? You could also define an additional one-to-many table with all searchable variations of the band's name.
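A minimal sketch of that one-to-many table, assuming artists has an integer key named artist_id and that the Beatles row has artist_id 1 (both are assumptions, as are the table and index names used here):

```sql
-- Hypothetical lookup table: one row per searchable spelling of a name.
-- Assumes artists.artist_id exists and is an INT; adjust to the real schema.
CREATE TABLE artist_search_names (
  artist_id   INT NOT NULL,
  search_name VARCHAR(100) NOT NULL,
  PRIMARY KEY (artist_id, search_name),
  KEY idx_search_name (search_name)
);

-- Register every variation that should find the band (artist_id 1 is assumed).
INSERT INTO artist_search_names (artist_id, search_name) VALUES
  (1, 'The Beatles'),
  (1, 'Beatles');

-- An equality match can use the ordinary index on search_name.
SELECT a.*
FROM artists AS a
JOIN artist_search_names AS s ON s.artist_id = a.artist_id
WHERE s.search_name = 'Beatles';
```

Because the variations are listed explicitly, a band like "The The" can simply be stored under that one spelling, at the cost of maintaining the extra rows.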
Bill Karwin
Cool. Trying some of your solutions and they're working great with some initial tests. thx!
bandhunt