views:

171

answers:

2

I need a way to search through 50+ columns, both varchar and int, in a specific table. I've seen people recommend Lucene and Sphinx, but I have no clue how to go about using those and MySQL's Full-Text won't allow me to combine more than 16 columns or both varchar and int columns.

+1  A: 

If your table doesn't get updated a lot, or some time lag before changes are reflected in search is acceptable, perhaps you could make a periodically-regenerated derived table that combines all of your columns in a LONGTEXT and has a fulltext index on that.

chaos
the problem is that there are roughly 65,000 rows at this point with about 500 being added weekly.
If the 500 are in a batch, rather than at random, it'd be easy to regenerate the derived table after the batch. Assuming the inserts come at random times, though, 500 a week is still only a little more than one an hour on average. You could probably run an every-minute cron job that would regenerate the derived table if any changes had been made and not feel any particular pain.
chaos
+1 for a denormalised table with MySQL FULLTEXT indexes.I'd look at sphinx as a first option though.
James C
A: 

You might want to consider pivoting your table and chopping it up a bit. Take all your varchar columns and store the data in one table "Search_Strings"; do the same for the int columns:

original table

id
always_field1
always_field2
varstring1
...
varstringx
int1
...
inty

new main_table (same number of rows as original table, but very few columns)

id
always_field1
always_field2

Search_Strings (

id (FK to main_table)
old_column_name
string_value

Search_Ints

id (FK to main_table)
old_column_name
int_value

With this setup, you need a fulltext index only on string_value. Sorry I can't suggest anything with Sphinx or Lucene as I'm not familiar with them.

dnagirl
To index this strings table using Sphinx, you just need to make sure the "strings" table has a unique id key (think sequences). cf: http://www.postneo.com/2009/02/06/sphinx-search-with-postgresql
Gregg Lind