I have a table with a TEXT column whose contents are just strings of CSV numbers, for example ",1,76,77,115,". Each string can contain an arbitrary number of numbers.

I am trying to set up Full Text Indexing so that I can search this column rapidly. This works great. Instead of running queries with

where MY_COL LIKE '%,77,%' and MY_COL LIKE '%,115,%'

I can do

where CONTAINS(MY_COL,'77 and 115')
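For context, here is roughly the T-SQL I used to set up the full-text index, assuming SQL Server; the table, primary-key index, and catalog names below are placeholders, not my real ones:

```sql
-- A catalog to hold the full-text index (name is a placeholder)
CREATE FULLTEXT CATALOG MyFtCatalog AS DEFAULT;

-- Full-text index on the CSV column; KEY INDEX must name a unique index,
-- typically the primary key of the table
CREATE FULLTEXT INDEX ON dbo.MyTable (MY_COL)
    KEY INDEX PK_MyTable
    ON MyFtCatalog
    WITH CHANGE_TRACKING AUTO;
```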

However, when I try to search for a single character it doesn't work.

where CONTAINS(MY_COL,'1')

But I know that there should be records returned! I quickly found that I need to edit the Noise file and rebuild the index. But even after doing that it still doesn't work.

A: 

I fixed my own problem, although I'm not exactly sure what fixed it.

I dropped my table and populated a new one (my program does batch processing) and created a new Full Text Index. Maybe I wasn't being patient enough to allow the indexing to fully rebuild.
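For anyone else hitting this: rather than guessing whether the population has finished, you can poll the catalog. A sketch, again assuming SQL Server and a placeholder catalog name:

```sql
-- Returns the population status of the full-text catalog;
-- 0 means idle (population complete), non-zero means work in progress
SELECT FULLTEXTCATALOGPROPERTY('MyFtCatalog', 'PopulateStatus') AS populate_status;
```

Searching before this reports idle can explain "missing" rows, since the index simply hasn't caught up yet.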

galuvian
+2  A: 

Working with relational databases that way is going to hurt.

Use a proper schema. Either store the values in different rows or use an array datatype for the column.

That will make solving the problem trivial.
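A minimal sketch of that schema, with hypothetical names (one number per row in a child table, then a plain relational query instead of full-text tricks):

```sql
-- Child table: one (parent row, number) pair per row
CREATE TABLE my_table_numbers (
    my_table_id INT NOT NULL,   -- references the parent table's key
    num         INT NOT NULL,
    PRIMARY KEY (my_table_id, num)
);

-- Parent rows containing BOTH 77 and 115
SELECT n.my_table_id
FROM my_table_numbers n
WHERE n.num IN (77, 115)
GROUP BY n.my_table_id
HAVING COUNT(DISTINCT n.num) = 2;
```

With an index on `num` (the primary key above covers it in reverse order too, if you add one), single-number lookups like `num = 1` become trivial seeks, and the single-character problem disappears entirely.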

alex
I agree, this is what I would like to do. This is an 'optimization' of an already normalized schema in order to implement a 'generic' search function... *bangs head into desk*
galuvian
A: 

Agreed. How does 12,15,33 not return that record for a search for 1 with fulltext? Use an actual table schema to accomplish this.

Brian Rudolph
A: 

Hi,

Would you mind telling us how many rows your table contains? And how many CSV numbers that you have on average?

How would this run with 1 billion rows?

TIA

Henrik Staun Poulsen www.stovi.com

The original LIKE runs like CRAP! My performance testing is with 1 million rows and probably 100M CSV numbers. When we scale up past 4 million rows and 500M CSV numbers, performance drops due to disk reads. Full Text queries run REALLY FAST (<10 s) but take forever to index (6 hours on 4M rows).
galuvian