views:

359

answers:

2

This seems like odd behaviour from SQL's full-text-index.

FTI stores number in its index with an "NN" prefix, so "123" is saved as "NN123".

Now when a user searches for words beginning with N (i.e. contains "n*" ) they also get all numbers.

So:

select [TextField]
from [MyTable] 
where contains([TextField], '"n*"')

Returns:

MyTable.TextField
--------------------------------------------------
This text contains the word navigator
This text is nice
This text only has 123, and shouldn't be returned

Is there a good way to exclude that last row? Is there a consistent workaround for this?


Those extra "" are needed to make the wildcard token work:

select [TextField] from [MyTable] where contains([TextField], 'n*')

Would search for literal n* - and there aren't any.

--return rows with the word text
select [TextField] from [MyTable] where contains([TextField], 'text')

--return rows with the word tex*
select [TextField] from [MyTable] where contains([TextField], 'tex*')

--return rows with words that begin tex...
select [TextField] from [MyTable] where contains([TextField], '"tex*"')
A: 

Does this make any difference:

select [TextField] from [MyTable] where contains([TextField], 'n*')

lomaxx
+1  A: 

There are a couple of ways to handle this, though neither is really all that great.

First, add a column to your table that says that TextField is really a number. If you could do that and filter, you would have the most performant version.

If that's not an option, then you will need to add a further filter. While I haven't extensively tested it, you could add the filter AND TextField NOT LIKE 'NN%[0-9]%'

The downside is that this would filter out 'NN12NOO' but that may be an edge case not represented by your data.

Josef