views:

3381

answers:

13

Note: I am using SQL's Full-text search capabilities, CONTAINS clauses and all - the * is the wildcard in full-text, % is for LIKE clauses only.

I've read in several places now that "leading wildcard" searches (e.g. using "*overflow" to match "stackoverflow") is not supported in MS SQL. I'm considering using a CLR function to add regex matching, but I'm curious to see what other solutions people might have.

More Info: You can add the asterisk only at the end of the word or phrase. - along with my empirical experience: When matching "myvalue", "my*" works, but "(asterisk)value" returns no match, when doing a query as simple as:

SELECT * FROM TABLENAME WHERE CONTAINS(TextColumn, '"*searchterm"');

Thus, my need for a workaround. I'm only using search in my site on an actual search page - so it needs to work basically the same way that Google works (in the eyes on a Joe Sixpack-type user). Not nearly as complicated, but this sort of match really shouldn't fail.

+2  A: 

Greg,

The wildcard character in MS SQL is the % sign and it works just fine, leading, trailing or otherwise.

That said, if you're going to be doing any kind of serious full text searching then I'd consider utilising the Full Text Index capabilities. Using % and _ wild cards will cause your database to take a serious perf hit.

Hope this helps.

Kev
+1  A: 
% Matches any number of characters
_ Matches a single character

I've never used Full-Text indexing but you can accomplish rather complex and fast search queries with simply using the build in T-SQL string functions.

GateKiller
+2  A: 

One thing worth keeping in mind is that leading wildcard queries come at a significant performance premium, compared to other wildcard usages.

Dave Ward
+1  A: 

The problem with leading Wildcards: They cannot be indexed, hence you're doing a full table scan.

Michael Stum
A: 

Using the '%' character I've searched our database using something like the following:

SELECT name FROM TblNames WHERE name LIKE '%overflow'

Using this form or query can be slow at times but we only use it for the occasional manual search.

Swinders
A: 

OP should have been clearer about using FTI :)

Can you cite where you read about the leading * problem in the CONTAINS clause. I've done a reasonable amount of FTI and don't remember it being an issue, or at least it was never reported.

Kev
+5  A: 
Otto
I cannot reproduce this in SQL 2005. Using a * at the front of the search string as shown results in no rows returned.
gregmac
+1  A: 

When it comes to full-text searching, for my money nothing beats Lucene. There is a .Net port available that is compatible with indexes created with the Java version.

There's a little work involved in that you have to create/maintain the indexes, but the search speed is fantastic and you can create all sorts of interesting queries. Even indexing speed is pretty good - we just completely rebuild our indexes once a day and don't worry about updating them.

As an example, this search functionality is powered by Lucene.Net.

Sean Carpenter
+1  A: 

These are some good links for all things FTS.

http://ewbi.blogs.com/develops/ http://www.ewbi.com/ewbi.develop/samples/FullTextSearch.v1.txt

Cheers John

solrev
+1  A: 

Just FYI, Google does not do any substring searches or truncation, right or left. They have a wildcard character * to find unknown words in a phrase, but not a word.

Google, along with most full-text search engines, sets up an inverted index based on the alphabetical order of words, with links to their source documents. Binary search is wicked fast, even for huge indexes. But it's really really hard to do a left-truncation in this case, because it loses the advantage of the index.

+1  A: 

Workaround only for leading wildcard: - store the text reversed in a different field (or in materialised view) - create a full text index on this column - find the reversed text with an *

SELECT * FROM TABLENAME WHERE CONTAINS(TextColumnREV, '"mrethcraes*"');

Of course there are many drawbacks, just for quick workaround...

Not to mention CONTAINSTABLE...

xnagyg
A: 

It is possible to use the wildcard "*" at the end of the word or phrase (prefix search).

For example, this query will find all "datab", "database", "databases" ...

SELECT * FROM SomeTable WHERE CONTAINS(ColumnName, '"datab*"')

But, unforutnately, it is not possible to search with leading wildcard.

For example, this query will not find "database"

SELECT * FROM SomeTable WHERE CONTAINS(ColumnName, '"*abase"')
A: 

So what is the solution for following


But, unforutnately, it is not possible to search with leading wildcard.

For example, this query will not find "database"

SELECT * FROM SomeTable WHERE CONTAINS(ColumnName, '"*abase"')


Shah