tags:

views:

131

answers:

4

I have an SQL Server 2005 table that has a varchar(250) field which contains keywords that I will use for searching purposes. I can't change the design. The data looks like this...

Personal, Property, Cost, Endorsement

What is the most efficient way to run search queries against these keywords? The only thing I can think of is this...

WHERE Keywords LIKE '%endorse%'
+1  A: 

the most efficient way is to normalize your db design. never store CSV values into a single cell. other than using like you might consider full text search.

Mladen Prajdic
Worthless answer.
Josh Stodola
are you quite sure about it being worthless? i gave you the only other valid option you can use. patindex and all other stuff is just like "Like"
Mladen Prajdic
Outside of normalization (which isn't an option to OP), and PATINDEX, FTS is the other performant option. FTS is, honestly, probably the best option since it is going to be optimized for the volume of data you have and most likely will return better results in less time and with fewer I/O operations than PATINDEX.
Jeremiah Peschka
A: 

If you have the freedom, change the design to reflect a tag system. With a table for these 'tags' which links them to the respective entries.

tharkun
I can't change the design.
Josh Stodola
I'm curious. Even if I could change the design, what is this "tag" system? How is that designed?
Josh Stodola
Just a very simple step of design normalization. Don't store your tags in strings but store them one entry each in a 'tags' table (n:n).
tharkun
Then you can do JOIN queries, count how many times a certain tag is used, count the number of tags per main entry, etc.
tharkun
Can you add to the design without changing the existing? If you think that WHERE Keywords LIKE '%endorse%' is not efficient enough you could have a background process index these strings. E.g. have a cron job, cut these strings up into single units and store them in a tags table and reference them.
tharkun
Searching would definitely be faster that way.
tharkun
this would also remove the comma problem.
tharkun
but then again... you're in a different world
tharkun
I don't see how that normalizes anything. Just introduces a join. I still have repeating data... no?
Josh Stodola
want to normalize it properly, ok. make also a reference table. the tags table has every tag only once, the reference table links the main table with the respective tags. this table has just two columns (can have more) which are the the mainID and the tagID. you can set an index across the two and make it unique for proper integrity. no repeating data anymore.
tharkun
Still... assume I have a main table record that has 6 tags, I will have six of these records all with the same mainID, pointing to different tagID records... no? This just seems to complicate matters further!
Josh Stodola
in a way, yes. but you asked for alternatives and efficiency. proper normalization would definitely increase performance and possiblities. maybe it's not worth the hassle.
tharkun
A: 

You could use PATINDEX()

USE AdventureWorks;
GO
SELECT PATINDEX('%ensure%',DocumentSummary)
FROM Production.Document
WHERE DocumentID = 3;
GO

http://msdn.microsoft.com/en-us/library/ms188395%28SQL.90%29.aspx

Shane Cusson
Thanks. Do you think this is more efficient than LIKE?
Josh Stodola
According to the first comment here: http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/is-patindex-faster-than-like.aspx, Full Text indexing might be an even better way to go. Full Text: http://msdn.microsoft.com/en-us/library/ms142571.aspx
Shane Cusson
+4  A: 

Since normalization is not an option, the next best option is going to be to configure and use Full Text Search. This will maintain an internal search index that will make it very easy for you to search within your data.

The problem with solutions like LIKE '%pattern%' is that this will produce a full table scan (or maybe a full index scan) that could produce locks on a large amount of the data in your table, which will slow down any operations that hit the table in question.

Jeremiah Peschka
Looks like the right answer... but if FTI is not already enabled/configured/whatever in the system, would adding it in be as much a deal-breaker as modifying the table structure?
Philip Kelley
With SQL Server 2005 you will have to get a DBA to install the FTS feature, so depending on the situation this could present difficulties. Once FTS is installed, creating the indexes is simple. This could be an easier sell than normalization, though, because other applications that access the source data won't need to be changed.
Jeremiah Peschka
Thanks for the answer! How do I configure Full Text Search? Can you link me up, please?
Josh Stodola
Installation instructions can be found here: http://msdn.microsoft.com/en-us/library/ms142490(SQL.90).aspxAnd here's a big list of "how to" topics on FTS from Microsoft: http://msdn.microsoft.com/en-us/library/ms142503(SQL.90).aspx
Jeremiah Peschka
Thanks for the help, I think we will finally be installing this soon!!
Josh Stodola