views:

117

answers:

4

Hello!

I am creating a small search function on my site, this enables the search for articles in the system. Each article has a set of keywords associated to it and these keywords are stored inside a SQL Server database.

This is the table:

CREATE TABLE [dbo].[SearchWords] (
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [ArticleID] [int] NOT NULL,
    [SearchWord] [nvarchar](20) NOT NULL,
    CONSTRAINT [PK_SearchWords] PRIMARY KEY CLUSTERED 
        ([ID] ASC) 
        WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE = OFF, 
              IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
              ALLOW_PAGE_LOCKS  = ON)
        ON [PRIMARY]
) ON [PRIMARY]

Each article can have an unlimited amount of keywords. Now my problem is the search itself. When for example the user types:

France actors

I want the system to find all articles (just once DISTINCT) with the keywords France and actors. I am passing the search criteria as a varchar (separated with space) to the stored procedure. Then i am splitting the words up with the following function: (Erland Sommarskog) http://www.sommarskog.se/arrays-in-sql-2005.html#iter-list-of-strings

How do i then match the criteria words against the search words and only get the distinct article ids?

I am working with something like this, just that i cant understand how to match all the keywords. This method works if just one keyword is entered. If the user enters multiple, then it doesn't return anything even if the article has all involved keywords.

declare @temp nvarchar(50)
set @temp = 'France actors'

SELECT DISTINCT Article.ArticleID 
FROM Article
INNER JOIN SearchWords 
    ON Article.ArticleID = SearchWords.ArticleID
JOIN iter_charlist_to_tbl(@temp, DEFAULT) s 
    ON SearchWords.SearchWord = s.nstr

Any ideas?

A: 

Try something like this...

SET @Temp = ','+replace(@temp,' ',',')+','

SELECT DISTINCT article.ArticleID
FROM article
JOIN SearchWords ON Article.ArticleID=SearchWords.ArticleID
WHERE CharIndex(','+SearchWords.SearchWord+',',@temp) > 0

I am not sure how you are separating your input text into separate words, so you might need to change the delimiter from a comma, or do a little more code to build the comma delimited string, but once built, the join and where clause above should do the trick...

Sparky
+2  A: 

You can do this using group by, and then demanding that the number of keywords found equals the total number of keywords:

SELECT SearchWords.ArticleID 
FROM SearchWords 
INNER JOIN iter_charlist_to_tbl(@temp, DEFAULT) s 
    ON SearchWords.SearchWord = s.nstr
GROUP BY SearchWords.ArticleID
HAVING COUNT(*) = (
    select count(*) from iter_charlist_to_tbl(@temp, DEFAULT)
)

By the way, you don't need a join on Articles if you're just looking for ArticleID, so I removed that table.

Andomar
+2  A: 

First off, your linked UDF's default delimiter is the , character, not space. So, using the default delimiter as you are you're getting a single row back with both words in it. (Debugging hint: when something isn't working right, take it apart. In this case, you should have done a select * from UDF(@temp, DEFAULT) to see if the table looked correct.)

Assuming you want to keep using that UDF and you want articles that match any of the search terms (but not necessarily all), something along these lines should be correct:

declare @temp nvarchar(50)
set @temp = 'France actors'

SELECT DISTINCT 
  a.ArticleID 
FROM 
  Article a
  JOIN SearchWords sw ON a.ArticleID = sw.ArticleID
WHERE
  exists (
    select 
      1 
    from 
      iter_charlist_to_tbl(@temp, ' ') s 
    where
      s.nstr = sw.SearchWord
  )

Your inner join method probably should work as well if you change the parameters to the UDF.

Donnie
How would this ensure you match both `France` and `actors`?
Andomar
Ah, right. My error for skimming over the question too fast. Started to rewrite it, but I get the same solution as yours above, and there's no reason to post that twice.
Donnie
I changed your code a little to also return the number of keywords that matched. That way i can display the result that yielded the highest keyword match first.Is there a way to capture the number of records that this yields? I am combining this with a C# SQL-to-Linq class that returns the records. However, i want it also to return an OUTPUT int with the number of records that matches. I need to present the total number of rows found to the user and also benefit from using the .Skip(x) number of rows.
Patrick
Number of records returned is usually a client-side statistic, gotten by checking the appropriate property of the result set
Donnie
+3  A: 

SQL server has Full Text Seaching you can use. instead of crating a search function by yourself, you can use CONTAINs and CONTAINSTABLE or FREETEXT and FREETEXTTable

http://stackoverflow.com/questions/1597138/way-of-searching-30-000-sql-records/1597159#1597159

Henry Gao
Full text is good too because you can also use it match words that are related or sound like your word, so "drive" can also match "drives" and such - if you want to.
eidylon
The problem is that i am not permitted to create full text tables ...
Patrick
Not permitted? Shame, as i think it'll be hard to create a faster custom solution than with using FTS.
Bart Janson