views:

146

answers:

5

Hi,

I want to compare the individual words from the user input to individual words from a column in my table.

For example, consider these rows in my table:

ID Name
1  Jack Nicholson
2  Henry Jack Blueberry
3  Pontiac Riddleson Jack

Consider that the user's input is 'Pontiac Jack'. I want to assign weights/ranks for each match, so I can't use a blanket LIKE (WHERE Name LIKE @SearchString).

If Pontiac is present in any row, I want to award it 10 points. Each match for Jack gets another 10 points, etc. So row 3 would get 20 points, and rows 1 and 2 get 10.

I have split the user input into individual words, and stored them into a temporary table @SearchWords(Word).

But I can't figure out a way to have a SELECT statement that allows me to combine this. Maybe I'm going about this the wrong way?

Cheers, WT

A: 

What about this? (this is MySQL syntax, I think you only have to replace the CONCAT and do it with +)

SELECT names.id, count(searchwords.word) FROM names, searchwords WHERE names.name LIKE CONCAT('%', searchwords.word, '%') GROUP BY names.id

Then you would have a SQL result with the ID of the names-table and count of the words that match to that id.

K. Ober
+1  A: 

For SQL Server, try this:

SELECT Word, COUNT(Word) * 10 AS WordCount
FROM SourceTable
INNER JOIN SearchWords ON CHARINDEX(SearchWords.Word, SourceTable.Name) > 0
GROUP BY Word
LittleBobbyTables
Nice, elegant solution. I imagine the OP's table would have to have something linking the individual words back to the original search phrase - so getting the score for the entire phrase would be as simple as joining the phrase to this and summing the word count grouped by the whole phrase. Nice username btw... one of my favourite xkcds :)
Rob Cooney
A: 

You could do it via a common table expression that works out the weighting. For example:

--** Set up the example tables and data
DECLARE @Name TABLE (id INT IDENTITY, name VARCHAR(50));
DECLARE @SearchWords TABLE (word VARCHAR(50));

INSERT INTO @Name
        (name)
VALUES  ('Jack Nicholson')
       ,('Henry Jack Blueberry')
       ,('Pontiac Riddleson Jack')
       ,('Fred Bloggs');

INSERT INTO @SearchWords
        (word)
VALUES  ('Jack')
       ,('Pontiac');

--** Example SELECT with @Name selected and ordered by words in @SearchWords
WITH Order_CTE (weighting, id)
AS (
    SELECT COUNT(*) AS weighting
         , id 
      FROM @Name AS n
      JOIN @SearchWords AS sw
        ON n.name LIKE '%' + sw.word + '%' 
     GROUP BY id
)
SELECT n.name
     , cte.weighting
  FROM @Name AS n
  JOIN Order_CTE AS cte
    ON n.id = cte.id
 ORDER BY cte.weighting DESC;

Using this technique, you can also apply a value to each search word if you wanted to. So you could make Jack more valueable than Pontiac. This would look something like this:

--** Set up the example tables and data
DECLARE @Name TABLE (id INT IDENTITY, name VARCHAR(50));
DECLARE @SearchWords TABLE (word VARCHAR(50), value INT);

INSERT INTO @Name
        (name)
VALUES  ('Jack Nicholson')
       ,('Henry Jack Blueberry')
       ,('Pontiac Riddleson Jack')
       ,('Fred Bloggs');

--** Set up search words with associated value
INSERT INTO @SearchWords
        (word, value)
VALUES  ('Jack',10)
       ,('Pontiac',20)
       ,('Bloggs',40);


--** Example SELECT with @Name selected and ordered by words and values in @SearchWords
WITH Order_CTE (weighting, id)
AS (
    SELECT SUM(sw.value) AS weighting
         , id 
      FROM @Name AS n
      JOIN @SearchWords AS sw
        ON n.name LIKE '%' + sw.word + '%' 
     GROUP BY id
)
SELECT n.name
     , cte.weighting
  FROM @Name AS n
  JOIN Order_CTE AS cte
    ON n.id = cte.id
 ORDER BY cte.weighting DESC;      
JonPayne
A: 

Seems to me that the best thing to do would be to maintain a separate table with all the individual words. Eg:

ID     Word       FK_ID
1      Jack       1
2      Nicholson  1
3      Henry      2
(etc)

This table would be kept up to date with triggers, and you'd have a non-clustered index on 'Word', 'FK_ID'. Then the SQL to produce your weightings would be simple and efficient.

Yellowfog
A: 

How about something like this....

Select id, MAX(names.name), count(id)*10 from names
inner join @SearchWords as sw on 
    names.name like '%'+sw.word+'%'
group by id 

assuming that table with names called "names".

Borik