views:

161

answers:

3

Basically I have a table like this:

CREATE TABLE Person(
    PersonID int IDENTITY(1,1) NOT NULL,
    FirstName nvarchar(512) NOT NULL,
    LastName nvarchar(512) NULL
)

And I need to find the top n results based on a user-query like this:

"Joh Smi"

The following query returns the results I need (I think). Just not in the relevant order.

SELECT
    PersonID, FirstName, LastName
FROM
    Person
WHERE
    FirstName LIKE 'Joh%' OR
    LastName LIKE 'Joh%' OR
    FirstName LIKE 'Smi%' OR
    LastName LIKE 'Smi%'

If the following names were in the database and our user-query was "Joh Smi" the names should appear in the following order (or similar)

  1. John Smith
  2. Johnny Smith
  3. John Jacob
  4. David Smithsonian
  5. Daniel Johnson

I'm hoping to get it to work similar to facebook's autocomplete friend-search.

So, how do I return the top n most relevant rows in SQL Server 2008?

+1  A: 

You need first to decide what you mean by relevance. Here is my list of relevance

  1. Both words are exactly first and last name
  2. Same as first but inverse order
  3. One is full word, one is substring
  4. Same as 3rd
  5. Both are substrings. Inside this you can rate by length of substring (more means good)
  6. One is substring

So i would recommend you to create function that will take FirstName and LastName and input string and return int, or float accoring to rules, then sort by it.

Andrey
Love the list. But what would the SQL query/function look like? :-)
David Murdoch
hey, it is almost 2 am here, i want to sleep :)
Andrey
haha. ok ok... I'll give you a break.
David Murdoch
+3  A: 

I recommend implementing Full Text Search (FTS) for two reasons:

  1. Simplify the query (shouldn't need the ORs, which'll perform poorly)
  2. Utilize the FTS ranking functionality - see this article
OMG Ponies
+1  A: 

As an addendum to OMG Ponies answer...

To get the best results from a full text search you might want to create an indexed view that concatenates the first and last name fields.

This will allow you to weight individual parts of the full name more precisely.

Example code as follows:

    CREATE VIEW [dbo].[vFullname] WITH SCHEMABINDING AS
    SELECT personID, FirstName + ' ' + LastName AS name 
    FROM dbo.person 

    WITH ranks AS(
    SELECT FT_TBL.personid 
        ,FT_TBL.name
        ,KEY_TBL.RANK
    FROM dbo.vfullname AS FT_TBL 
        INNER JOIN CONTAINSTABLE(vfullname, (name),
    'ISABOUT ("Smith" WEIGHT (0.4), "Smi*" WEIGHT (0.2),
"John" WEIGHT (0.3), "Joh*" WEIGHT (0.1))') AS KEY_TBL  
          ON FT_TBL.personid = KEY_TBL.[KEY]
    ) 
    SELECT
    r.personID,
    p.firstname,
    p.lastname,
    r.rank
    FROM ranks r INNER JOIN
    person p ON r.personID = p.personID
    ORDER BY rank DESC;

The CTE just allows you to return the individual firstname and lastname fields. If you don't need these as an output then ignore it.

melkisadek