views:

500

answers:

1

Hi,

I'm looking for a good solution to use the containstable feature of the SQL Serve r2005 effectivly. Currently I have, e.g. an Employee and an Address table.

-Employee
Id
Name

-Address
Id
Street
City
EmployeeId

Now the user can enter search terms in only one textbox and I want this terms to be split and search with an "AND" operator. FREETEXTTABLE seems to work with "OR" automatically.

Now lets say the user entered "John Hamburg". This means he wants to find John in Hamburg. So this is "John AND Hamburg".

So the following will contain no results since CONTAINSTABLE checks every column for "John AND Hamburg".

So my question is: What is the best way to perform a fulltext search with AND operators across multiple columns/tables?

SELECT *
FROM Employee emp
    INNER JOIN 
        CONTAINSTABLE(Employee, *, '(JOHN  AND Hamburg)', 1000) AS keyTblSp
        ON sp.ServiceProviderId = keyTblSp.[KEY]    
    LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = emp.EmployeeId
UNION ALL
SELECT *
FROM Employee emp 
    LEFT OUTER JOIN [Address] addr ON addr.EmployeeId = emp.EmployeeId
    INNER JOIN 
        CONTAINSTABLE([Address], *, '(JOHN  AND Hamburg)', 1000) AS keyTblAddr
        ON addr.AddressId = keyTblAddr.[KEY]    

...
+1  A: 

This is more of a syntax problem. How do you divine the user's intent with just one input box?

  • Are they looking for "John Hamburg" the person?
  • Are they looking for "John Hamburg Street"?
  • Are they looking for "John" who lives on "Hamburg Street" in Springfield?
  • Are they looking for "John" who lives in the city of "Hamburg"?

Without knowing the user's intent, the best you can hope for is to OR the terms, and take the highest ranking hits.

Otherwise, you need to program in a ton of logic, depending on the number of words passed in:

2 words:

Search Employee data for term 1, Search Employee data for term 2, Search Address data for term 1, Search address data for term 2. Merge results by term, order by most hits.

3 words:

Search Employee data for term 1, Search Employee data for term 2, Search employee data for term 3, Search Address data for term 1, Search address data for term 2, Search address data for term 3. Merge results by term, order by most hits.

etc...

I guess I would redesign the GUI to separate the input into Name and Address, at a minimum. If that is not possible, enforce a syntax rule to the effect "First words will be considered a name until a comma appears, any words after that will be considered addresses"

EDIT:

Your best bet is still OR the terms, and take the highest ranking hits. Here's an example of that, and an example why this is not ideal without some pre-processing of the input to divine the user's intent:

insert into Employee (id, [name]) values (1, 'John Hamburg')
insert into Employee (id, [name]) values (2, 'John Smith')
insert into Employee (id, [name]) values (3, 'Bob Hamburg')
insert into Employee (id, [name]) values (4, 'Bob Smith')
insert into Employee (id, [name]) values (5, 'John Doe')

insert into Address (id, street, city, employeeid) values (1, 'Main St.', 'Springville', 1)
insert into Address (id, street, city, employeeid) values (2, 'Hamburg St.', 'Springville', 2)
insert into Address (id, street, city, employeeid) values (3, 'St. John Ave.', 'Springville', 3)
insert into Address (id, street, city, employeeid) values (4, '5th Ave.', 'Hamburg', 4)
insert into Address (id, street, city, employeeid) values (5, 'Oak Lane', 'Hamburg', 5)

Now since we don't know what keywords will apply to what table, we have to assume they could apply to either table, so we have to OR the terms against each table, UNION the results, Aggregate them, and compute the highest rank.

SELECT Id, [Name], Street, City, SUM([Rank])
FROM
(
    SELECT emp.Id, [Name], Street, City, [Rank]
    FROM Employee emp 
    JOIN [Address] addr ON emp.Id = addr.EmployeeId
    JOIN CONTAINSTABLE(Employee, *, 'JOHN OR Hamburg') AS keyTblEmp ON emp.Id = keyTblEmp.[KEY]

    UNION ALL

    SELECT emp.Id, [Name], Street, City, [Rank]
    FROM Employee emp 
    JOIN [Address] addr ON emp.Id = addr.EmployeeId
    JOIN CONTAINSTABLE([Address], *, 'JOHN OR Hamburg') AS keyTblAdd ON addr.Id = keyTblAdd.[KEY]   
) as tmp

GROUP BY Id, [Name], Street, City
ORDER BY SUM([Rank]) DESC

This is less than ideal, here's what you get for the example (in your case, you would have wanted John Doe from Hamburg to show up first):

Id       Name              Street            City           Rank
2        John Smith        Hamburg St.       Springville    112
3        Bob Hamburg       St. John Ave.     Springville    112
5        John Doe          Oak Lane          Hamburg        96
1        John Hamburg      Main St.          Springville    48
4        Bob Smith         5th Ave.          Hamburg        48

But that is the best you can do without parsing the input before submitting it to SQL to make a "best guess" at what the user wants.

GalacticJello
Thanks for your reply. I want to keep one input box because it should be easy for the user to search for something real quick like in Google. If the user searches for John Hamburg he wants to get results for John who e.g lives in HamburgJohn whos lastname might be Hamburg and who might live in Hamburg as wellBut no result forJust John who doesn't live in Hamburg.Or some other person that lives in Hamburg.The thing is, that later on more information will be searchable like E-Mails etc. So I need a containstable over all data at once with terms that are connected by AND.
Chris
Which gets back to syntax. If you don't know what table and column an input keyword should apply to, you can't create a "one-size-fits-all" AND statement without programming in a ton of logic. In your case above, how do you know to search the first keyword as a [Name], and the second keyword as a [City]? What if the user wanted the second keyword to be part of the name, or part of a street? Unless you have some syntax rules you haven't mentioned that says something to the effect "The first word will be a name, the second word will be a city"?
GalacticJello