views:

377

answers:

1

I have 3 tables that link up. Restaurants, Cuisines and Cuisine Type. A Restaurant can sell many cuisines of food (that's bad wording but you get idea?)

So I have Full text setup on Restaurant:Name, CityTown, Postcode and on CuisineType:Name

I have one searchbox on my home page and as the user types results are filtered to best match.

Here are a few examples: SearchText= "Royal D Ed" Now there is a row in Restaurant for a place named Royal Dynasty and the town is Edinburgh. But my top result back is for somewhere wher town starts with D aswell as name. This is not the best match.

I will show you my stored proc, prototype so deriveed names are a bit hackish.

ALTER PROCEDURE [dbo].[RestaurantsFullText]
    @searchText nvarchar(255) 
AS
SELECT 
    b.*, 
    COALESCE(akt2.[Rank],0) / 30 + 
    COALESCE(akt1.[Rank],0) / 30 + 
    COALESCE(akt.[Rank],0) / 30 + 
    COALESCE(bkt.[Rank],0)  as rankCount  
FROM 
    Restaurants b 
left JOIN Cuisines c on b.Id = c.RestaurantId 
left join CuisineType a 
ON c.CuisineId = a.id 

left JOIN 
containstable(Restaurants, Name, @searchText) bkt 
ON b.id = bkt.[Key] 

left JOIN containstable(CuisineType, Name, @searchText) akt 
ON a.id = akt.[Key]

left JOIN containstable(Restaurants, Postcode, @searchText) akt1 
ON b.id = akt1.[Key]

left JOIN containstable(Restaurants, citytown, @searchText) akt2 
ON b.id = akt2.[Key]

where 
    COALESCE(akt2.[Rank],0) / 30 + 
    COALESCE(akt1.[Rank],0) /30 + 
    COALESCE(akt.[Rank],0) / 30 + 
    COALESCE(bkt.[Rank],0)  > 5
ORDER BY 
    COALESCE(akt2.[Rank],0) / 30 + 
    COALESCE(akt1.[Rank],0) / 30 + 
    COALESCE(akt.[Rank],0) / 30 + 
    COALESCE(bkt.[Rank],0)   asc

I think the problem lies in the joins and the way rank is calculated.

I want it so that if I pass in "Royal Dynasty Edinburgh d" then Royal Dynasty is still the best match.

Since it is a filter then matches for towns in other cities starting with d should not be returned.

I would be extremely greatrful for help on this.

+1  A: 

Quick and easy first step

I would parse the string (if using SQL Server 2008 there is an inbuilt function remove the noise words if possible) and build up a boolean type search string

"royal" and "d*" and "ed*" that you could run over your existing query. If it doesn't return any results compound your data into a single view or table and FTS that rather than all the separate tables as it may be that ED isn't found on the Restaurant table and is therefore ignored by the and.

If that doesn't get you close enough. If it is live I would stop and look at the searches people are doing and the data they are looking at more closely work to improve those results. If that isn't possible; i would build that feature in before changing the search as knowing how people are using the software before you change it is the best place to start imo. There may be an interface way of improving things; e.g. be better to address the issue in the interface by offering people the Royal Dynasty as an option they type in royal d into the search box.

As I don't know how much people search for restaurant name, cuisine or location in your app this is quite general. Depending on what you see people are searching from your logs then I would go about things differently. In this sort of scenario i would build my search more dynamically rather than a single pass query as coded (though it will take more processing power per search so you may need to consider that.)

Consider either a deep search into the data or or thesaurus based word splitter that uses some of the knowledge of your data to prioritize results. Rather than pumping the whole string into the lookup and trying to find Royal D in your Cuisine list it would be better to find it in your Restaurants list first as a high match and then remove it from your search string; then search on all the matches from restaurants in location / cuisine against the reduced search string. That way the searches into cuisine and location data are based on restaurant matches (so should be quicker as the dataset is reduced) and you are searching with less data; the postcode string ED possibly.

Offering hints on the search results is a good thing; if search is very quick users often will search more than once to get better results so on the results screen offering Did you mean Edinburgh will again help you improve your searching.

u07ch
Thanks for the reply. I generate in my .net controller the searchtext to build the boolean string. I think maybe it would be best to search on name and strip out any words that match, then on town etc etc You can view the prototype at http://www.time2munch.com the most searched queries are normally of the type: "partial restaurant name" followed by "partial town or postcode" or "cuisine type" followed by "town or postcode"
dean nolan
had a quick play this morning; good luck :)
u07ch
Thanks, I have updated the design slightly (I'm no designer so doing little bits at a time ;) ) I think I will need to spend the next few days on other fatures as the search and filter by postcode are all doing my head in ha. I'm no good with SQL at all
dean nolan