First of all, I know:

Premature optimization is the root of all evil

But I think a badly implemented autocomplete can really blow up your site.

I would like to know if there are any libraries out there which can do autocomplete efficiently (server-side), preferably ones that fit into RAM (for best performance). So no browser-side JavaScript autocomplete (YUI/jQuery/Dojo); there are enough topics about those on Stack Overflow already. But I could not find a good thread about server-side autocomplete (maybe I did not look hard enough).

For example, autocompleting names:

names:[alfred, miathe, .., ..]

What I can think of:

  • A simple SQL LIKE query, for example: SELECT name FROM users WHERE name LIKE 'al%'.
    • I think this implementation will blow up with a lot of simultaneous users or a large data set, but maybe I am wrong, so numbers (what load it can handle) would be cool.
  • Something like Solr's terms component, for example: http://localhost:8983/solr/terms?terms.fl=name&terms.sort=index&terms.prefix=al&wt=json&omitHeader=true.
    • I don't know the performance of this, so users with big sites, please tell me.
  • Maybe something like an in-memory Redis trie, whose performance I also haven't tested (see the sketch after this list).
  • I also read in this thread about how to implement this in Java (Lucene and a library created by Shilad).
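To make the Redis option concrete, here is a minimal sketch of the sorted-set prefix-scan approach using redis-py and ZRANGEBYLEX (the key name and sample data are assumptions, every member gets score 0 so the set orders lexicographically, and I have not load-tested this either):

    # Sketch: Redis prefix autocomplete with one sorted set held in RAM.
    # All members share score 0, so ZRANGEBYLEX can return a prefix range;
    # the key name "names:autocomplete" is an assumption.
    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    r.zadd("names:autocomplete",
           {name: 0 for name in ["alfred", "alan", "miathe"]})

    def autocomplete(prefix, limit=10):
        # '[' marks an inclusive bound; appending '\xff' closes the range
        # so only members starting with the prefix are returned.
        return r.zrangebylex("names:autocomplete",
                             "[" + prefix, "[" + prefix + "\xff",
                             start=0, num=limit)

    print(autocomplete("al"))  # ['alan', 'alfred']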

What I would like to hear about are implementations used by real sites and numbers on how well they handle load, preferably with:

  • A link to the implementation or code.
  • Numbers for the scale you know it can handle.
  • It would be nice if it could be accessed over HTTP or sockets.

Many thanks,
Alfred

+4  A: 

Optimising for Auto-complete

Unfortunately, the resolution of this issue will depend heavily on the data you are hoping to query.

LIKE queries will not put too much strain on your database, as long as you spend time using 'EXPLAIN' or the profiler to show you how the query optimiser plans to perform your query.

Some basics to keep in mind:

  • Indexes: Ensure that you have indexes set up; see the sketch after this list. (Yes, in many cases LIKE does use indexes. There is an excellent article on the topic at myitforum: SQL Performance - Indexes and the LIKE clause.)

  • Joins: Ensure your JOINs are in place and optimized by the query planner. SQL Server Profiler can help with this. Look out for full index or full table scans.
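As an illustration of checking the plan (a sketch only; it uses SQLite for portability where the answer mentions SQL Server Profiler, and the users/name schema is assumed):

    # Sketch: create an index, then ask the planner whether a LIKE prefix
    # query will use it. Whether LIKE can use an index depends on the
    # engine's collation and case-sensitivity rules, which is exactly why
    # EXPLAIN beats guessing. Schema and data are assumptions.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (name TEXT COLLATE NOCASE)")
    con.execute("CREATE INDEX idx_users_name ON users (name)")
    con.executemany("INSERT INTO users VALUES (?)",
                    [("alfred",), ("alan",), ("miathe",)])

    for row in con.execute("EXPLAIN QUERY PLAN "
                           "SELECT name FROM users WHERE name LIKE 'al%'"):
        print(row)  # expect a SEARCH using idx_users_name, not a full SCAN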

Auto-complete sub-sets

Auto-complete queries are a special case, in that they usually work as ever-decreasing subsets.

  • 'name' LIKE 'a%' (may return 10000 records)
  • 'name' LIKE 'al%' (may return 500 records)
  • 'name' LIKE 'ala%' (may return 75 records)
  • 'name' LIKE 'alan%' (may return 20 records)

If you return the entire result set for query 1, there is no need to hit the database again for the following queries, as their results are subsets of your original result set.
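A sketch of that idea (the names are hypothetical, and it assumes the first query really did return the complete result set):

    # Sketch of the narrowing-subset trick: one database hit for the first
    # prefix, then filter the cached rows locally as the prefix grows.
    # fetch_from_db stands in for the real LIKE query and is assumed to
    # return ALL matches, not a truncated page.
    def fetch_from_db(prefix):
        names = ["alan", "alfred", "miathe"]          # stand-in data
        return [n for n in names if n.startswith(prefix)]

    class SubsetCompleter:
        def __init__(self):
            self.prefix = None   # prefix the cache was fetched for
            self.rows = []       # complete result set for that prefix

        def complete(self, term):
            if self.prefix is None or not term.startswith(self.prefix):
                self.prefix, self.rows = term, fetch_from_db(term)  # one DB hit
            # Every longer term is a subset of the cached rows.
            return [r for r in self.rows if r.startswith(term)]

    c = SubsetCompleter()
    c.complete("a"); c.complete("al")  # the second call never touches the DB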

Depending on your data, this may open a further opportunity for optimisation.

Jon Winstanley
Thanks for your answer :)
Alfred
Right now I think this is the best answer for me, but I hope other people will share their experience.
Alfred
+5  A: 

I will not comply with your requirements: the numbers for scale will obviously depend on hardware, size of the DB, architecture of the app, and several other factors. You must test it yourself.

But I will tell you the method I've used with success:

  • Use a simple SQL query, for example SELECT TOP 100 name FROM users WHERE name LIKE 'al%', using TOP 100 to limit the number of results.
  • Cache the results and maintain a list of the terms that are cached.
  • When a new request comes in, first check the list to see whether you have the term (or a prefix of it) cached.
  • Keep in mind that your cached results are limited to 100 rows, so you may still need to do a SQL query when the term remains valid at the end of the cached result (I mean: if the last cached row still matches the term, more matches may have been cut off by the limit). See the sketch after this list.
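A sketch of that cache logic (the limit of 100, the helper names, and the stand-in data are assumptions):

    # Sketch of the cache-with-limit idea. query_db stands in for
    # SELECT TOP 100 name FROM users WHERE name LIKE '<term>%' ORDER BY name.
    LIMIT = 100
    NAMES = sorted(["alan", "alfred", "miathe"])    # stand-in data
    cache = {}                                      # term -> sorted rows

    def query_db(term):
        return [n for n in NAMES if n.startswith(term)][:LIMIT]

    def autocomplete(term):
        for prefix, rows in cache.items():
            if term.startswith(prefix):
                # The cached entry is usable if it never hit the limit, or
                # if its last (alphabetically latest) row no longer matches
                # the term, so no matching rows were cut off.
                if len(rows) < LIMIT or not rows[-1].startswith(term):
                    return [r for r in rows if r.startswith(term)]
                break  # truncated and still valid at the end: re-query
        cache[term] = query_db(term)
        return cache[term]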

Hope it helps.

Eduardo Molteni
Thanks it helps :-)
Alfred
Some other random advice, not worthy of a full-fledged answer: index the search column; start autocompletion at the second or third character; beware of flushing the cache when the data set changes, or update it somehow; if the output list is big enough, consider compressing it; if you aim for a really big site, consider a caching reverse proxy for this kind of semi-static content (and remember to set the response header regarding cache timeout to something sane; see the sketch below).
Lorenzo Boccaccia
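A sketch of that last point from the comment above (the 60-second lifetime and the bare WSGI setup are assumptions; any framework can set the same header):

    # Sketch: mark the autocomplete response cacheable so a reverse proxy
    # (Varnish, Squid, nginx) can serve repeated prefixes without hitting
    # the application. The 60-second max-age is an assumed "sane" value.
    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            ("Cache-Control", "public, max-age=60"),
        ])
        return [b'["alan", "alfred"]']   # stand-in autocomplete payload

    make_server("", 8000, app).serve_forever()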
+1  A: 

Using SQL versus Solr's terms component is really not a comparison. At their core they solve the problem the same way by making an index and then making simple calls to it.

What I would want to know is: what are you trying to auto-complete?

Ultimately, the easiest and most surefire way to scale a system is to build a simple solution and then scale it by replicating data. Trying to cache calls or predict results just makes things complicated and doesn't get to the root of the problem (i.e. it only takes you so far; consider the case where every request misses the cache).

Perhaps a little more info about how your data is structured and how you want to see it extracted would be helpful.

mlathe
Say, for example, autocompleting names: [alfred, miathe, .., ..]. What is your favorite solution?
Alfred
If you have only a few hundred records, why not just do it on the client side: load them all and filter with JS. If you are talking about millions of records, then using something like LIKE on a DB is probably the easiest way to get something working that will scale. But you are paying for a lot of functionality in the DB that you aren't using.
mlathe