views:

598

answers:

4

Background: I have a table with 5 million address entries which I'd like to search for different fields (customer name, contact name, zip, city, phone, ...), up to 8 fields. The data is pretty stable, maximum 50 changes a day, so almost only read access.

The user isn't supposed to tell me in advance what he's searching for, and I also want support of combined search (AND-concatenation of search terms). For example "lincoln+lond" should search for all records containing both search terms in any of the search fields, also those entries starting with any of the terms (like "London" in this example).

Problem: Now I need to choose an indexing strategy for this search table. (As a side note: I'm trying to achieve sub-second response time, worst response time should be 2 seconds.) What's better in terms of perfomance:

  1. Do a combined index out of all queryable columns (would need 2 of them, as index limit of 900 bytes reached)
  2. Put single indexes on each of the queryable columns
  3. Make a fulltext index on the queryable columns and use fulltext query

I'm discarding point 1, as it doesn't seem to have any advantage (index usage will be limited and there will be no "index seek", because not all fields fit in one single index).

Question: Now, should I use the multiple single indexes variant or should I go with the fulltext index? Is there any other way to achieve the functionality mentioned above?

+1  A: 

Try them both and see which is faster on your system. There are few hard and fast rules for database optimizations, it really depends on your environment.

Jonathan Allen
A: 

I agree with Grauenwolf, and I'd like to add a note about indexes. Keep in mind that if you use a syntax like the following:

SELECT field1, field2, field3
FROM table
WHERE field1 LIKE '%value%

Then no index will be used anyway when searching on field1 and you have to resort to a full-text index. For the sake of completeness, the above syntax returns all rows where field1 contains value (not necessarily at the beginning). If you have to search for "contains", a full-text index is probably more appropriate.

Diego
+1  A: 

To answer my own question:

I've chosen the "multiple single indexes" option. I ended having an index for each of the queried columns, each index containing only the column itself. The search works very good with mostly subsecond response times. Sometimes it takes up to 2-3 seconds, but I'm attributing it to my database server (several years old laptop with 3GB Ram and slow disk).

I didn't test the fulltext option as it was not anymore necessary (and I don't have the time to do it.)

MicSim
+1  A: 

Originally, i was about to suggest going with FTS as that has a lot of strong performance features going for it. Especially when you dealing with varied queries. (eg. x AND y. x NEAR y, etc..).

But before I start to ramble on with the pro's of FTS, I just checked your server version -> sql2000.

poor thing. FTS was very simple back then, so stick with multiple single indexes.

We use Sql2008 and ... it rocks.

Oh, btw. did you know that Sql2008 (free edition) has FTS in it? Is it possible to upgrade?

Going from sql2000 -> sql2008 is very worth it, if you can.

But yeah, stick with your M.S.I. option.

Pure.Krome