views:

430

answers:

11

Hey guys,

i'm running into a strange problem in Microsoft SQL Server 2008. I have a large database (20 GB) with about 10 tables and i'm attempting to make a point regarding how to correctly create indexes.

Here's my problem: on some nested queries i'm getting faster results without using indexes! It's close (one or two seconds), but in some cases using no indexes at all seems to make these queries run faster... I'm running a Checkpoiunt and a DBCC dropcleanbuffers to reset the caches before running the scripts, so I'm kinda lost.

What could be causing this? I know for a fact that the indexes are poorly constructed (think one index per relevant field), the whole point is to prove the importance of constructing them correctly, but it should never be slower than having no indexes at all, right?

EDIT: here's one of the guilty queries:

SET STATISTICS TIME ON
SET STATISTICS IO ON

USE DBX;
GO
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC FREEPROCCACHE;
GO

SELECT * FROM Identifier where CarId in (SELECT CarID from Car where ManufactId = 14) and DataTypeId = 1

Identifier table: - IdentifierId int not null - CarId int not null - DataTypeId int not null - Alias nvarchar(300)

Car table: - CarId int not null - ManufactId int not null - (several fields followed, all nvarchar(100)

Each of these bullet points has an index, along with some indexes that simultaneously store two of them at a time (e.g. CarId and DataTypeId).

Finally, The identifier table has over million entries, while the Car table has two or three million

A: 

Check the execution plan to see if it is using one of these indexes that you "know" to be bad?

Generally, indexing slows down writing data and can help to speed up reading data.

So yes, I agree with you. It should never be slower than having no indexes at all.

Robin Day
+1  A: 

Normally SQL Server does a good job at deciding what index to use if any to retrieve the data in the fastest way. Quite often it will decide not to use any indexes as it can retrieve small amounts of data from small tables quicker without going away to the index (in some situations).

It sounds like in your case SQL may not be taking the most optimum route. Having lots of badly created indexes may be causing it to pick the wrong routes to get to the data.

I would suggest viewing the query plan in management studio to check what indexes its using, and where the time is being taken. This should give you a good idea where to start.

Another note is it maybe that these indexes have gotten fragmented over time and are now not performing to their best, it maybe worth checking this and rebuilding some of them if needed.

Gavin Draper
+1  A: 

Indexes don't really have any benefit until you have many records. I say many because I don't really know what that tipping over point is...It depends on the specific application and circumstances.

It does take time for the SQL Server to work with an index. If that time exceeds the benefit...This would especially be true in subqueries, where a small difference would be multiplied.

If it works better without the index, leave out the index.

Robert Harvey
For SQL 2000 it was around 134 rows. It isn't that many.
gbn
+1  A: 

Try DBCC FREEPROCCACHE to clear the execution plan cache as well.

gbn
@gbn+1: Most sensible comment so far. A difference of 1 second could be the compilation cost of the initial query :-) I would also suggest that you (SET STATISTICS IO ON) on to review the number of logical and physical reads and also (SET STATISTICS TIME ON), to accurately monitor time.
John Sansom
Added to the query, just waiting for the recovery.... damn, lousy timing :S
Hal
+1  A: 

This is an empty guess. Maybe if you have a lot of indexes, SQL Server is spending time on analyzing and picking one, and then rejecting all of them. If you had no indexes, the engine wouldn't have to waste it's time with this vetting process.

How long this vetting process actually takes, I have no idea.

Mark Canlas
+1  A: 

For some queries, it is faster to read directly from the table (clustered index scan), than it is to read the index and fetch records from the table (index scan + bookmark lookup).

Consider that a record lives along with other records in a datapage. Datapage is the basic unit of IO. If the table is read directly, you could get 10 records for the cost of 1 IO. If the index is read directly, and then records are fetched from the table, you must pay 1 IO per record.

Generally SQL server is very good at picking the best way to access a table (direct vs index). There may be something in your query that is blinding the optimizer. Query hints can instruct the optimizer to use an index when it is wrong to do so. Join hints can alter the order or method of access of a table. Table Variables are considered to have 0 records by the optimizer, so if you have a large Table Variable - the optimizer may choose a bad plan.

One more thing to look out for - varchar vs nvarchar. Make sure all parameters are of the same type as the target columns. There's a case where SQL Server will convert the whole index to the parameter's type in the event of a type mismatch.

David B
Hmmm I see, although i'm not using neither different types of columns nor Table Variables. Thanks for the insight though
Hal
A: 

SQL server actually makes some indexes for you (e.g. on primary key).

Indexes can become fragmented.

Too many indexes will always reduce performance (there are FAQs on why not to index every col in the db)

also there are some situations where indexes will always be slower.

Colin Pickard
Killed the primary keys and the indexes were "freshly built" :S
Hal
A: 

run:

SET SHOWPLAN_ALL ON

and then run your query with and without the index usage, this will let you see what index if any are being used, where the "work" is going on etc.

KM
+3  A: 

My guess would be that SQL Server is incorrectly deciding to use an index, which is then forcing a bookmark lookup*. Usually when this happens (the incorrect use of an index) it's because the statistics on the table are incorrect.

This can especially happen if you've just loaded large amounts of data into one or more of the tables. Or, it could be that SQL Server is just screwing up. It's pretty rare that this happens (I can count on one hand the times I've had to force index use over a 15 year career with SQL Server), but the optimizer is not perfect.

* A bookmark lookup is when SQL Server finds a row that it needs on an index, but then has to go to the actual data pages to retrieve additional columns that are not in the index. If your result set returns a lot of rows this can be costly and clustered index scans can result in better performance.

One way to get rid of bookmark lookups is to use covering indexes - an index which has the filtering columns first, but then also includes any other columns which you would need in the "covered" query. For example:

SELECT
     my_string1,
     my_string2
FROM
     My_Table
WHERE
     my_date > '2000-01-01'

covering index would be (my_date, my_string1, my_string2)

Tom H.
Thought of that, although no insertion operations were performed since the creation of the indexes
Hal
Regarding covered indexes, if i search for my_string1 and my_string2 can that covered index provide the answer?
Hal
It will sometimes use the index, although with my_string1 and my_string2 not being at the beginning of the index it will have to be an index scan. Imagine looking in the phone book for people who have "a" as the SECOND letter of their last name. It's faster to jump to each part which might have that ("aa", "ba", etc.) rather than scan the whole phone book, but it's not as fast as looking for names by first letter.
Tom H.
A: 

No Sql Server analyzes both the indexes and the statistics before deciding to use an index to speed up a query. It is entirely possible that running a non-indexed version is faster than an indexed version.

A few things to try

  1. ensure the indexes are created and rebuilt, and re-organized (defragmented).

  2. ensure that the auto create statistics is turned on.

  3. Try using Sql Profiler to capture a tuning profile and then using the Database Engine Tuning Advisor to create your indexes.

Surprisingly the MS Press Examination book for Sql administration explains indexes and statistics pretty well.

See Chapter 4 table of contents in this amazon reader preview of the book

Amazon Reader of Sql 2008 MCTS Exam Book

Peter
A: 

To me it sounds like your sql is written very poorly and thus not utilizing the indexes that you are creating.

you can add indexes till you're blue in the face but if your queries aren't optimized to use those indexes then you won't get any performance gain.

give us a sample of the queries you're using.

alright...

try this and see if you get any performance gains (with the pk indexes)

SELECT i.* 
FROM Identifier i 
    inner join Car c
     on i.CarID=c.CarID
where c.ManufactId = 14 and i.DataTypeId = 1
DForck42
Done, check it out
Hal
Will do, still waiting for that recovery... Gah
Hal