views: 49
answers: 3
Hi,

I've been working all day on optimizing a SQL Server 2000 db table with ~9 million rows. My only db experience has been with tables of a few hundred rows, so I've never really had to deal with optimization.

I am doing selects and updates based on a 21-digit number.

Using an indexed char(21) type, the queries take more than 2 seconds and the SQL Server process takes 1.5 GB of RAM and 100% CPU.

With an indexed bigint type, my queries take a few milliseconds and the process takes ~100 MB of RAM.

I just want to understand what is happening here: is this normal, or is there a specific way a char type can be indexed for better performance?

Here's some of my SQL:

CREATE TABLE data
(
    tableID int NOT NULL IDENTITY PRIMARY KEY,  -- surrogate key
    tag char(21) NOT NULL UNIQUE,               -- the 21-digit value used for lookups
    dataColumn1 datetime NULL,
    dataColumn2 char(8) NULL,
    lastModified datetime NOT NULL
)

Parameterized query from C#:

SELECT tag FROM data WHERE tag = @tag;

Thanks for any help.

+1  A: 

This really isn't unusual; SQL Server handles numbers much better than characters. A bigint field uses 8 bytes, which fits neatly into a memory page. A char(21) field takes 21 bytes, which almost triples the amount of storage needed to put it in an index.

Another consideration: is either index clustered? A clustered index will perform much faster than a non-clustered index. There are a lot of additional factors to consider beyond the simple, general statement that numbers will perform better and use less space in the index.
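
If you wanted to try the clustered route, here's a minimal sketch, assuming the PRIMARY KEY wasn't already created as the clustered index (a table only gets one clustered index, and the index name here is made up):

-- Sketch only: this fails if another clustered index already exists on the table.
CREATE UNIQUE CLUSTERED INDEX IX_data_tag ON data (tag)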

Sparky
Makes sense that it would be slower; I expected it to be only slightly slower, though. I have a primary key index as well as an index on the column in question. Both are non-clustered since I only select single rows. My understanding is that clustered indexes improve performance on range queries only. Since I have a lot of updates as well, I think that a clustered index would take a large disk I/O performance hit.
Matt
A: 

My money's on the possibility that you're trying to query on some function of the char column, something like:

SELECT Column
FROM Table
WHERE UPPER(CharColumn) = 'ABC'

Obviously I'm just guessing, but this is a common source of performance issues with char/varchar/nvarchar columns. If you're writing queries like this, you have to realize that wrapping the column in a function prevents SQL Server from performing an index seek.
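
For contrast, a minimal sargable sketch of the same query (assuming the default case-insensitive collation, which makes the UPPER() unnecessary in the first place):

-- Sargable sketch: the bare column lets the optimizer use an index seek.
SELECT Column
FROM Table
WHERE CharColumn = 'ABC'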

Also take into account the number of rows you're actually returning; 9 million rows isn't necessarily a lot to have in the table, but even 100,000 rows is a huge amount to have in a result set. Sometimes the bottleneck is just streaming all the results.
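
If a result set really is the bottleneck and you only need a sample of it, SQL Server 2000 does support TOP; a hypothetical sketch against the posted schema:

-- Sketch only: cap the rows streamed back instead of returning everything.
SELECT TOP 1000 tag FROM data WHERE lastModified > '20090101'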

If none of this describes your problem, then I suggest updating your post with some information about the schema, the indexes you have, and the query that's running slow. There are further optimizations you can do, although SQL Server 2000 is getting on in years, so your options are limited.


Based on the schema you posted, and assuming you have an index on the tag column, that you are only selecting tag (so nothing falls outside the index's coverage), that your WHERE condition is a simple equality on tag, and that the number of results returned is reasonably small, I don't think there are any further optimizations you can do. You've pretty much reduced this to the simplest possible case.
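
(For reference, a hypothetical covering-index sketch for a wider query: SQL Server 2000 has no INCLUDE clause, so covering a query that also selected lastModified would mean putting that column in the index key. The index name is made up.)

-- Hypothetical: covers SELECT tag, lastModified FROM data WHERE tag = @tag
CREATE INDEX IX_data_tag_lastModified ON data (tag, lastModified)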

Aaronaught
I am selecting and updating single rows directly on the column without using any function. I have two non-clustered indexes on the table in question: one on the primary key and one plain index on the 21-digit column.
Matt
More specifics would still be helpful. I can't tell whether or not the index is a covering index, and sometimes having a lot of indexes (or large ones) can slow down update operations quite a bit. I think I could give you a better answer with specific info about the schema, indexes and query (anonymize it if you want).
Aaronaught
Also, be aware that a char(21) field is padded with spaces to its full length; e.g. you won't find a field with "ABC" in it - rather, it'll contain "ABC.................." (where each . represents a space here, 18 of them).
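A quick sketch to see the padding (DATALENGTH counts the stored bytes, while LEN ignores trailing spaces):

DECLARE @c char(21)
SET @c = 'ABC'
SELECT DATALENGTH(@c)  -- 21: padded out to the full column width
SELECT LEN(@c)         -- 3: trailing spaces are ignored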
marc_s
+1  A: 

Character comparisons are somewhat slower (the collation sequence has to be considered), not to mention the physical difference in size between a 21-character string and a bigint (8 bytes). The index seek just can't be as efficient: it has to evaluate each byte in your char(21) values, decide what each character's sort order is, and then decide how that compares to the matching character in the value you're looking for.
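
If the tag is always plain digits, one option worth testing is a binary collation, which compares raw bytes instead of applying linguistic sort rules. This is a sketch only; the UNIQUE index on tag would have to be dropped and recreated around the change:

-- Sketch: byte-wise comparisons, no linguistic collation rules.
ALTER TABLE data
    ALTER COLUMN tag char(21) COLLATE Latin1_General_BIN NOT NULL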

A clustered index will perform better for almost any query because the data (including the index pages, IIRC; I'm not a DBA) are in disk-seek order. Or at least closer to it.

DaveE
Actually, a clustered index *scan* will perform slower than a regular index scan; that's why I was hoping for more specifics.
Aaronaught