We log values, and we only log each value once in a table. When we add a value we have to do a lookup every time to see whether we need to insert it or just grab its id. We have an index on the lookup column (not on the primary key), but there are about 350,000 rows, so it is taking 10 seconds to process just 10 of these values.

So either

  • We figure out a way to optimize it
  • Strip this feature out, or
  • Do something completely different when logging these values.
+2  A: 

Just to be clear, the index is on the (presumably varchar or nvarchar) field in the table, correct? Not the PK?

OK, after your edit: you're doing an indexed lookup on a large (n)varchar text field. Even with the index that can be pretty slow -- you're still doing two big string comparisons. I can't really think of a great way around this, but some initial SWAGs:

  • compute a hash of the to-be-logged text, and store that in the database for subsequent lookups (see the sketch after this list)
  • as another poster suggested, store all of the rows, and filter out dupes in the query (or with a nightly batch, whatever works)
  • don't check for duplicates. Catching an exception may still be cheaper than the lookup*
  • hire someone with a really good memory who's fast with a mouse. When a message is going to be logged, flash it to their screen with an accept/deny prompt. If the entry is a dupe, have them click "deny"


* yeah, I know I'll be down-modded for that, but sometimes pragmatism just works.
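
A minimal sketch of the hash idea, assuming a hypothetical LogValues(ID, LogText) table; CHECKSUM values can collide, so the full text comparison stays in as a tiebreaker:

    -- Add a persisted hash column and index the cheap int instead of the text
    ALTER TABLE LogValues ADD LogTextHash AS CHECKSUM(LogText) PERSISTED;
    CREATE INDEX IX_LogValues_Hash ON LogValues (LogTextHash);

    -- Lookup: seek on the int hash first, then confirm on the full text
    DECLARE @text nvarchar(4000) = N'the message being logged';
    SELECT ID
    FROM LogValues
    WHERE LogTextHash = CHECKSUM(@text)
      AND LogText = @text;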

Danimal
I have updated my question; the index is NOT on the PK. Thank you for the fast response.
RyanKeeter
Primary keys *always* have an associated unique index. We need more information. What DDL did you use to create the table? What DML are you using to perform the search? Are you using a 'like' clause?
David Medinets
A: 

How frequently do you write to the table vs. read from it? If you have frequent writes and occasional reads, consider always doing inserts and then collapsing the duplicate values when doing a select.

If you're trying to put everything in one table, consider breaking them out into separate tables to cut down on size, or barring that use partitions on the table.
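
A rough sketch of the insert-always approach, using the same hypothetical LogValues table; duplicates are collapsed at read time instead of insert time:

    -- Log unconditionally: no lookup on the hot path
    DECLARE @text nvarchar(4000) = N'the message being logged';
    INSERT INTO LogValues (LogText) VALUES (@text);

    -- Collapse duplicates when reading (or in a nightly batch)
    SELECT LogText, MIN(ID) AS ID
    FROM LogValues
    GROUP BY LogText;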

Ted Elliott
+1  A: 

It's taking 1 second to do an indexed lookup on a 350k-row table? That sounds really rather unnecessarily slow to me... Are you sure there isn't something else wrong?

Dan
+1  A: 

Without seeing your actual queries I can only generalize. However, I'd offer the following ideas/advice:

1) Have you verified that your index is actually being used by the lookup query? If the index has high cardinality, the lookup should be much faster.

2) You could combine the two operations into a single stored procedure which first looks for the row and then does an insert if necessary... something like:

-- Set a flag instead of round-tripping twice from the application
IF EXISTS (SELECT ID FROM YourTable WHERE ID = @ID_to_look_for)
      SET @ID_exists = 1
ELSE
      SET @ID_exists = 0

If you post what the exact queries look like, maybe I can offer a more detailed answer.

tyshock
I'm guessing he's looking through the logged text, not IDs. That is, select @id = id from mytable where log_text = @text.
Danimal
A: 

I'm not sure I have enough information to answer this, but here are some thoughts nonetheless:

  1. If you are not already doing so, you may be able to do the insert and the verification all in one SQL statement (INSERT INTO the table from a SELECT that LEFT OUTER JOINs to the table and keeps only rows where the joined id IS NULL) -- see the sketch after this list.
  2. Are you using a DAL layer, or stored procedures, to do this? Do you control the SQL used to select/insert? If you don't, you may want to use SQL Profiler to examine what is being sent to the DB, in case its form invalidates the index.
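
A minimal sketch of item 1, assuming the hypothetical LogValues(ID, LogText) table and a @NewValues table variable holding the batch to log:

    DECLARE @NewValues TABLE (LogText nvarchar(4000));
    -- ...populate @NewValues with the batch to log...

    -- Insert only the values that aren't already present, in one statement
    INSERT INTO LogValues (LogText)
    SELECT n.LogText
    FROM @NewValues AS n
    LEFT OUTER JOIN LogValues AS l
        ON l.LogText = n.LogText
    WHERE l.ID IS NULL;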
JPrescottSanders
+1  A: 

Instead of doing a lookup, just try inserting the value. If the table is designed to refuse duplicate records, i.e. it has a primary key or unique index, then the insert will fail. Simply trap the insert error, and if it occurs, grab the id as you normally would (a sketch follows below).

I agree that the lookup should not be taking that long, but why make the engine parse the query, map out a plan, do the lookup, and then send you the results before you insert, when it could do both at the same time?
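
A hedged sketch of the trap-the-error approach, assuming the hypothetical LogValues table has an identity ID column and a unique index on LogText:

    DECLARE @text nvarchar(4000) = N'the message being logged';
    DECLARE @id int;

    BEGIN TRY
        INSERT INTO LogValues (LogText) VALUES (@text);
        SET @id = SCOPE_IDENTITY();   -- new value: take the fresh id
    END TRY
    BEGIN CATCH
        -- 2601/2627 are SQL Server's duplicate-key errors; anything else
        -- would need re-raising in real code
        IF ERROR_NUMBER() IN (2601, 2627)
            SELECT @id = ID FROM LogValues WHERE LogText = @text;
    END CATCH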

You could also look into:

  1. Indexing better, assuming there is room for improvement
  2. Altering the physical layout of the database to improve IO
  3. Increasing the memory available to SQL Server
DL Redden
With a unique index, the database will always do its own check to determine whether the record already exists. Therefore, you definitely should not perform the same select from the application before an insert - unless performance is not important and you need to avoid using exceptions for control flow.
David Medinets
A: 

"When we add values to the table we have to do a look up everytime to see if it needs to insert the value or just grab the id."

We used to call this the "upsert" operation.

    -- Try the update first; a zero row count means the key is new
    UPDATE log SET blah blah blah WHERE key = @key;
    IF @@ROWCOUNT = 0
        INSERT INTO log(...) VALUES(...);

We never did our own query to see if the key existed, since that's the job of the UPDATE statement.

S.Lott
+1  A: 

First of all, look at the query plan to see what it is doing; this will tell you whether it is using the index. One second for a single-row test/insert is too slow - for 350k rows, that is long enough to do a table scan over a cached table.
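
One quick way to check, as a sketch (SSMS's graphical execution plan works too); the table and predicate here are hypothetical:

    SET STATISTICS PROFILE ON;   -- emit the actual plan with row counts
    SELECT ID FROM LogValues WHERE LogText = N'some logged text';
    SET STATISTICS PROFILE OFF;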

Secondly, look at the physical layout of your server. Do you have something like logs and data sharing the same disk?

Thirdly, check that the index columns on your unique key are in the same order as the predicate on the select query. Differences in order may confuse the query optimizer.

Fourthly, consider a clustered index on the unique key. If this is your main way of looking up the row, it will reduce disk accesses, as the table data is physically stored with the clustered index. See this for a blurb about clustered indexes. Set the table up with a generous fill factor.
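
A sketch of what that might look like, with hypothetical names; the fill factor value is purely illustrative:

    CREATE UNIQUE CLUSTERED INDEX IX_LogValues_LogText
        ON LogValues (LogText)
        WITH (FILLFACTOR = 80);  -- leave free space to absorb new inserts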

Unless you have blob columns, 350k rows is way below the threshold where partitioning should make a difference. This size table should fit entirely in the cache.

ConcernedOfTunbridgeWells
A: 

Are you by chance using a cursor? It shouldn't take ten seconds on a table that small to do what you said you were doing.

You need set-based update and insert statements.
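
For example, a set-based upsert sketch using MERGE (available in SQL Server 2008+; all names are hypothetical):

    DECLARE @NewValues TABLE (LogText nvarchar(4000));
    -- ...populate @NewValues with the batch to log...

    -- One pass over the batch: insert only the values not already logged;
    -- DISTINCT guards against duplicates within the batch itself
    MERGE LogValues AS target
    USING (SELECT DISTINCT LogText FROM @NewValues) AS source
        ON target.LogText = source.LogText
    WHEN NOT MATCHED THEN
        INSERT (LogText) VALUES (source.LogText);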

HLGEM
A: 
  1. Rule out connectivity and driver issues - ensure that other operations on the same database, performed in the same manner, are fast enough.

  2. Make sure you measure this operation independently from other operations that might be running within the same transaction.

  3. Make sure you have no locking scenarios - stop everything else and execute just your lookup-and-update sequence from your management tool (see the sketch after this list).

  4. Check whether it is the lookup (almost certainly) or the disk write that is costly - though 10 secs is way too high even for a slow disk. Do this for completeness' sake.

  5. Check whether your index is actually being used by the query - table scans might be happening.

  6. If the indexed column is a text field, check whether text indexing is at the root of the issue by issuing the lookups against a non-text column that has an index on it. If so, try changing the logic to use the PK, or use a hash instead of the text.
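
A sketch for step 3, assuming SQL Server 2005+ dynamic management views are available:

    -- While the slow lookup runs, look for lock waits from other sessions
    SELECT request_session_id, resource_type, request_mode, request_status
    FROM sys.dm_tran_locks
    WHERE request_status = 'WAIT';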