views:

288

answers:

2

This query is PAINFULLY slow and our team cannot figure out why. We've tried creating views, but it still extremely slow. Any thoughts?

SELECT 
    CI . CWARCASNBR AS CASENUMBER , 
    CI . CT1FYA AS COURTAGENCYCODE , 
    CI . CIPTYSQNBR AS PARTYSEQNBR , 
    CI . CIRCDTYPE AS CASETYPECODE , 
    CP . NMELASTBUS AS LASTNAME , 
    CP . NAME_FIRST AS FIRSTNAME , 
    CP . NAME_MID AS MIDDLENAME , 
    CP . NAME_SUFFX AS SUFFIX , 
    CP . CP_SEX AS GENDER , 
    CP . CT1PA AS RACECODE , 
    CP . CP_DOB AS DOB , 
    CP . CP_SSN AS SSN , 
    A . STREETNAME AS ADDRESS1 , 
    A . ADDRLINE2 AS ADDRESS2 , 
    A . CITYPARISH AS CITY , 
    A . ADDRSTATE AS STATE , 
    A . ZIPCODE AS ZIP 
FROM 
    CMSDPL23 . JE026001 AS CP 
  LEFT OUTER JOIN 
    CMSDPR23 . JE215000 CI ON 
    CP . JEBOA = CI . CWARCASNBR AND 
    CP . CT1FYA = CI . CT1FYA AND 
    CP . CP_SEQ_NBR = CI . CIPTYSQNBR 
  LEFT OUTER JOIN 
    CMSDPR23 . CT007000 A ON CP . ADDRESSID = A . ADDRESSID 
                         AND CP . ADDRESSPRI = A . ADDIDSEQNO 
WHERE 
    CP . NMELASTBUS LIKE 'Durham' || '%' AND 
    CP . NAME_FIRST LIKE 'Roger%' || '%' AND 
    NOT CP . PRTY_TCDE IN ( 'OFF' , 'BEP' ) AND 
    CI . CI_FLAG_1 IN ( 'C' , 'B' ) AND 
    CI . CT1MKA = '23' 
ORDER BY 
    CI . CWARCASNBR , CI . CT1FYA ; 
+4  A: 

For starters, are all foreign key relationships indexed? (e.g., CMSDPR23.JE215000, CP.JEBOA, etc.

Second, LIKE forces a full table search. Can you index NMELASTBUS and NAME_FIRST (etc...) and check for matches?

Third, are fields in your WHERE clause indexed?

Bob Kaufman
Should I put an index on all JOIN fields as well as WHERE fields?
mint
@monO - In my experience, any time a field participates in a lookup that could potentially involve traversing the entire table, it should be indexed. This includes fields in WHERE clauses and JOINS. Usually, one of the fields in the JOIN is already indexed for you, being the primary key of the "other" table.
Bob Kaufman
+1  A: 

If you haven't done so already, try submitting the query to DB2's EXPLAIN utility to determine what the full access path is and which parts of the query are the most expensive. Any part of the explain plan that uses a relation scan (full table scan) to find rows is the most likely to be improved by an index.

Before you add a bunch of indexes, make sure the tables and indexes involved have have accurate statistics for the optimizer to use. If the table has grown substantially since RUNSTATS was last run, the optimizer may be disregarding perfectly good indexes because it doesn't understand how large the tables have grown. Execute a new RUNSTATS if the cardinality and distribution of the data has changed significantly from what was captured during the last RUNSTATS.

Posting the list of indexes that are already defined on the tables, along with the approximate number of rows in each table would help a lot.

A LIKE search does not necessarily force a table scan, but it certainly can result in an index scan if the column specified is indexed. The EXPLAIN utility will show you what is actually happening in those situations.

A foreign key does not always benefit from an index, especially for foreign keys that have very low cardinality throughout the table. Another issue is that the optimizer generally has to select the best index to use, so having a lot of sub-optimal indexes laying around will eventually slow down updates and may not accelerate reads all that much.

Let's assume that no good indexes exist yet on these tables. From the limited information provided, an index built on ( CWARCASNBR, CIPTYSQNBR, CT1FYA ) for table CMSDPR23.JE215000 may reduce the expense of the join from CMSDPL23.JE026001. Similarly, there is hopefully an index already built on (ADDRESSID, ADDIDSEQNO ) for CMSDPR23.CT007000, since that smells like a primary key or at least a unique candidate key.

Your ORDER BY is going to require a sort if a significant number of rows are returned. You may have a cheaper sort if you go after the same columns CP.JEBOA, CP.CT1FYA in the outer table, since it's only going to be scanned once.

Fred Sobotka