views:

897

answers:

6
SELECT pe.prodtree_element_id prodID, pe.prodtree_element_name_s, li.line_name, av2.value
    FROM prodtree_element pe
    LEFT JOIN prodtree_link pl
     ON pe.prodtree_element_id = pl.to_prodtree_node_id
    LEFT JOIN line li
     ON pe.line_code = li.line_code
    INNER JOIN attribute_values av
     ON av.attribute_definition_id = #statusCode# 
    LEFT JOIN attribute_values av2
     ON pe.prodtree_element_id = av.prodtree_element_id
    WHERE pe.prodtree_element_func_type <> 'WIZARD'
     AND pe.prodtree_element_topo_type = 'NODE'

"#statusCode#" is a static id that matches an id in the attribute definition table (let's say 22 for the sake of argument). The problem is, the query has some massive trouble finishing in any sane amount of time. The bigger problem is, I kinda need it to finish earlier, but the number of records is enormous that it has to draw back (around 30-50,000). I need data from multiple tables, which is where it starts to slow down. This is just a piece of what I need, I also need an entire other tables worth of data matching the current "prodtree_elment_id".

I'm using ColdFusion but even running the query directly in SQL Server 2005 creates the 15-30+ minute wait for this query (if it even finishes). Is there any conceivable way to speed up this query to take at most 5 minutes or less?

A: 

Without knowing the DDL its very hard to test. 30-50K rows should still take only a few seconds.

Try switching the where clause ordering. You should probably have this implemented

 INNER JOIN attribute_values av
        ON av.attribute_definition_id = #statusCode#

in the where clause.

StingyJack
A: 

First thing I would suggest is that you run it through the sql optimizer utility in enterprise manager providing you have that installed. It generally suggests indexes and things like that which could have a positive impact on the query speed.

Other things to consider would be splitting up the query. From initial glances it looks like you are reading all product elements which have a particular attribute matching the value you give (or something like that). I would suggest maybe :

select * from [bigLongjoin to producttree_element]
where prodtree_element_id
in(
select prodtree_element_id from 
  attribute_values where attribute_definition_id = #statusCode#)

Running it in enterprise manager with the query plan displayed might also show you where the bottle necks are

Jennifer
in usually makes things even slower...
mson
+8  A: 
INNER JOIN attribute_values av
    ON av.attribute_definition_id = #statusCode# 
LEFT JOIN attribute_values av2
    ON pe.prodtree_element_id = av.prodtree_element_id

This is the problem. There is a cross join between pe and av, followed by an outer join onto the cross join. You're lucky it only takes 30 mins :-)

I think you want this:

SELECT pe.prodtree_element_id prodID, pe.prodtree_element_name_s, li.line_name, av2.value
FROM prodtree_element pe
LEFT JOIN prodtree_link pl
    ON pe.prodtree_element_id = pl.to_prodtree_node_id
LEFT JOIN line li
    ON pe.line_code = li.line_code
--replacement
LEFT JOIN
attribute_values av 
         ON pe.prodtree_element_id = av.prodtree_element_id AND
         av.attribute_definition_id = #statusCode# 
--end replacement
WHERE pe.prodtree_element_func_type <> 'WIZARD'
    AND pe.prodtree_element_topo_type = 'NODE'
gbn
Thanks, my problem was that I didn't know you could do an AND inside an JOIN! Now that I've added that to the code (which I have to do elsewhere as well) it's sped up more than significantly enough.The AND is the key folks!
Organiccat
Is it necessary for that to be in the join section? I normally put that sort of thing in the WHERE. Does it increase performance?
Sam
Because filtering is needed on attribute_values. If you have this in the where clause then you stop the outer join working and restrict rows from prodtree_element to av.attribute_definition_id = x
gbn
A: 

You can search millions of records in a few seconds with good optimization. Though StingyJack is right in that without knowing the DDL, optimization of any query is tough.

Things to do though when optimizing a query though is look the execution plan. Nested loops and the like are bad. Also make sure you are fully indexed as well. You mention nothing of the indexes of the tables in question. Without indexes 30 - 50k rows could take a while with that many joins.

Joshua Hudson
A: 

make sure all your IDs are Indexes also if you can map the following:

pe.prodtree_element_func_type <> 'WIZARD'
        AND pe.prodtree_element_topo_type = 'NODE'

to be something like

pe.prodtree_element_func_type_ID <> 1
            AND pe.prodtree_element_topo_type_ID = 2

in order to reduce the String comparisons that takes more time to complete

Oscar Cabrero
A: 
SELECT pe.prodtree_element_id prodID, pe.prodtree_element_name_s, li.line_name, av2.value
    FROM prodtree_element pe
    LEFT JOIN prodtree_link pl
        ON (pe.prodtree_element_id = pl.to_prodtree_node_id)
    LEFT JOIN line li
        ON (pe.line_code = li.line_code)
    LEFT JOIN attribute_values av2
        ON (pe.prodtree_element_id IN (SELECT av.prodtree_element_id FROM attribute_values av WHERE av.attribute_definition_id = #statusCode#))
    WHERE pe.prodtree_element_func_type <> 'WIZARD'
        AND pe.prodtree_element_topo_type = 'NODE'

gbn nailed it I think. Even though you restrict the INNER JOIN to attribute_values to a specific value, it is still not joined at all to your primary table or its relationships. So even if you are getting results from the query, my guess is that there are too many.

Depending on what you intended and how your data is in the attribute_values table either his query or mine would probably be faster.

Will Rickards
The IN clause will kill it
gbn