Hi, I have a working query below that needs to be simplified.

The reason is that I need to expand it a lot to cover the real application. For each condition (Pos=xxx AND Indata=yyy) the current query doubles in size, and I have a lot of conditions. The ON clause will also contain many more conditions than in the example.

The real application will have about 20 Pos/Indata conditions (only 3 here) and 20 other fields that must match (only 1 here).

What the query does, explained in "pseudocode":

find all rows with (Pos=xx1 AND Indata=yy1) as t1
union
all rows with (Pos=xx2 AND Indata=yy2) as t2
then join t1 and t2 to keep only rows where other fields match (t1.fields=t2.fields OR t1.fields=* OR t2.fields=*)
you could store this in a temp table temp1

then find all rows with (Pos=xx3 AND Indata=yy3) as t3
union
temp1
then join t3 and temp1 to keep only rows where other fields match (t3.fields=temp1.fields OR t3.fields=* OR temp1.fields=*)
you could store this in a temp table temp2

then find all rows with (Pos=xx4 AND Indata=yy4) as t4
union
temp2
then join t4 and temp2 to keep only rows where other fields match (t4.fields=temp2.fields OR t4.fields=* OR temp2.fields=*)
you could store this in a temp table temp3

etc. I basically want to find all rows in table "codes" with certain Pos and Indata values where most other fields match each other. A possible solution may be temp tables plus a bit of PHP, as in the pseudocode above, but it would be neat to solve this in SQL only.

SELECT t15.* FROM (
    (
        SELECT DISTINCT t6.* FROM (
            (
                SELECT t5.* FROM (
                    (
                        SELECT DISTINCT t1.* FROM (
                            (SELECT * FROM codes WHERE (Pos = 10 AND Indata = 'Rexroth')) AS t1
                            JOIN (
                                (SELECT * FROM codes WHERE (Pos = 30 AND Indata = '%Mineralolja')) AS t2
                            ) ON (t1.Manufacturer = t2.Manufacturer OR t1.Manufacturer='*' OR t2.Manufacturer='*')
                        )
                    ) UNION (
                        SELECT DISTINCT t4.* FROM (
                            (SELECT * FROM codes WHERE (Pos = 10 AND Indata = 'Rexroth')) AS t3
                            JOIN (
                                (SELECT * FROM codes WHERE (Pos = 30 AND Indata = '%Mineralolja')) AS t4
                            ) ON (t3.Manufacturer = t4.Manufacturer OR t3.Manufacturer='*' OR t4.Manufacturer='*')
                        )
                    )
                ) AS t5
            ) AS t6

            JOIN (
                (
                    SELECT * FROM codes WHERE (Pos = 70 AND Indata = '28 cm3')
                ) AS t7
            ) ON (t7.Manufacturer = t6.Manufacturer OR t7.Manufacturer='*' OR t6.Manufacturer='*')
        )
    ) UNION (
        SELECT DISTINCT t14.* FROM (
            (
                SELECT t12.* FROM (
                    (
                        SELECT DISTINCT t8.* FROM (
                            (SELECT * FROM codes WHERE (Pos = 10 AND Indata = 'Rexroth')) AS t8
                            JOIN (
                                (SELECT * FROM codes WHERE (Pos = 30 AND Indata = '%Mineralolja')) AS t9
                            ) ON (t9.Manufacturer = t8.Manufacturer OR t9.Manufacturer='*' OR t8.Manufacturer='*')
                        )
                    ) UNION (
                        SELECT DISTINCT t11.* FROM (
                            (SELECT * FROM codes WHERE (Pos = 10 AND Indata = 'Rexroth')) AS t10
                            JOIN (
                                (SELECT * FROM codes WHERE (Pos = 30 AND Indata = '%Mineralolja')) AS t11
                            ) ON (t11.Manufacturer = t10.Manufacturer OR t11.Manufacturer='*' OR t10.Manufacturer='*')
                        )
                    )
                ) AS t12
            ) AS t13
            JOIN (
                (
                    SELECT * FROM codes WHERE (Pos = 70 AND Indata = '28 cm3')
                ) AS t14
            ) ON (t13.Manufacturer = t14.Manufacturer OR t13.Manufacturer='*' OR t14.Manufacturer='*')
        )
    )
) AS t15

I just rewrote it as follows with temp tables, which seems easier to read at least:

CREATE TEMPORARY TABLE temp
SELECT * FROM codes WHERE (Pos = 10 AND Indata = 'Rexroth');

CREATE TEMPORARY TABLE temp2
SELECT DISTINCT t1.* FROM (
    (SELECT * FROM temp) AS t1
    JOIN (
        (SELECT * FROM codes WHERE (Pos = 30 AND Indata = '%Mineralolja')) AS t2
    ) ON (t1.Manufacturer = t2.Manufacturer OR t1.Manufacturer='*' OR t2.Manufacturer='*')
);
INSERT INTO temp2
SELECT DISTINCT t2.* FROM (
    (SELECT * FROM temp) AS t1
    JOIN (
        (SELECT * FROM codes WHERE (Pos = 30 AND Indata = '%Mineralolja')) AS t2
    ) ON (t1.Manufacturer = t2.Manufacturer OR t1.Manufacturer='*' OR t2.Manufacturer='*')
);

DROP TABLE temp;
ALTER TABLE temp2 RENAME temp;

CREATE TEMPORARY TABLE temp2
SELECT DISTINCT t1.* FROM (
    (SELECT * FROM temp) AS t1
    JOIN (
        (SELECT * FROM codes WHERE (Pos = 70 AND Indata = '28 cm3')) AS t2
    ) ON (t1.Manufacturer = t2.Manufacturer OR t1.Manufacturer='*' OR t2.Manufacturer='*')
);
INSERT INTO temp2
SELECT DISTINCT t2.* FROM (
    (SELECT * FROM temp) AS t1
    JOIN (
        (SELECT * FROM codes WHERE (Pos = 70 AND Indata = '28 cm3')) AS t2
    ) ON (t1.Manufacturer = t2.Manufacturer OR t1.Manufacturer='*' OR t2.Manufacturer='*')
);

SELECT * FROM temp2;
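
Since the plan is to generate the SQL from code anyway, the temp-table steps above can be driven by a loop over the (Pos, Indata) conditions. The following is only a rough sketch under stated assumptions: it uses Python and the standard-library sqlite3 module as a runnable stand-in for PHP and MySQL, and the names `match_clause`, `filter_codes`, and the `step0`, `step1`, ... tables are hypothetical. The SQL itself is kept close to the statements above.

```python
import sqlite3


def match_clause(fields, a="t1", b="t2"):
    """Build the ON condition: each field is equal, or '*' on either side."""
    return " AND ".join(
        f"({a}.{f} = {b}.{f} OR {a}.{f} = '*' OR {b}.{f} = '*')"
        for f in fields  # field names are spliced in, so they must be trusted
    )


def filter_codes(conn, conditions, fields):
    """Apply the (Pos, Indata) conditions one at a time via temp tables."""
    first, rest = conditions[0], conditions[1:]
    # Empty copy of codes, so no column list has to be spelled out.
    conn.execute("CREATE TEMP TABLE step0 AS SELECT * FROM codes WHERE 0")
    conn.execute(
        "INSERT INTO step0 SELECT * FROM codes WHERE Pos = ? AND Indata = ?",
        first,
    )
    on = match_clause(fields)
    for i, (pos, indata) in enumerate(rest, start=1):
        prev, cur = f"step{i - 1}", f"step{i}"
        join = (
            f"FROM {prev} AS t1 JOIN codes AS t2 "
            f"ON t2.Pos = ? AND t2.Indata = ? AND {on}"
        )
        conn.execute(f"CREATE TEMP TABLE {cur} AS SELECT * FROM codes WHERE 0")
        # UNION keeps the matching rows from both sides and drops duplicates.
        conn.execute(
            f"INSERT INTO {cur} SELECT t1.* {join} UNION SELECT t2.* {join}",
            (pos, indata, pos, indata),
        )
        conn.execute(f"DROP TABLE {prev}")
    last = f"step{len(rest)}"
    rows = conn.execute(f"SELECT * FROM {last}").fetchall()
    conn.execute(f"DROP TABLE {last}")
    return rows
```

A call such as `filter_codes(conn, [(10, 'Rexroth'), (30, '%Mineralolja'), (70, '28 cm3')], ['Manufacturer'])` would then correspond to the three steps written out by hand above. Adding a condition or a field to match becomes one more list entry instead of another level of nesting.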
+1  A: 

There's a lot to be said for a stored proc and temp tables here - as you yourself suggest.

For example your SQL above is getting quite hard to read even now: I look at it and immediately think "too long, really don't want to read" :-)

Procs and working tables may make the code much easier for future developers to read and extend, yourself included in six months' time!

You may also get performance benefits from procs and working tables: you can put indexes on the working tables, debug your proc one bit at a time to see where bottlenecks are, etc.
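
As a small illustration of indexing a working table, here is a hedged sketch using Python's standard-library sqlite3 module in place of the asker's MySQL setup. The names `build_working_table`, `query_plan`, `work`, and `idx_work_manufacturer` are made up for the example. Whether an index actually helps the OR/'*' join condition from the question depends on the database's optimizer, so it is worth checking the query plan rather than assuming.

```python
import sqlite3


def build_working_table(conn, pos, indata):
    """Copy the matching rows of codes into a temp table and index it."""
    # Empty copy first, so the column list does not have to be repeated.
    conn.execute("CREATE TEMP TABLE work AS SELECT * FROM codes WHERE 0")
    conn.execute(
        "INSERT INTO work SELECT * FROM codes WHERE Pos = ? AND Indata = ?",
        (pos, indata),
    )
    # The index belongs to the temp table and is dropped together with it.
    conn.execute("CREATE INDEX idx_work_manufacturer ON work (Manufacturer)")


def query_plan(conn, sql, params=()):
    """Return SQLite's plan descriptions for a query, one string per step."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]
```

Calling `query_plan(conn, "SELECT * FROM work WHERE Manufacturer = ?", ("Bosch",))` after `build_working_table` shows whether the lookup goes through the index, which is one way to debug a multi-step job one piece at a time.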

I guess I'm recommending: favour readability and extensibility over "elegance" of using a single query, if performance can be considered more or less equivalent in either approach.

Brian
Thanks Brian, I was hoping I had missed some obvious "smart" simple solution, but perhaps I haven't. The query itself will be generated by PHP code, and that code will hopefully be easier to read than a massive query like the one above (and 100 times worse in the real application!). Is there a simple way to generate a temp table from an existing table, or do I have to specify all the fields etc.?
Petter Magnusson
Also, have you got any ideas how stored procs may help here?
Petter Magnusson
Temp tables can be generated automatically based on the result set of a SELECT, but the syntax depends on your DB platform. E.g. on SQL Server you might be able to do:
SELECT field1, field2, field3 INTO #working FROM REAL_TABLE WHERE condition = 1
where #working is created "on the fly". If you're generating SQL from PHP then stored procs may not be an option, but temp/working tables still could be. Essentially you want to avoid (a) massive queries that cause slow performance, and (b) lots of data communication between your app and DB servers. For your app, (a) could be a particular worry?
Brian
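
To make the "created on the fly" idea concrete, here is a minimal sketch using Python's standard-library sqlite3 module, where the equivalent statement is `CREATE TEMP TABLE ... AS SELECT ...`. MySQL accepts a similar `CREATE TEMPORARY TABLE ... SELECT ...`, as already used in the question. The helper names `create_temp_from_select` and `column_names` are hypothetical.

```python
import sqlite3


def create_temp_from_select(conn, temp_name, select_sql):
    """Create temp table temp_name with the columns and rows of select_sql."""
    # Both arguments are spliced into the statement, so they must be trusted.
    conn.execute(f"CREATE TEMP TABLE {temp_name} AS {select_sql}")


def column_names(conn, table):
    """List a table's column names via PRAGMA table_info."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

The temp table takes its columns from the SELECT list, so no field definitions need to be written out by hand; `SELECT *` copies every column of the source table.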
Temp tables worked fine, but the conditions in my examples were not enough, I had to use a more complex approach with quite complicated joins etc to cover all cases...
Petter Magnusson