views:

92

answers:

6

I have two tables - incoming tours(id,name) and incoming_tours_cities(id_parrent, id_city)

id in first table is unique, and for each unique row from first table there is the list of id_city - s in second table(i.e. id_parrent in second table is equal to id from first table)

For example

incoming_tours

|--id--|------name-----|
|---1--|---first_tour--|
|---2--|--second_tour--|
|---3--|--thirth_tour--|
|---4--|--hourth_tour--|

incoming_tours_cities

|-id_parrent-|-id_city-|
|------1-----|---4-----|
|------1-----|---5-----|
|------1-----|---27----|
|------1-----|---74----|
|------2-----|---1-----|
|------2-----|---5-----|
........................

That means that first_tour has list of cities - ("4","5","27","74")

AND second_tour has list of cities - ("1","5")


Let's assume i have two values - 4 and 74:

Now, i need to get all rows from first table, where my both values are in the list of cities. i.e it must return only the first_tour (because 4 and 74 are in it's list of cities)

So, i wrote the following query

SELECT t.name
FROM `incoming_tours` t
JOIN `incoming_tours_cities` tc0 ON tc0.id_parrent = t.id
AND tc0.id_city = '4'
JOIN `incoming_tours_cities` tc1 ON tc1.id_parrent = t.id
AND tc1.id_city = '74'

And that works fine.

But i generate the query dynamically, and when the count of joins is big (about 15) the query slowing down.

i.e. when i try to run

SELECT t.name
FROM `incoming_tours` t
JOIN `incoming_tours_cities` tc0 ON tc0.id_parrent = t.id
AND tc0.id_city = '4'
JOIN `incoming_tours_cities` tc1 ON tc1.id_parrent = t.id
AND tc1.id_city = '74'
.........................................................
JOIN `incoming_tours_cities` tc15 ON tc15.id_parrent = t.id
AND tc15.id_city = 'some_value'

the query run's in 45s(despite on i set indexes in the tables)

What can i do, to optimaze it?

Thanks much

A: 

Just an hint. If you use the IN operator in a WHERE clause, you can hope that the short-circuit of operator AND may remove unnecessary JOINs during the execution for the tours that do not respect the constraint.

Luca Martini
A: 

Seems like an odd way to do that query, here

SELECT t.name FROM `incoming_tours` as t WHERE t.id IN (SELECT id_parrent FROM `incoming_tours_cities` as tc WHERE tc.id_city IN ('4','74'));

I think that does it, but not tested...

EDIT: Added table alias to sub-query

pharalia
this will not work, because i need **all** values to match, but when you wrote `in` it is equal to `or`, not `and` statement verification.
Syom
i.e. after runing your query with values of (`"5"`,`"74"`) you will get both `first_tour` and `second_tour`, but we need the first only.
Syom
I ran my query after re-creating the tables from the example, and it returns first_tour only.
pharalia
Try it with values "4" and "5" (where 5 is in the second_tour as well as the first) and then you will see that this query will not work
Catch22
Yeah, noticed that after I posted my comment
pharalia
+1  A: 

I'm pretty sure this works, but a lot less sure that it is optimal.

SELECT * FROM incoming_tours 
WHERE 
id IN (SELECT id_parrent FROM incoming_tours_cities WHERE id_city=4)
AND id IN (SELECT id_parrent FROM incoming_tours_cities WHERE id_city=74)
...
AND id IN (SELECT id_parrent FROM incoming_tours_cities WHERE id_city=some_value)
Aether
i've tested it already, it's mach slower than with joins
Syom
I've now had a look at the query execution times, too. It doesn't seem to be a lot slower at lower values of conditions, but it does seem to be much quicker to have many `SELECT` subqueries than to have the same number of `JOIN` conditions for higher numbers of conditions.
Aether
+5  A: 
SELECT t.name
FROM incoming_tours t INNER JOIN 
  ( SELECT id_parrent
    FROM incoming_tours_cities
    WHERE id IN (4, 74)
    GROUP BY id_parrent
    HAVING count(id_city) = 2) resultset 
  ON resultset.id_parrent = t.id

But you need to change number of total cities count.

ksogor
please explain meaning of - ` HAVING count(id) = 2) resultset `
Syom
You take only the parent_ids that have matched your list of cities twice. That way you know that they have both cities.
jhominal
Oh, and replace `id` with `id_city`
jhominal
let me test it...
Syom
EXCELLENT. Thanks much man;)
Syom
jhominal, thanks )
ksogor
+2  A: 
SELECT name
FROM (
      SELECT DISTINCT(incoming_tours.name) AS name,
             COUNT(incoming_tours_cities.id_city) AS c
      FROM incoming_tours
           JOIN incoming_tours_cities
                ON incoming_tours.id=incoming_tours_cities.id_parrent
      WHERE incoming_tours_cities.id_city IN(4,74)
            HAVING c=2
      ) t1;

You will have to change c=2 to whatever the count of id_city you are searching is, but since you generate the query dynamically, that shouldn't be a problem.

Narf
I'm not sure this is correct. Don't you need a `GROUPBY`?
Luca Martini
Doesn't seem like it - I've tested it and it works just fine.
Narf
A: 

I've written this query using CTE's and it includes the test data in the query. You'll need to modify it so that it queries the real tables instead. Not sure how it performs on a large dataset...

Declare @numCities int = 2

;with incoming_tours(id, name) AS
(
    select 1, 'first_tour' union all
    select 2, 'second_tour' union all
    select 3, 'third_tour' union all
    select 4, 'fourth_tour' 
)
, incoming_tours_cities(id_parent, id_city) AS
(
    select 1, 4 union all 
    select 1, 5 union all 
    select 1, 27 union all 
    select 1, 74 union all 
    select 2, 1 union all 
    select 2, 5
)
, cityIds(id_city) AS
( 
    select 4
    union all select 5
    /* Add all city ids you need to check in this table */
)
, common_cities(id_city, tour_id, tour_name) AS
(
    select c.id_city,  it.id, it.name
    from cityIds C, Incoming_tours_cities tc, incoming_tours it
    where C.id_city = tc.id_city
    and tc.id_parent = it.id
)
, tours_with_all_cities(id_city) As
(
    select tour_id from common_cities 
    group by tour_id 
    having COUNT(id_city) = @numCities
)
select it.name from incoming_tours it, tours_with_all_cities tic
where it.id = tic.id_city
Catch22