I'm starting to get a much better grasp of PostgreSQL indexing, but I've run into an issue with OR conditions: I don't know how to optimize my indexes so that the query runs faster.
I have 6 conditions that, when run individually, each appear to have a small cost. Here are trimmed examples of the queries, with the cost and timing line from each query plan.
(NOTE: I haven't output the actual query plans for the queries below for the sake of reducing complexity, but they all use nested loop left joins and index scans, as I would expect with proper indexing. If necessary, I can include the query plans for a more meaningful response.)
EXPLAIN ANALYZE SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions1)
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=0.25..46.69 rows=1 width=171) (actual time=0.031..0.031 rows=0 loops=1)
EXPLAIN ANALYZE SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions2)
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=0.76..18.97 rows=1 width=171) (actual time=14.764..14.764 rows=0 loops=1)
/* snip */
EXPLAIN ANALYZE SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions6)
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=0.51..24.48 rows=1 width=171) (actual time=0.252..5.332 rows=10 loops=1)
My problem is that I want to combine these 6 conditions with OR operators, so that a row matching any one of them is returned. My combined query looks like this:
EXPLAIN ANALYZE SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions1 OR conditions2 OR conditions3 OR conditions4 OR conditions5 OR conditions6)
LIMIT 10;
Unfortunately, this results in a massive increase in the estimated cost of the query plan, which no longer seems to be using my indexes (instead, it chooses a hash left join rather than a nested loop left join, and performs various sequential scans in place of the previously used index scans).
Limit (cost=142.62..510755.78 rows=1 width=171) (actual time=30.591..30.986 rows=10 loops=1)
Is there anything special I should know about indexing with regards to OR-ed conditions that would improve my final query?
UPDATE: If I rewrite the query as a UNION of the individual SELECTs, the planner's estimated cost drops dramatically (although the actual time reported in the plan below is higher than for the OR version). However, will that prevent me from ordering my results if I choose to in the future? Here is the UNION version:
EXPLAIN ANALYZE
SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions1)
UNION
SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions2)
UNION
SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions3)
UNION
SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions4)
UNION
SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions5)
UNION
SELECT t1.*, t2.*, t3.*
FROM t1 LEFT JOIN t2 on t2.id = t1.t2_id LEFT JOIN t3 ON t3.id = t1.t3_id
WHERE (conditions6)
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=219.14..219.49 rows=6 width=171) (actual time=125.579..125.653 rows=10 loops=1)
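On the ordering question, here is a minimal sketch of how an ORDER BY behaves on a UNION. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained. The table contents, the columns a, b and created_at, and the two stand-in conditions are all hypothetical, not taken from my real schema. As far as I understand, PostgreSQL treats a trailing ORDER BY and LIMIT the same way (they apply to the whole UNION result unless a branch is parenthesized), but this sketch says nothing about how the PostgreSQL planner would cost either form.

```python
import sqlite3

# Hypothetical stand-in for t1: columns a, b and created_at are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, created_at INTEGER);
    INSERT INTO t1 (id, a, b, created_at) VALUES
        (1, 1, 0, 50), (2, 0, 1, 40), (3, 1, 1, 30), (4, 0, 0, 20), (5, 1, 0, 10);
""")

# OR form: one SELECT, both stand-in conditions in a single WHERE clause.
or_rows = conn.execute("""
    SELECT id, created_at FROM t1
    WHERE (a = 1) OR (b = 1)
    ORDER BY created_at
    LIMIT 3
""").fetchall()

# UNION form: one SELECT per condition. The ORDER BY and LIMIT written after
# the last branch apply to the combined result, not only to that branch.
# UNION (unlike UNION ALL) also removes the row that matches both conditions.
union_rows = conn.execute("""
    SELECT id, created_at FROM t1 WHERE (a = 1)
    UNION
    SELECT id, created_at FROM t1 WHERE (b = 1)
    ORDER BY created_at
    LIMIT 3
""").fetchall()

print(or_rows)
print(union_rows)
```

In this sketch both forms return the same three rows in the same order, which suggests the UNION rewrite does not rule out ordering later. One caveat: the ORDER BY of a UNION can only refer to output columns, so with SELECT t1.*, t2.*, t3.* any column name shared between the tables (such as id) would probably need an alias first.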