views:

46

answers:

1

Question to SQLite database engine. I have fairly complex question, which finds path in directed graph between two sets of known nodes, with exactly one node between (to be precise, path in public transport routes, but it has graph representation).

Now something unexpected - there are no explicit JOINs in question (only conditions in WHERE clause), but ordering of tables in FROM clause affects query plan very much.


For clause

from przystanek skad, przystanek dokad, linia l1, linia l2, przystanek posredni1, przystanek posredni2
from KNOWN_START_NODE, KNOWN_END_NODE, unknown_arc1, unknown_arc1, unknown_node_middle1, unknown_node_middle2

query plan is:

TABLE linia AS l1 WITH INDEX linia_id_miasto
TABLE przystanek AS skad WITH INDEX przystanek_id_linia
TABLE linia AS l2 WITH INDEX linia_id_miasto
TABLE przystanek AS dokad WITH INDEX przystanek_id_linia
TABLE przystanek AS posredni1 WITH INDEX przystanek_id_linia
TABLE przystanek AS posredni2 WITH INDEX przystanek_linia_nrprzystanku

and query takes 0.14 seconds


For clause

from linia l1, linia l2, przystanek posredni1, przystanek posredni2, przystanek skad, przystanek dokad
from unknown_arc1, unknown_arc1, unknown_node_middle1, unknown_node_middle2, KNOWN_START_NODE, KNOWN_END_NODE

query plan is:

TABLE linia AS l1 WITH INDEX linia_id_miasto
TABLE linia AS l2 WITH INDEX linia_id_miasto
TABLE przystanek AS posredni1 WITH INDEX przystanek_id_linia
TABLE przystanek AS posredni2 WITH INDEX przystanek_linia_nrprzystanku
TABLE przystanek AS skad WITH INDEX przystanek_id_linia
TABLE przystanek AS dokad WITH INDEX przystanek_id_linia

and query takes 4.90 seconds


Why such differences? I've read http://www.sqlite.org/optoverview.html but there is nothing about FROM table ordering. It took me a few hours to track down reason of poor performance and I still don't know what happened.

Here is full question, not very clear and without table volumes, but still may be usable

select 
l1.nazwapliku, 
l2.nazwapliku, 
posredni1.nrprzystanku as nrposr,
skad.kolejnosc as k1a, 
posredni1.kolejnosc as k1b, 
posredni2.kolejnosc as k2a, 
dokad.kolejnosc as k2b,
skad.nrprzystanku, 
dokad.nrprzystanku

from przystanek skad, przystanek dokad, linia l1, linia l2, przystanek posredni1, przystanek posredni2
where 

skad.nrprzystanku IN (1) AND
dokad.nrprzystanku IN (2) AND

l1.id_miasto = 1 AND
l2.id_miasto = 1 AND
l1._id <> l2._id AND
l1.nazwalinii <> l2.nazwalinii AND
posredni1._id<>posredni2._id AND

skad.id_linia = l1._id AND
posredni1.id_linia = l1._id AND
skad.kolejnosc<posredni1.kolejnosc AND

posredni1.nrprzystanku=posredni2.nrprzystanku AND
posredni2.id_linia = l2._id AND
dokad.id_linia = l2._id AND

posredni2.kolejnosc<dokad.kolejnosc 
A: 

AFAIK, the little SQLite does for query optimisation depends mostly on WHERE clauses, and usually the first WHERE clause is where the most gain is to be made.

MPelletier
In this case the only difference is ordering of tables in FROM clause.
tomash
I know this is poor form, but perhaps it would be best to ask the SQLite people themselves in this case...
MPelletier