views:

73

answers:

2

EDIT: I found the problem, which was totally unrelated to the below. It was due to me doing a bitwise comparison that I failed to mention below - I thought it wasn't relevant - I removed this (and found an alternative solution) and my query now executes in < 1 second on device.

I'm currently working on my first iPhone application which is coming along nicely. It is an application to lookup train timetables for a certain train network. I have an SQLite database containing about 11,000 rows which hold the data I use to plan a journey.

The structure of the table is as follows:

from     depart      to         arrive     trainid
--------------------------------------------------
STNA     06:20       STNB       06:24      TRAINA
STNB     06:25       STNC       06:29      TRAINA
STNC     06:30       STND       06:35      TRAINA
STNA     07:23       STNC       07:30      TRAINB
STNC     07:32       STNE       07:40      TRAINB
STNE     07:41       STNF       07:50      TRAINB

The FROM and TO fields are 3 letter IDS of stations on the route, the DEPART and ARRIVE fields are timestamps and the TRAINID field is an identifier common to all stations on a particular train's journey (ie, all stops on the 06:20 train from STNA to STND share the same trainid).

When I want to for example find out which trains will take me from STNA to STND, I do the following query:

SELECT t1.from, t1.depart, t2.to, t2.arrive, t1.trainid FROM trains t1, trains t2 WHERE t1.trainid = t2.trainid AND t1.depart < t2.arrive AND t1.from = 'STNA' AND t2.to = 'SNTD';

This does a self join on the trains table (the t1.depart < t2.arrive part is so that it finds trains from STNA->STND instead of STND->STNA).

This works well, however it is very slow on the iphone (ok so you might say not so well), it takes about 10 seconds to run (returns about 80 results). I realise this is most probably becuase of the fact an 11,000 row table is being joined onto itself - but I'm at a loss at how I can optimise this. I'm using FMDB for this by the way.

I'm thinking I need to go down the CoreData route, but as a newcomer to the iPhone platform I was hoping to avoid that for this project. Do you think CoreData would offer significant performance benefits in this scenario? As someone from a purely SQL background I would appreciate any pointers on how to best to proceed with CoreData - can I do self-joins with CoreData, or should I be modelling my data differently?

EDIT: This is the output of an explain on the query im doing:

add  opcode        p1     p2     p3     p4                                        p5     comme
---  ------------  -----  -----  -----  ----------------------------------------  -----  -----
0    Trace         0      0      0                                                00          
1    String8       0      1      0      GRY                                       00          
2    String8       0      2      0      FST                                       00          
3    Goto          0      47     0                                                00          
4    OpenRead      0      2      0      7                                         00          
5    OpenRead      2      377    0      keyinfo(1,BINARY)                         00          
6    OpenRead      1      2      0      7                                         00          
7    OpenRead      3      1107   0      keyinfo(1,BINARY)                         00          
8    IsNull        1      42     0                                                00          
9    Affinity      1      1      0      ab                                        00          
10   SeekGe        2      42     1      1                                         00          
11   IdxGE         2      42     1      1                                         01          
12   IdxRowid      2      3      0                                                00          
13   Seek          0      3      0                                                00          
14   Column        0      6      4      0                                         00          
15   IsNull        4      41     0                                                00          
16   Affinity      4      1      0      db                                        00          
17   SeekGe        3      41     4      1                                         00          
18   IdxGE         3      41     4      1                                         01          
19   IdxRowid      3      3      0                                                00          
20   Seek          1      3      0                                                00          
21   Column        0      5      3                                                00          
22   Column        1      5      5                                                00          
23   Ne            5      40     3      collseq(BINARY)                           6a          
24   Column        0      3      5      0                                         00          
25   RealAffinity  5      0      0                                                00          
26   Column        1      3      3      0                                         00          
27   RealAffinity  3      0      0                                                00          
28   Ge            3      40     5      collseq(BINARY)                           6b          
29   Column        1      2      3                                                00          
30   Ne            2      40     3      collseq(BINARY)                           69          
31   Column        2      0      6                                                00          
32   Column        0      3      7      0                                         00          
33   RealAffinity  7      0      0                                                00          
34   Column        1      2      8                                                00          
35   Column        1      4      9      0                                         00          
36   RealAffinity  9      0      0                                                00          
37   Column        0      5      10                                               00          
38   Column        0      6      11     0                                         00          
39   ResultRow     6      6      0                                                00          
40   Next          3      18     0                                                00          
41   Next          2      11     0                                                00          
42   Close         0      0      0                                                00          
43   Close         2      0      0                                                00          
44   Close         1      0      0                                                00          
45   Close         3      0      0                                                00          
46   Halt          0      0      0                                                00          
47   Transaction   0      0      0                                                00          
48   VerifyCookie  0      8      0                                                00          
49   TableLock     0      2      0      stops                                     00          
50   Goto          0      4      0                                                00          
+1  A: 

Although it can use several different backing stores, SQLite is one of the ones CoreData can use.

For that reason, you shouldn't expect any better performance from CoreData. There's nothing magic about CoreData.

I suspect the problem is with your query. What's the output of explain plan on your query? I think you'll find it's doing a full table scan and an index created on the proper columns can reduce your query time to almost nothing.

Wade Williams
I have an index on every column (at some point in the application, each column is used to lookup data). On the simulator, the query takes < 2 seconds to run which is the kind of speed I'm aiming for on device. I guess the difference at the moment is due to resource constraints on device.I don't really understand the explain output of SQLite - I'm much more familiar with mysql. I have updated my question above to include the explain for the query.
Mailslut
Actually, owing to aggressive hand tuning of Core Data under the hood, it is usually faster at SQL access than manual code. When you add in the overhead of converting SQL into objects manually, you almost always come out ahead using Core Data.
TechZen
^^ that is what I was getting at.
Mailslut
A: 

This is a more tangential answer - If there are no direct trains between Station A and Station D, meaning your only way is connections - Station A to Station X via TrainA and then via TrainM from Station X to Station D your SQL is going to get pretty unwieldy isn't it. I had to use graphs ( a math formulation )to get this done in a reasonable amount of time. Eventually I've got something going at www.bharatbyrail.com. See it gives you some ideas.

skk
I decided to only use direct trains. The train network in question is quite small and the trains are frequent enough to make including only direct trains not too much of a limitation.
Mailslut