I'm building a CGI script that polls a SQLite database and builds a table of statistics. The source database table is described below, as is the chunk of pertinent code. Everything works (functionally), but the CGI itself is very slow because I have multiple nested SELECT COUNT(id) calls. I figure my best shot at optimization is to ask the SO community, as my time with Google has been relatively fruitless.
The table:
CREATE TABLE messages (
    id TEXT PRIMARY KEY ON CONFLICT REPLACE,
    date TEXT,
    hour INTEGER,
    sender TEXT,
    size INTEGER,
    origin TEXT,
    destination TEXT,
    relay TEXT,
    day TEXT);
(Yes, I know the table isn't normalized, but it's populated with extracts from a mail log. I was happy enough to get the extract-and-populate step working, let alone normalize it. I don't think the table structure has much to do with my question at this point, but I could be wrong.)
Sample row:
476793200A7|Jan 29 06:04:47|6|[email protected]|4656|web02.mydomain.pvt|[email protected]|mail01.mydomain.pvt|Jan 29
And, the Python code that builds my tables:
#!/usr/bin/python
import sqlite3
import re
from datetime import date

print 'Content-type: text/html\n\n'

# Database path assumed; adjust to match your environment.
conn = sqlite3.connect('messages.db')
curs = conn.cursor()

p = re.compile(r'(\w+) (\d+)')
d_month = {'Jan':1,'Feb':2,'Mar':3,'Apr':4,'May':5,'Jun':6,
           'Jul':7,'Aug':8,'Sep':9,'Oct':10,'Nov':11,'Dec':12}
l_wkday = ['Mo','Tu','We','Th','Fr','Sa','Su']

# Build a list of [day, "day (weekday)"] pairs for the table rows.
days = []
curs.execute('SELECT DISTINCT(day) FROM messages ORDER BY day')
for day in curs.fetchall():
    m = d_month[p.match(day[0]).group(1)]
    d = p.match(day[0]).group(2)
    days.append([day[0], "%s (%s)" % (day[0], l_wkday[date.weekday(date(2010, int(m), int(d)))])])

curs.execute('SELECT DISTINCT(sender) FROM messages')
senders = curs.fetchall()
for sender in senders:
    # sqlite3 uses '?' placeholders, and parameters must be a sequence.
    curs.execute('SELECT COUNT(id) FROM messages WHERE sender=?', (sender[0],))
    print ' <div id="'+sender[0]+'">'
    print ' <h1>Stats for Sender: '+sender[0]+'</h1>'
    print ' <table><caption>Total messages in database: %d</caption>' % curs.fetchone()[0]
    print ' <tr><td> </td><th colspan=24>Hour of Day</th></tr>'
    print ' <tr><td class="left">Day</td><th>%s</th></tr>' % '</th><th>'.join(map(str, range(24)))
    for day in days:
        print ' <tr><td>%s</td>' % day[1]
        for hour in range(24):
            # One COUNT query per (sender, day, hour) cell -- this is the slow part.
            curs.execute('SELECT COUNT(id) FROM messages WHERE sender=? AND day=? AND hour=?',
                         (sender[0], day[0], hour))
            d = curs.fetchone()[0]
            print ' <td>%s</td>' % (d > 0 and str(d) or '')
        print ' </tr>'
    print ' </table></div>'
print ' </body>\n</html>\n'
I'm not sure whether there are ways to combine some of the queries, or whether I should approach the data extraction from a different angle entirely. I had also thought about building a second table with the counts in it and just updating it whenever the original table is updated. I've been staring at this for entirely too long today, so I'm going to attack it fresh tomorrow, hopefully with some insight from the experts ;)
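For the record, that second-table idea can be sketched as a summary table rebuilt after each log import, so page loads become simple lookups instead of per-cell COUNT queries. Everything here is illustrative rather than from my real setup: the `message_counts` name, the sample addresses, and the in-memory database standing in for the real file.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # stand-in for the real database file
curs = conn.cursor()
curs.execute("""CREATE TABLE messages (
    id TEXT PRIMARY KEY ON CONFLICT REPLACE,
    date TEXT, hour INTEGER, sender TEXT, size INTEGER,
    origin TEXT, destination TEXT, relay TEXT, day TEXT)""")

# A few sample rows in the same shape as the mail-log extract (addresses made up).
rows = [
    ('A1', 'Jan 29 06:04:47', 6, 'sender@example.com', 100, 'o', 'd', 'r', 'Jan 29'),
    ('A2', 'Jan 29 06:15:00', 6, 'sender@example.com', 200, 'o', 'd', 'r', 'Jan 29'),
    ('A3', 'Jan 30 09:00:00', 9, 'sender@example.com', 300, 'o', 'd', 'r', 'Jan 30'),
]
curs.executemany('INSERT INTO messages VALUES (?,?,?,?,?,?,?,?,?)', rows)

# Rebuild the summary table after each import; one GROUP BY pass
# replaces the senders * days * 24 individual COUNT queries.
curs.execute('CREATE TABLE IF NOT EXISTS message_counts '
             '(sender TEXT, day TEXT, hour INTEGER, ct INTEGER)')
curs.execute('DELETE FROM message_counts')
curs.execute('INSERT INTO message_counts '
             'SELECT sender, day, hour, COUNT(id) FROM messages '
             'GROUP BY sender, day, hour')
conn.commit()

# The CGI then does cheap lookups against the small summary table.
curs.execute('SELECT ct FROM message_counts WHERE sender=? AND day=? AND hour=?',
             ('sender@example.com', 'Jan 29', 6))
print(curs.fetchone()[0])  # prints 2
```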
Edit: Using the GROUP BY answer provided below, I was able to get all the data I needed from the database in one query. I switched to Perl, since Python's nested dicts didn't work very well for the way I needed to approach this (building a set of HTML tables in a specific way). Here's a snippet of the revised code:
use DBI;
use Date::Parse;     # str2time
use Date::Format;    # time2str

# Connection details assumed; adjust to the real database file.
my $db = DBI->connect("dbi:SQLite:dbname=messages.db", "", "", { RaiseError => 1 });

my %data;
my $rows = $db->selectall_arrayref(
    "SELECT COUNT(id),sender,day,hour FROM messages GROUP BY sender,day,hour ORDER BY sender,day,hour");
for my $row (@$rows) {
    my ($ct, $se, $dy, $hr) = @$row;
    $data{$se}{$dy}{$hr} = $ct;
}
# Hash keys come back unordered, so sort here; the SQL ORDER BY
# doesn't survive the trip through the nested hash.
for my $se (sort keys %data) {
    print "Sender: $se\n";
    for my $dy (sort keys %{$data{$se}}) {
        print "Day: ", time2str('%a', str2time("$dy 2010")), " $dy\n";
        for my $hr (sort { $a <=> $b } keys %{$data{$se}{$dy}}) {
            print "Hour: $hr = " . $data{$se}{$dy}{$hr} . "\n";
        }
    }
    print "\n";
}
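For what it's worth, the same single-GROUP-BY approach is workable in Python too; `collections.defaultdict` handles the nested-dict bookkeeping automatically. This is a sketch against an in-memory database with a couple of made-up rows, not my production schema:

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(':memory:')  # stand-in for the real database
curs = conn.cursor()
curs.execute('CREATE TABLE messages (id TEXT PRIMARY KEY, sender TEXT, day TEXT, hour INTEGER)')
curs.executemany('INSERT INTO messages VALUES (?,?,?,?)', [
    ('A1', 'sender@example.com', 'Jan 29', 6),
    ('A2', 'sender@example.com', 'Jan 29', 6),
    ('A3', 'sender@example.com', 'Jan 30', 9),
])

# data[sender][day][hour] = count, filled from a single GROUP BY query.
data = defaultdict(lambda: defaultdict(dict))
curs.execute('SELECT COUNT(id), sender, day, hour FROM messages '
             'GROUP BY sender, day, hour ORDER BY sender, day, hour')
for ct, se, dy, hr in curs.fetchall():
    data[se][dy][hr] = ct

# Walk the nested structure the same way the Perl snippet does.
for se in sorted(data):
    print('Sender: %s' % se)
    for dy in sorted(data[se]):
        print('  Day: %s' % dy)
        for hr in sorted(data[se][dy]):
            print('    Hour: %s = %s' % (hr, data[se][dy][hr]))
```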
What once executed in about 28.024s now takes 0.415s!