views:

119

answers:

2

I am trying to use sqlite (sqlite3) for a project to store hundreds of thousands of records (would like sqlite so users of the program don't have to run a [my]sql server).

I have to update hundreds of thousands of records sometimes to enter left right values (they are hierarchical), but have found the standard

update table set left_value = 4, right_value = 5 where id = 12340;

to be very slow. I have tried surrounding every thousand or so with

begin;
....
update...
update table set left_value = 4, right_value = 5 where id = 12340;
update...
....
commit;

but again, very slow. Odd, because when I populate it with a few hundred thousand (with inserts), it finishes in seconds.

I am currently trying to test the speed in python (the slowness is at the command line and python) before I move it to the C++ implementation, but right now this is way to slow and I need to find a new solution unless I am doing something wrong. Thoughts? (would take open source alternative to SQLite that is portable as well)

+3  A: 

Create an index on table.id

create index table_id_index on table(id)
Doug Currie
Wow, that seems to have worked. Why and where can I read more about it?
blackrim
An overview of the SQLite3 optimizer is at http://www.sqlite.org/optoverview.html
Doug Currie
@blackrim: Same page: http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html#indexes
Brian R. Bondy
+2  A: 

Other than making sure you have an index in place, you can checkout the SQLite Optimization FAQ.

Using transactions can give you a very big speed increase as you mentioned and you can also try to turn off journaling.

Example 1:

2.2 PRAGMA synchronous

The Boolean synchronous value controls whether or not the library will wait for disk writes to be fully written to disk before continuing. This setting can be different from the default_synchronous value loaded from the database. In typical use the library may spend a lot of time just waiting on the file system. Setting "PRAGMA synchronous=OFF" can make a major speed difference.

Example 2:

2.3 PRAGMA count_changes

When the count_changes setting is ON, the callback function is invoked once for each DELETE, INSERT, or UPDATE operation. The argument is the number of rows that were changed. If you don't use this feature, there is a small speed increase from turning this off.

Brian R. Bondy
Found that page, but those didn't seem to help too much. Maybe shaving a second here or there, but only about 1% or less.
blackrim