views:

63

answers:

3

I have a database that is filled with information for films, which is (in turn) read in to the database from an XML file on a webserver.

What happens is the following:

Gather/Parse XML and store film info as objects
Begin Statement
For every film object we found:
    Check to see if record for film exists in database
    If no film record, write data for film
Commit Statement

Currently I just test for the existence of a film using (the very basic):

SELECT film_title FROM film WHERE film_id = ?

If that returns a row, then the film exists, if not then I need to add it...

I have also tried using

INSERT INTO film (film_id, film_title, film_mStar, film_dStar, film_date) SELECT ?, ?, ?, ?, ?, ?, ? WHERE NOT EXISTS (SELECT 1 FROM film WHERE film_id = ?)

To the exact same effect in terms of processing time...

The only problem is, is that there are many many hundreds of records in the database (lots of films!) and because it has to check for the existence of a film in the database before it can write it, the whole process ends up taking quite a while (about 27 seconds for 210 films)

Is there a more efficient method of doing this, or just any suggestions in general?

Programming Language is Objective-C, database is in sqlite3, platform is iPhone/iPad/iPodTouch

Thanks, Dwaine

A: 

You could try adding an index on film_id to speed up the search for existing films.

Something like this should do the job:

CREATE INDEX IF NOT EXISTS film_index_film_id ON film (film_id)

If your unique key is actually something else (not just film_id, but maybe title and year) then you should include all the columns that you use in your index.

Thomas Müller
A: 

If film_id is unique per film, you could consider something like:

SELECT film_id FROM film;

at startup, and cache all the ids in an array or similar. Then when you read a new record from your XML, check for the existence of the film_id in the array and if it's not there, add it both to the array and the database. That way you only SELECT once per file rather than once per film.

Edit: Another thing to consider would be transactions. At startup:

BEGIN TRANSACTION;

and after all your insertions:

COMMIT;
chrisbtoo
A: 

Create an array of booleans for filmIDs (it can just be a bit-map) Then set the bits to false/true for each corresponding address in the array. Then instead of using sql to check if it exists, check it against the array. On insertion, add the filmid = true to the array.

Alex O