views:

577

answers:

3

I'm inserting a batch of 100 records, each containing a dictonary containing arbitrarily long HTML strings, and by god, it's slow. On the iphone, the runloop is blocking for several seconds during this transaction. Is my only recourse to use another thread? I'm already using several for acquiring data from HTTP servers, and the sqlite documentation explicitly discourages threading with the database, even though it's supposed to be thread-safe... Is there something I'm doing extremely wrong that if fixed, would drastically reduce the time it takes to complete the whole operation?

    NSString* statement;
    statement = @"BEGIN EXCLUSIVE TRANSACTION";
    sqlite3_stmt *beginStatement;
    if (sqlite3_prepare_v2(database, [statement UTF8String], -1, &beginStatement, NULL) != SQLITE_OK) {
        printf("db error: %s\n", sqlite3_errmsg(database)); 
        return;
    }
    if (sqlite3_step(beginStatement) != SQLITE_DONE) {
        sqlite3_finalize(beginStatement);
        printf("db error: %s\n", sqlite3_errmsg(database)); 
        return;
    }

    NSTimeInterval timestampB = [[NSDate date] timeIntervalSince1970];
    statement = @"INSERT OR REPLACE INTO item (hash, tag, owner, timestamp, dictionary) VALUES (?, ?, ?, ?, ?)";
    sqlite3_stmt *compiledStatement;
    if(sqlite3_prepare_v2(database, [statement UTF8String], -1, &compiledStatement, NULL) == SQLITE_OK)
    {
        for(int i = 0; i < [items count]; i++){
            NSMutableDictionary* item = [items objectAtIndex:i];
            NSString* tag       = [item objectForKey:@"id"];
            NSInteger hash      = [[NSString stringWithFormat:@"%@%@", tag, ownerID] hash];
            NSInteger timestamp = [[item objectForKey:@"updated"] intValue];
            NSData *dictionary  = [NSKeyedArchiver archivedDataWithRootObject:item];

            sqlite3_bind_int(   compiledStatement, 1, hash);
            sqlite3_bind_text(  compiledStatement, 2, [tag UTF8String], -1, SQLITE_TRANSIENT);
            sqlite3_bind_text(  compiledStatement, 3, [ownerID UTF8String], -1, SQLITE_TRANSIENT);
            sqlite3_bind_int(   compiledStatement, 4, timestamp);
            sqlite3_bind_blob(  compiledStatement, 5, [dictionary bytes], [dictionary length], SQLITE_TRANSIENT);

            while(YES){
                NSInteger result = sqlite3_step(compiledStatement);
                if(result == SQLITE_DONE){
                    break;
                }
                else if(result != SQLITE_BUSY){
                    printf("db error: %s\n", sqlite3_errmsg(database)); 
                    break;
                }
            }
            sqlite3_reset(compiledStatement);
        }
        timestampB = [[NSDate date] timeIntervalSince1970] - timestampB;
        NSLog(@"Insert Time Taken: %f",timestampB);

        // COMMIT
        statement = @"COMMIT TRANSACTION";
        sqlite3_stmt *commitStatement;
        if (sqlite3_prepare_v2(database, [statement UTF8String], -1, &commitStatement, NULL) != SQLITE_OK) {
            printf("db error: %s\n", sqlite3_errmsg(database)); 
        }
        if (sqlite3_step(commitStatement) != SQLITE_DONE) {
            printf("db error: %s\n", sqlite3_errmsg(database)); 
        }

        sqlite3_finalize(beginStatement);
        sqlite3_finalize(compiledStatement);
        sqlite3_finalize(commitStatement);
A: 

Did you try the same as your code but with "BEGIN" and "COMMIT" instead of "BEGIN EXCLUSIVE TRANSACTION" and "COMMIT TRANSACTION" ?

I'm simply using BEGIN and COMMIT and it's pretty much faster than committing for each transaction so I guess it's working with those keywords.

http://www.sqlite.org/lang_transaction.html

yonel
I just tried this. I can discern no noticable difference in performance. The "EXCLUSIVE" keyword might make a difference, but isn't BEGIN TRANSACTION just an alias for BEGIN?
akaii
:( Hum, I couldn't say if the two keywords are the same.
yonel
Hrmm, supposing the effect is different, the operation is still pretty heavy all the same. You said it was much faster... how many records are you inserting into the table at a given time?
akaii
It was between 1500/2000, I don't have the numbers in mind, but as far as I remember the ratio with/without the begin/commit was quite high.
yonel
A: 

The thing that you need to be aware of is that the SQLite documentation warns you away from accessing/writing to the database from multiple threads. As long as you access the database from a single thread, you'll be fine. It doesn't matter if that thread is your program's main thread or some other thread.

Keep in mind that compiled version of SQLite on the iPhone has its threading mode set to "multi-thread" which, according to the documentation, "disables mutexing on database connection and prepared statement objects. The application is responsible for serializing access to database connections and prepared statements but other mutexes are enabled so that SQLite will be safe to use in a multi-threaded environment as long as no two threads attempt to use the same database connection at the same time." So, if you do decide to put this transaction on another thread, be careful of what else you try to do with the database.

That being said, I'd first follow Yonel's advice and switch to "BEGIN" AND "COMMIT". If that doesn't help, move the transaction to another thread. Working with "blobs" can be pretty slow, from what I've heard.

Anthony
1) If I remove the long HTML string from the dictionary blob, and place it into a text column instead (the blob would be much smaller in this case, all the other data in the dictionary are fairly short strings/arrays of strings), do you think this would help significantly?2) While putting all db transactions in a thread would keep my UI thread from blocking, it would not actually help shorten the actual transaction time (in fact, it would probably get worse, since I'd need to manage and pass messages between threads). This is something I want to avoid if at all possible.
akaii
I do think that it would be faster to put the HTML strings in a text column. I've done the same thing in a program I've been working on, and the insert completes much faster than yours does. Give it a try. I'd love to hear how it works out for you.You are right to say that putting the work in a separate thread won't speed up the process, but it will certainly improve the responsiveness of the UI, which is incredibly important.
Anthony
I intend to do that, thanks. I plan to a) shrink my dictionary by discarding data I don't need to store, and b) split my dictionary into text/varchars at storage time and reassemble it when i need it... ill still need for some blobs for a couple of arrays contained within my dictionary, but theyre far far smaller than the huge html string that occupies most of the dictionary's space. hopefully, this will improve things.
akaii
Would calling NSInvocationOperation be okay so long as I open a new connection each time I want to carry out a transaction? Particularly, I'm probably going to carry out deletes and inserts/updates with NSInvocationOperation, fire and forget. Selects...are another matter I'll have to deal with seperately, since I actually need the results after the operation is finished.
akaii
It should work, but be careful not to access the database more than once from different threads. If you do try to create two or more connections to the database at the same time, your application will crash.
Anthony
Things have mostly worked out. My only problem is retrieving data now...I really have no choice but to block there, since I can't proceed to the next view without the data ready.
akaii
A: 

I see a lot of cases where developers new to the iPhone believe the code is slow when it is simply a case of asking lightweight hardware to do to much processing. Processing several hundred (thousands?) of "arbitrarily long HTML strings" might to heavy a task for the iPhone to carry out in a timely fashion.

Remember that the iPhone isn't a very powerful piece of hardware. It pulls off all the nifty graphics with dedicated hardware whose computational power you can't access for other task. Even if you optimize the code it maybe way slower than you would intuitively expect based on your experience with full fledged laptops and desktops.

Instead of guessing where the bottleneck is, I suggest that you profile the code with Instruments (or even just using NSLog with timestamps) to see exactly where the code is spending most of its time.

TechZen
I have used timestamps... via NSTimeIntervals. There are other tasks being done that are fairly weighty as well and can take up to several seconds, but all of those are either asynchronous or packed into another thread. I assure you, when I say 5-17 seconds, I am not pulling those numbers out of air. I am getting those from calling NSLog on my timestamps. I have also tried Instruments the task that consumes the most time on the main thread are my inserts. I specifically put them into this thread because the documentation discourages multithreading with the DB.
akaii
It sounds like you may indeed be overloading the hardware. I would try testing with lighter weight data i.e. the same number of records but with significantly smaller blobs. If it speeds up significantly then your just trying to do to much.
TechZen