I have a file with contents something like this:

INSERT INTO table VALUES (NULL,'° F','Degrees Fahrenheit');
INSERT INTO table VALUES (NULL,'° C','Degrees Celsius');

Now, to parse this, I have something like this:

NSString *sql = [NSString stringWithContentsOfFile:filename];

Printing this string to the console looks correct. Then, I want to make an actual insert statement out of it:

const char *sqlString = [sql UTF8String];
const char *endOfString;
if (sqlite3_prepare_v2(db, sqlString + nextStatementStart, -1, &stmt, &endOfString) != SQLITE_OK) {
  return NO;
}

At this point, checking the query still returns the correct result. That is, sqlite3_sql(stmt) returns INSERT INTO table VALUES (NULL,'° F','Degrees Fahrenheit');

So then I run it with sqlite3_step(stmt);

At this point, viewing the database reveals this:

1|¬∞ F|Degrees Fahrenheit

I'm not using any _16 functions anywhere (like sqlite3_open16).

Where is the encoding issue? How do I fix this?

+1  A: 

stringWithContentsOfFile: is deprecated as of Mac OS X 10.4 (not sure about iPhone). Use stringWithContentsOfFile:encoding:error: instead, which lets you specify the encoding explicitly.

file contains:

CREATE TABLE myvalues (foo TEXT, bar TEXT, baz TEXT);
INSERT INTO myvalues VALUES (NULL,'° F','Degrees Fahrenheit');

test.m contains (no error handling provided):

#import <Foundation/Foundation.h>
#include <sqlite3.h>

int main(int argc, char** argv)
{
    NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];

    // Read the file as UTF-8 explicitly.
    NSString* s = [NSString stringWithContentsOfFile:@"file"
                                            encoding:NSUTF8StringEncoding
                                               error:nil];
    NSLog(@"s: %@", s);

    // Run every statement in the file against file.db.
    sqlite3* handle;
    sqlite3_open("file.db", &handle);
    sqlite3_exec(handle, [s UTF8String], NULL, NULL, NULL);
    sqlite3_close(handle);

    [pool release];
    return 0;
}

Then dumping:

% sqlite3 file.db
SQLite version 3.6.12
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .dump
BEGIN TRANSACTION;
CREATE TABLE myvalues (foo TEXT, bar TEXT, baz TEXT);
INSERT INTO "myvalues" VALUES(NULL,'° F','Degrees Fahrenheit');
COMMIT;
nall
Thank you. Do you have any idea why it would work out using the second method and not the first, even though both strings print out the exact same in the debug console?
Ed Marty
I believe the first method reads the file as an ASCII string, so the degree symbol comes in as two characters: 0xC2 (negation sign) and 0xB0 (infinity). When you then call UTF8String, each of those two characters is encoded separately. If you read the file as UTF-8 instead, the same two bytes are read as one character: 0xC2 0xB0 (the degree symbol). Search for degree here: http://www1.tip.nl/~t876506/utf8tbl.html
nall
Strict ASCII is only 0x00..0x7F (although the last time I tried NSASCIIStringEncoding, Cocoa behaved the same as NSISOLatin1StringEncoding, IIRC—but never rely on that). It looks like it's reading it as MacRoman.
Peter Hosey