I'm trying to get the contents of a CSV file into an array. When I've done this before, I had one record per line and scanned each line with scanUpToCharactersFromSet:intoString:, passing newlineCharacterSet as the character set:

while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] 
                                   intoString:&line])

Now, I'm working with a file where many of the entries themselves contain newline characters. I've tried adding a unique character to the end of each record (a * character) but my loop only runs once. Is there something which is making the while loop break that I don't know about? Here's the code I'm using now:

NSError *error;
NSString *data = [[NSString alloc] initWithContentsOfFile:[[self delegate] filePath] encoding:NSUTF8StringEncoding error:&error];
NSScanner *lineScanner = [NSScanner scannerWithString:data];
NSString *line = nil;

// Start parsing the CSV file
while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"]
                    intoString:&line]) {
    NSArray *elements = [line componentsSeparatedByString:@","];
    NSLog(@"Name: %@", [elements objectAtIndex:1]);
}

Edit: Thanks to Peter's answer below, I found that my scanner was stuck at the * character. I added this line in the loop:

[lineScanner scanCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:NULL];

and now it's working like it should.

A: 

I think the while condition is wrong. According to the String Programming Guide, it should be something like:

while ([lineScanner isAtEnd] == NO) {
    [lineScanner scanUpToCharactersFromSet:[NSCharacterSet characterSetWithCharactersInString:@"*"] intoString:&line];
    // ...
}
Laurent Etiemble
As long as there's anything yet to be scanned, `isAtEnd` will not return `YES`. Thus, that's an infinite loop, since it never scans past the end of the first line. See my answer.
Peter Hosey
Thanks for the point.
Laurent Etiemble
A: 

Let's go through one pass at a time:

  1. First:

    while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line]) {
    

    The scanner puts everything before the line break into line. It advances up to the newline but does not scan past it.

  2. Second:

    while ([lineScanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line]) {
    

    The scanner is already on a line break, so it scans no characters. As documented, since it scanned no characters, it returns NO. Your loop terminates.
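The same two passes can be reproduced in isolation with the * delimiter from the question. This is only a minimal sketch: `PassesWithoutConsumingDelimiter` is a hypothetical helper name, not something from the original post. It uses * rather than a newline because a scanner with default settings skips whitespace and newlines before each scan, which is consistent with the asker's report that the one-record-per-line loop worked.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper for illustration: counts how many times the loop body
// runs when the delimiter is never scanned past.
static NSUInteger PassesWithoutConsumingDelimiter(NSString *text) {
    NSCharacterSet *delimiter = [NSCharacterSet characterSetWithCharactersInString:@"*"];
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSString *record = nil;
    NSUInteger passes = 0;

    // The first call stops on the first '*'. The next call starts on that same
    // '*', scans no characters, and returns NO, so the loop ends.
    while ([scanner scanUpToCharactersFromSet:delimiter intoString:&record]) {
        passes++;
    }
    return passes;
}
```

For an input such as @"a,b*c,d*", the body should run once, for the record "a,b" only.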

The solution is to scan the line break at the end of the loop, to get the scanner past it. You can pass NULL for the output parameter, assuming you don't care what the line break was.

This is correct behavior: If you did/do care what the characters you scanned up to were, this lets you obtain them. That would be more difficult if NSScanner scanned past the characters automatically.
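A minimal sketch of that fix, applied to the * delimiter from the question, might look like the following. `RecordsFromString` is a hypothetical helper name, and the sample input in the note after it is made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper for illustration: splits text into records that each
// end with the '*' marker the asker appended.
static NSArray *RecordsFromString(NSString *text) {
    NSCharacterSet *delimiter = [NSCharacterSet characterSetWithCharactersInString:@"*"];
    NSScanner *scanner = [NSScanner scannerWithString:text];
    NSMutableArray *records = [NSMutableArray array];
    NSString *record = nil;

    while ([scanner scanUpToCharactersFromSet:delimiter intoString:&record]) {
        [records addObject:record];
        // Scan past the delimiter so the next pass starts on the next record.
        // NULL is passed because the delimiter itself is not needed.
        [scanner scanCharactersFromSet:delimiter intoString:NULL];
    }
    return records;
}
```

Called with @"1,Alice,\"two\nlines\"*2,Bob,plain*", this should return two records, and the newline inside the first record stays part of that record. Two caveats of this sketch: scanCharactersFromSet:intoString: consumes a run of consecutive * characters at once, so empty records are dropped, and input that begins with * ends the loop immediately.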

Peter Hosey
Thanks for the answer, Peter. The loop that used `[NSCharacterSet newlineCharacterSet]` was working OK with a CSV file with one record per line. With the multi-line record CSV, I put a * at the end of each record and tried `[NSCharacterSet characterSetWithCharactersInString:@"*"]`, but it failed, presumably because the scanner was stuck at the * character. I thought about your answer and added the code above. Now it works. Thanks.
nevan