views:

46

answers:

1

Hello all (but mostly Dave),

I am playing around with Dave DeLong's excellent CHCSVParser for Objective-C with an extremely long .CSV file and am running into some trouble using it. I would use the arrayWithContentsOfCSVFile method, but I'm running the code on an iPhone and parsing the whole file into memory would take more memory than is available.

In my code below, the parser opens the document and calls the delegate methods perfectly, but where in the delegate do I stop after each line and access the data (to create and save a Core Data object to the data store)? I assume that would be in - (void) parser:(CHCSVParser *)parser didEndLine:(NSUInteger)lineNumber, but how do I get an NSArray (or whatever) of the data from the parser when it's done with the line?

Here is my code so far:

//
// The code from a method in my view controller:
//
NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *documentsDirectory = [paths objectAtIndex:0];
NSFileManager *manager = [NSFileManager defaultManager];
NSError *err = nil;
NSArray *fileList = [manager contentsOfDirectoryAtPath:documentsDirectory error:&err];
NSString *fileName = [fileList objectAtIndex:1];
NSURL *inputFileURL = [NSURL fileURLWithPath: [documentsDirectory stringByAppendingPathComponent:fileName]];


NSStringEncoding encoding = 0;
CHCSVParser *p = [[CHCSVParser alloc] initWithContentsOfCSVFile:[inputFileURL path] usedEncoding:&encoding error:nil];
[p setParserDelegate:self];
[p parse];
[p release];

...

#pragma mark -
#pragma mark CHCSVParserDelegate methods

- (void) parser:(CHCSVParser *)parser didStartDocument:(NSString *)csvFile {
    NSLog(@"Parser started!");
}

- (void) parser:(CHCSVParser *)parser didStartLine:(NSUInteger)lineNumber {
    //NSLog(@"Parser started line: %i", lineNumber);
}

- (void) parser:(CHCSVParser *)parser didEndLine:(NSUInteger)lineNumber {
    NSLog(@"Parser ended line: %i", lineNumber);
}

- (void) parser:(CHCSVParser *)parser didReadField:(NSString *)field {
    //NSLog(@"Parser didReadField: %@", field);
}

- (void) parser:(CHCSVParser *)parser didEndDocument:(NSString *)csvFile {
    NSLog(@"Parser ended document: %@", csvFile);
}

- (void) parser:(CHCSVParser *)parser didFailWithError:(NSError *)error {
    NSLog(@"Parser failed with error: %@ %@", [error localizedDescription], [error userInfo]);
}

Thanks!

+2  A: 

Hi Neal,

I'm glad to see that my code is proving useful! :)

CHCSVParser is similar in behavior to an NSXMLParser, in that every time it finds something interesting, it's going to let you know via one of the delegate callbacks. However, if you choose to ignore the data that it gives you in the callback, then it's gone. These parsers (CHCSVParser and NSXMLParser) are pretty stupid. They just know the format of the stuff they're trying to parse, but don't really do much beyond that.

So the answer, in a nutshell, is "you have to save it yourself". If you look at the code for the NSArray category, you'll see in the .m file that it's using a simple NSObject subclass as the parser delegate, and that subclass is what's aggregating the fields into an array, and then adding that array to the overall array. You'll need to do something similar.

Example delegate:

@interface CSVParserDelegate : NSObject <CHCSVParserDelegate> {
  NSMutableArray * currentRow;
}
@end

@implementation CSVParserDelegate

- (void) parser:(CHCSVParser *)parser didStartLine:(NSUInteger)lineNumber {
  currentRow = [[NSMutableArray alloc] init];
}
- (void) parser:(CHCSVParser *)parser didReadField:(NSString *)field {
  [currentRow addObject:field];
}
- (void) parser:(CHCSVParser *)parser didEndLine:(NSUInteger)lineNumber {
  NSLog(@"finished line! %@", currentRow);
  [self doSomethingWithLine:currentRow];
  [currentRow release], currentRow = nil;
}
@end

However, I could be convinced to modify the behavior of the parser to aggregate the row itself, but if I go down that route, why not just have the parser aggregate the entire file? (Answer: it shouldn't)

Dave DeLong
Thanks for the answer - I couldn't tell if the parser did this work for you (through `currentChunk` or some other way)! It would be nice not to have to manually do the aggregation, but for most apps CSV parsing is only really used in a few places for I/O.If someone wanted to write that little bit of code once, would it be possible to subclass or write a category on the CHCSVParserDelegate?
Neal L
@Neal `CHCSVParserDelegate` is a protocol, so you can't subclass it. In order to get this behavior to work, you'd have to alter `CHCSVParser` directly or subclass it. The simplest answer is to aggregate the line yourself (like I do in the answer)
Dave DeLong
Didn't think so... Well thanks again very much for the wonderful work here! Actually a minor correction to your code... in `didEndLine` we need to change `currentLine` to `currentRow`. (i'd change it but don't have enough XP ;))
Neal L