Normally I just use TStringList.CommaText, but this won't work when a given field spans multiple lines. Basically I need a CSV processor that conforms to RFC 4180. I'd rather not have to implement the RFC myself.
Do you really need full RFC support? I can't count the number of times I've written a "CSV parser" in Perl or something similar. Split on commas and be done. The only problem comes when you need to respect quotes. If you do, write a "quotesplit" routine that looks for quotes and ensures they're balanced. Unless this CSV processor is the meat and potatoes of some application, I'm not sure it'll really be a problem.
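To make the quote problem concrete, here is a minimal sketch in Python (used only for illustration, not Delphi; the name quote_split is made up for this example). It ignores delimiters inside double quotes and rejects unbalanced quotes. It leaves the quote characters in place and does not unescape doubled quotes, so it is nowhere near the full RFC.

```python
def quote_split(line, delim=","):
    """Split on delim, ignoring delimiters inside double-quoted sections."""
    fields, current, in_quotes = [], [], False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes   # toggle on every quote character
            current.append(ch)          # quotes are kept in the output
        elif ch == delim and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    if in_quotes:
        raise ValueError("unbalanced quotes")
    fields.append("".join(current))
    return fields


naive = 'a,"b,c",d'.split(",")    # breaks the quoted field in two
aware = quote_split('a,"b,c",d')  # keeps "b,c" together
```

The naive split yields four pieces because it cuts the quoted field at its embedded comma, while the quote-aware version yields three.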
On the other hand, I really don't think fully implementing the RFC is that complex. It's a relatively short RFC compared to things like HTTP, SMTP, IMAP, ...
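As a rough illustration of how small the core of it is, here is a sketch of a character-by-character parser in Python (for illustration only; parse_csv and its parameters are hypothetical names, not from any library). It assumes the whole input is in memory, treats doubled quotes inside a quoted field as one literal quote, and keeps commas and line breaks that appear inside quotes. It is more lenient than the RFC: it accepts bare LF or CR as a record separator and does not complain about quotes in the middle of an unquoted field.

```python
def parse_csv(text, delim=","):
    """Parse CSV text into a list of records (each a list of field strings)."""
    records, record, field = [], [], []
    in_quotes = False
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if in_quotes:
            if ch == '"':
                if i + 1 < n and text[i + 1] == '"':
                    field.append('"')   # doubled quote -> one literal quote
                    i += 1
                else:
                    in_quotes = False   # closing quote
            else:
                field.append(ch)        # commas and line breaks are data here
        elif ch == '"':
            in_quotes = True
        elif ch == delim:
            record.append("".join(field))
            field = []
        elif ch in "\r\n":
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1                  # treat CRLF as a single line break
            record.append("".join(field))
            records.append(record)
            record, field = [], []
        else:
            field.append(ch)
        i += 1
    if in_quotes:
        raise ValueError("unterminated quoted field")
    if field or record:                 # last record without a trailing newline
        record.append("".join(field))
        records.append(record)
    return records


rows = parse_csv('a,"line1\r\nline2",c\r\n"say ""hi""",2,3\r\n')
```

Here rows holds two records, and the second field of the first record still contains its embedded line break.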
In Perl, a decent quotesplit() I wrote is:
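    sub quotesplit {
        my ($regex, $s, $maxsplits) = @_;
        my @split;
        my $quotes = "\"'";

        die("usage: quotesplit(qr/.../,'string...'), // instead of qr//?\n")
            if scalar(@_) < 2;

        my $lastpos = 0;    # start of the piece that has not been pushed yet
        while (1) {
            # pos() is undef until the first successful //g match
            my $pos = defined(pos($s)) ? pos($s) : 0;

            # Scan for the next delimiter or unescaped quote character.
            while ($s =~ m/($regex|(?<!\\)[$quotes])/g) {
                if ($1 =~ m/[$quotes]/) {
                    # Opening quote: skip ahead to the next unescaped quote.
                    $s =~ m/[^$quotes]*/g;
                    $s =~ m/(?<!\\)[$quotes]/g;
                }
                else {
                    # Delimiter outside quotes: emit the text before it.
                    push @split, substr($s, $pos, pos($s) - $pos - length($1));
                    last;
                }
            }

            # Sanity check: the match position must never move backwards.
            if (defined(pos($s)) and $lastpos > pos($s)) {
                die(sprintf("quotesplit() issue: lastpos %s > pos %s\n",
                    $lastpos, pos($s)));
            }

            if (defined($maxsplits) && scalar(@split) == ($maxsplits - 1)) {
                # Split limit reached: the rest of the string is the last piece.
                push @split, substr($s, pos($s));
                last;
            }
            elsif (not defined(pos($s))) {
                # No more delimiters: the remainder is the last piece.
                push @split, substr($s, $lastpos);
                last;
            }
            $lastpos = pos($s);
        }
        return @split;
    }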
Did you try setting Delimiter := ';' and using DelimitedText instead of CommaText?
By the way, that RFC makes no sense at all... it's absurd to issue a Request For Comments on CSV...
Here is my CSV parser (maybe not to the RFC, but it works fine). Keep calling it on a supplied string; each time it gives you the next CSV field. I don't believe it has any problems with multiple lines.
    function CSVFieldToStr(
      var AStr : string;
      ADelimChar : char = Comma ) : string;
    { Returns the next CSV field str from AStr, deleting it from AStr,
      with delimiter.
      Comma, StripQuotes, ReplaceAllSubStrs and MaxStrLen are helpers
      that are not shown in this post. }
    var
      bHasQuotes : boolean;

      function HandleQuotes( const AStr : string ) : string;
      begin
        Result := Trim(AStr);
        If bHasQuotes then
        begin
          Result := StripQuotes( Result );
          { A doubled quote inside a quoted field stands for one quote }
          ReplaceAllSubStrs( '""', '"', Result );
        end;
      end;

    var
      bInQuote : boolean;
      I : integer;
      C : char;
    begin
      bInQuote := False;
      bHasQuotes := False;
      For I := 1 to Length( AStr ) do
      begin
        C := AStr[I];
        If C = '"' then
        begin
          bHasQuotes := True;
          bInQuote := not bInQuote;
        end
        else
          If not bInQuote then
            If C = ADelimChar then
            begin
              { Delimiter outside quotes: split off the field before it }
              Result := HandleQuotes( Copy( AStr, 1, I-1 ));
              AStr := Trim(Copy( AStr, I+1, MaxStrLEn ));
              Exit;
            end;
      end;
      { No delimiter left: the rest of the string is the last field }
      Result := HandleQuotes(AStr);
      AStr := '';
    end;
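For anyone who wants to see the calling pattern without Delphi, here is a rough transliteration into Python (for illustration only, not the code above; next_csv_field is a made-up name). It assumes StripQuotes removes one surrounding pair of double quotes, which is a guess since that helper is not shown. Because Python strings are immutable, it returns the remaining text instead of modifying the argument.

```python
def next_csv_field(s, delim=","):
    """Return (field, rest): the first CSV field of s and the remaining text."""
    def unquote(raw, had_quotes):
        raw = raw.strip()
        if had_quotes:
            # Assumption: drop one surrounding pair of double quotes.
            if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
                raw = raw[1:-1]
            raw = raw.replace('""', '"')   # doubled quote -> one quote
        return raw

    in_quote = had_quotes = False
    for i, ch in enumerate(s):
        if ch == '"':
            had_quotes = True
            in_quote = not in_quote
        elif ch == delim and not in_quote:
            # Delimiter outside quotes: split here.
            return unquote(s[:i], had_quotes), s[i + 1:].strip()
    return unquote(s, had_quotes), ""      # no delimiter left


# Keep calling it until the input is used up.
rest = 'one, "two, with comma", "say ""hi"""'
fields = []
while rest:
    field, rest = next_csv_field(rest)
    fields.append(field)
```

Note that a loop like this, which stops when the remaining string is empty, silently drops a trailing empty field (as in 'a,'), so the caller has to handle that case if it matters.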