I am trying to parse a file that looks like this:
|| Column Header A || Column Header B || Column Header C ||CRLF
| Data A | Data B | Data C |CRLF
| Data A | Data B | Data C |CRLF
Here, "CRLF" represents a line break.
I had code that parsed this fine.
I first split the file into an array of lines:
string[] lines = fileString.Split(Environment.NewLine.ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
Then I parse each row into an array of column values.
First, I parse the header row using:
string Delimiter = "||";
string[] columns = line.Split(new string[] { Delimiter }, StringSplitOptions.RemoveEmptyEntries);
Then I parse the rest of the rows using:
string Delimiter = "|";
string[] columns = line.Split(new string[] { Delimiter }, StringSplitOptions.RemoveEmptyEntries);
This worked perfectly until I found a record that had a CRLF inside a field, which broke my parsing.
Can anyone think of a good way to parse this data that accounts for the fact that a field in a row may contain a CRLF? Here is an example:
|| Column Header A || Column Header B || Column Header C ||CRLF
| Data A | Data B | Data C |CRLF
| Data A | Data B CRLF Continued B | Data C |CRLF
The issue is that when I do the initial split to get the array of lines, I get 4 lines here instead of 3, because the last row shows up as two entries in that array.
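Here is a minimal, self-contained repro of the problem. The class name `Repro` and the helper `SplitLines` are placeholders made up for this example, and I spell out '\r' and '\n' as separators instead of using Environment.NewLine so that it behaves the same on any platform:

```csharp
using System;

static class Repro
{
    // Hypothetical helper: the same line split as in my code above,
    // but with the CR and LF separators written out explicitly.
    public static string[] SplitLines(string fileString)
    {
        return fileString.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        // The second sample from the question, with "CRLF" written as \r\n.
        string fileString =
            "|| Column Header A || Column Header B || Column Header C ||\r\n" +
            "| Data A | Data B | Data C |\r\n" +
            "| Data A | Data B \r\n Continued B | Data C |\r\n";

        string[] lines = SplitLines(fileString);

        // Prints 4: the last row is broken into two entries.
        Console.WriteLine(lines.Length);
        foreach (string line in lines)
        {
            Console.WriteLine("[" + line + "]");
        }
    }
}
```

The third and fourth entries are the two halves of the last row, so the per-row split on "|" then sees two partial rows instead of one complete row.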