ansaurus

Question

Regular Expression to break row with comma separated values into distinct rows

Answer 1

A:

stakx 2010-05-09 10:07:33

Damn... That's not good news!

Nick 2010-05-09 10:09:39

There is only one column with multiple values (the district), e.g. AB10, AB11, etc.

Nick 2010-05-09 10:19:55

Answer 2

A:

I agree that regexes are not the best way, but this should hopefully work if that's all you have available to you. (Apply it repeatedly until there are no more matches.)

Edit

Updated with the OP's final solution from the comments.

Find: (.+)\t([^,\s]+),([^\t]+)\t(.+)
Replace: \1\t\2\t\4\r\1\t\3\t\4
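For anyone who wants to try the "repeat until no more matches" idea outside an editor, here is a minimal Python sketch. It reuses the Find pattern above unchanged; the function name expand_row is made up for illustration, and the editor's \r line break is swapped for \n, which is an assumption about how you want the output rows separated. It also assumes the values are separated by a bare comma with no space after it, since the pattern expects a value to start right after the tab.

```python
import re

# Find pattern copied from the answer above.
PATTERN = re.compile(r"(.+)\t([^,\s]+),([^\t]+)\t(.+)")
# Same replacement, but with "\n" instead of the editor's "\r" (assumption).
REPLACEMENT = r"\1\t\2\t\4\n\1\t\3\t\4"


def expand_row(row):
    """Apply the find/replace repeatedly until nothing matches any more."""
    while True:
        expanded, count = PATTERN.subn(REPLACEMENT, row)
        if count == 0:
            return expanded
        row = expanded


if __name__ == "__main__":
    # Hypothetical sample row: name, districts, two trailing columns.
    print(expand_row("Aberdeen\tAB10,AB11,AB12\tScotland\tUK"))
```

Each pass peels the first value off the list, so a row with N values takes N-1 replacing passes plus a final pass that finds nothing.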

Martin Smith 2010-05-09 10:27:59

That works very well, thanks... This is a slight modification which worked a little better. Thanks! Find: "(.+)\t([^,\s]+),([^\t]+)\t(.+)" Replace: "\1\t\2\t\4\r\1\t\3\t\4" EDIT: Seems these comments can't contain markup...

Nick 2010-05-09 10:45:29

Yes, the penny just dropped that your fields were tab-delimited, and I had just come back to update my post. Glad you spotted it!

Martin Smith 2010-05-09 10:46:59

@Nick Re: markup in the comments, you can do a limited amount. See the accepted answer here: http://meta.stackoverflow.com/questions/4481/apply-markup-code-in-comments

Martin Smith 2010-05-09 10:49:19

Answer 3

A:

I agree with stakx that this doesn't sound like a good place for regexes.

I would instead write a small program which reads each line, splits the line into columns, splits each relevant column into a list of values, and then iterates over all combinations of those, outputting a line each time.

Assuming it's only that one column which can have multiple tokens, it would basically look like this:

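import sys

# Python version: reads tab-separated rows from stdin, writes to stdout.
for line in sys.stdin:
    if not line.strip():
        continue  # skip blank lines
    columns = line.rstrip('\n').split('\t')  # 0-based list, so indexes 0-3
    # The second column (index 1) holds the comma-separated values.
    for value in columns[1].split(','):
        # One output row per value; strip() drops a space after the comma.
        print('\t'.join([columns[0], value.strip(), columns[2], columns[3]]))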

If multiple columns can have multiple values, simply put another loop inside the for each.
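As a rough sketch of the multi-column case: in Python the nested loops can be collapsed with itertools.product, which yields every combination for you. The function name expand_combinations is made up for illustration, and the sketch assumes every column may hold a comma-separated list, so a column containing a literal comma would get split too.

```python
from itertools import product


def expand_combinations(line, sep=","):
    """Return one output row per combination of the separated values."""
    columns = line.rstrip("\n").split("\t")
    # Split every column; a single-valued column becomes a one-item list.
    choices = [[value.strip() for value in col.split(sep)] for col in columns]
    # product() behaves like one nested loop per column.
    return ["\t".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical row with two multi-valued columns.
    for row in expand_combinations("A,B\tx\t1,2"):
        print(row)
```

If only some columns should be split, you could restrict the split to those indexes and wrap the rest in one-item lists; the product call stays the same.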

Michael Madsen 2010-05-09 10:29:06

Yeah, a script would work too. I was just hoping that a regex would do the trick.

Nick 2010-05-09 10:46:55
