I have a class that splits a string of SQL commands on a batch separator (e.g. "GO") into a list of SQL commands, which are then run in turn.

...
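private static IEnumerable<string> SplitByBatchIndecator(string script, string batchIndicator)
{
    // Match the separator only when it sits alone on a line (surrounding whitespace ignored).
    // Regex.Escape keeps any regex metacharacters in the separator literal.
    string pattern = string.Concat("^\\s*", Regex.Escape(batchIndicator), "\\s*$");
    RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline;
    foreach (string batch in Regex.Split(script, pattern, options))
    {
        yield return batch.Trim();
    }
}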

My current implementation uses a Regex with yield, but I am not sure it is the "best" way. The requirements are:

  • It should be quick
  • It should handle large strings (some of my scripts are 10 MB in size, for example)
  • The hardest part (which the above code does not currently do) is to take quoted text into account

Currently the following SQL will incorrectly get split:

var batch = QueryBatch.Parse(@"-- issue...
insert into table (name, desc)
values('foo', 'if the
go
is on a line by itself we have a problem...')");

Assert.That(batch.Queries.Count, Is.EqualTo(1), "This fails for now...");

I have thought about a token-based parser that tracks whether a quote is currently open or closed, but I am not sure whether a Regex can do that.
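For illustration, the state-tracking idea might be sketched roughly as below. This is only a sketch under assumptions: `BatchSplitterSketch` and its `Split` method are hypothetical names, not part of the existing class, and it only knows about single-quoted strings, `--` line comments and `/* */` block comments (no bracketed or double-quoted identifiers, no nested comments). It walks the script line by line and treats a separator line as a split point only when that line starts outside any string or block comment.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSplitterSketch
{
    // Hypothetical helper: splits on separator lines that are not inside
    // a quoted string or a block comment.
    public static IEnumerable<string> Split(string script, string separator = "GO")
    {
        bool inString = false;        // currently inside '...'
        bool inBlockComment = false;  // currently inside /* ... */
        int batchStart = 0;
        int pos = 0;

        while (pos < script.Length)
        {
            int eol = script.IndexOf('\n', pos);
            int lineEnd = eol < 0 ? script.Length : eol;
            int next = eol < 0 ? script.Length : eol + 1;
            string line = script.Substring(pos, lineEnd - pos);

            bool startsClean = !inString && !inBlockComment;
            if (startsClean &&
                string.Equals(line.Trim(), separator, StringComparison.OrdinalIgnoreCase))
            {
                // Separator line: emit everything collected so far.
                yield return script.Substring(batchStart, pos - batchStart).Trim();
                batchStart = next;
            }
            else
            {
                // Ordinary line: update the quote/comment state.
                for (int i = 0; i < line.Length; i++)
                {
                    char c = line[i];
                    char n = i + 1 < line.Length ? line[i + 1] : '\0';
                    if (inBlockComment)
                    {
                        if (c == '*' && n == '/') { inBlockComment = false; i++; }
                    }
                    else if (inString)
                    {
                        // An escaped quote ('') closes and immediately reopens the string.
                        if (c == '\'') inString = false;
                    }
                    else if (c == '\'') inString = true;
                    else if (c == '-' && n == '-') break; // rest of line is a comment
                    else if (c == '/' && n == '*') { inBlockComment = true; i++; }
                }
            }
            pos = next;
        }

        // Whatever follows the last separator is the final batch.
        yield return script.Substring(batchStart).Trim();
    }
}
```

Like the Regex version, this yields every batch trimmed, including empty ones. The per-line `Substring` is kept for readability; for very large scripts the same state machine could scan by index instead.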

Any ideas!?

+2  A: 

You can track the opening and closing quotes using a Balancing Group Definition.

Also, a similar question was asked last year about splitting on whitespace as long as the whitespace wasn't contained in quotes. You might be able to adjust those answers to get where you're going.

E.Z. Hart
Looks promising, I'll give them a go soon, thanks, PK :-)
Paul Kohler
The 'balancing group definition' was close but not quite what I needed. The problem is that it needs pairs of quotes, so it does not cope well with, for example, a single quote inside a commented-out line of SQL. Nice method nonetheless!
Paul Kohler
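
One regex-only way around the unpaired-quote problem mentioned above is to let the pattern consume comments and string literals before it ever looks for the separator. The following is a rough sketch under assumptions, not a tested solution from this thread: `RegexBatchSplitterSketch` is a hypothetical name, the separator is hard-coded as GO, and only `--` comments, `/* */` comments and single-quoted strings are recognised.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexBatchSplitterSketch
{
    // Alternatives are tried left to right at each position, so a comment or
    // string literal is swallowed whole and a GO inside it is never seen.
    private static readonly Regex Token = new Regex(
        @"--[^\n]*" +                        // line comment (may contain a lone quote)
        @"|/\*.*?\*/" +                      // block comment
        @"|'(?:[^']|'')*'" +                 // string literal with '' escapes
        @"|(?<sep>^[ \t]*GO[ \t]*\r?$)",     // separator alone on a line
        RegexOptions.IgnoreCase | RegexOptions.Multiline |
        RegexOptions.Singleline | RegexOptions.Compiled);

    public static List<string> Split(string script)
    {
        var batches = new List<string>();
        int start = 0;
        foreach (Match m in Token.Matches(script))
        {
            // Comments and strings are matched only to skip over them.
            if (!m.Groups["sep"].Success) continue;
            batches.Add(script.Substring(start, m.Index - start).Trim());
            start = m.Index + m.Length;
        }
        batches.Add(script.Substring(start).Trim());
        return batches;
    }
}
```

Because the line-comment alternative comes first, a commented-out single quote is consumed as part of the comment and never opens a string.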