I have a class I use to "split" a string of SQL commands by a batch separator - e.g. "GO" - into a list of SQL commands that are run in turn etc.
...
private static IEnumerable<string> SplitByBatchIndecator(string script, string batchIndicator)
{
string pattern = string.Concat("^\\s*", batchIndicator, "\\s*$");
RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline;
foreach (string batch in Regex.Split(script, pattern, options))
{
yield return batch.Trim();
}
}
My current implementation uses a Regex
with yield
but I am not sure if it's the "best" way.
- It should be quick
- It should handle large strings (I have some scripts that are 10mb in size for example)
- The hardest part (that the above code currently does not do) is to take quoted text into account
Currently the following SQL will incorrectly get split:
var batch = QueryBatch.Parse(@"-- issue...
insert into table (name, desc)
values('foo', 'if the
go
is on a line by itself we have a problem...')");
Assert.That(batch.Queries.Count, Is.EqualTo(1), "This fails for now...");
I have thought about a token based parser that tracks the state of the open closed quotes but am not sure if Regex will do it.
Any ideas!?