I've got an IronPython script that executes a bunch of SQL statements against a SQL Server database. the statements are large strings that actually contain multiple statements, separated by the "GO" keyword. That works when they're run from sql management studio and some other tools, but not in ADO. So I split up the strings using the 2.5 "re" module like so:
splitter = re.compile(r'\bGO\b', re.IGNORECASE)
for script in splitter.split(scriptBlob):
if(script):
[... execute the query ...]
This breaks in the rare case that there's the word "go" in a comment or a string. How in the heck would I work around that? i.e. correctly parse this string into two scripts:
-- this is a great database script! go team go!
INSERT INTO myTable(stringColumn) VALUES ('go away!')
/*
here are some comments that go with this script.
*/
GO
INSERT INTO myTable(stringColumn) VALUES ('this is the next script')
EDIT:
I searched more and found this SQL documentation: http://msdn.microsoft.com/en-us/library/ms188037(SQL.90).aspx
As it turns out, GO must be on its own line as some answers suggested. However it can be followed by a "count" integer which will actually execute the statement batch that many times (has anybody actually used that before??) and it can be followed by a single-line comments on the same line (but not a multi-line, I tested this.) So the magic regex would look something like:
"(?m)^\s*GO\s*\d*\s*$"
Except this doesn't account for:
- a possible single-line comment (
"--"
followed by any character except a line break) at the end. - the whole line being inside a larger multi-line comment.
I'm not concerned about capturing the "count" argument and using it. Now that I have some technical documentation i'm tantalizingly close to writing this "to spec" and never having to worry about it again.