tags:

views:

37

answers:

2

I have the following regex:

Regex defineProcedureRegex = new Regex(@"\s*(\bcreate\b\s+\bprocedure\b\s+(?:|dbo\.))(\w+)\s+(?:|(.+?))(as\s+(?:.+?)\s+\bgo\b)\s*", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.RightToLeft | RegexOptions.Singleline);

that I am running against a SQL script file containing multiple "create procedure" statements. The format of the file is like this:

use databasename

create procedure dbo.procedure_name
    @param1 varchar(10) -- optional
as

-- do stuff

go


use databasename

create procedure dbo.another_procedure
    @param1 varchar(10) -- optional
as

-- do other stuff

go

The problem I have is that I need to match the first as, but only the last go. Since the procedure body may also contain as and go (within comments), I can't figure out how to do this reliably.

Any help/comments/advice?

A: 

Try this:

create procedure (?<schema>.*?)\.(?<name>\w+)(?<params>[\s\S]*?)?as[\s\S]*?go

Note I'm using a non-greedy pattern, i.e. *?

Rubens Farias
+2  A: 

Since the procedure body may also contain as and go (within comments)

(and within string literals and identifiers, yes).

You would have to do a greedy match up to go. But, that would match from the start of the first stored procedure in the source up to the end of the last one!

You could use a negative match to ensure that the greedy match didn't go over a ‘create procedure’ boundary:

(as\s+(?:(?!create\s+procedure).)+?\s+\bgo\b)

however this is still not watertight, because you might also have create procedure in a comment or string literal.

Conclusion: regex is an inadequate tool for parsing a complex, non-regular language such as SQL. You will need a proper SQL parser. This is not a simple job. See eg. this question.

bobince