I am trying to write a simple sql parser to break a sql statement into its basic parts. However, I'm having a problem with nested queries. An example illustrates best:
sql = "select * from Customers where id in (select customer_id from Orders where 1=1)"
Set re = New RegExp
re.IgnoreCase = True
re.Pattern = "^(.*)\swhere\s(.*)$"
re.Global = True
Set matches = re.Execute( sql )
If matches.count > 0 Then
Set submatches = matches(0).Submatches
where_part = Trim(submatches(1))
everything_else = Trim(submatches(0))
End If
Response.Write where_part & "<br>"
Response.Write everything_else & "<br>"
Here, I want the where_part to contain "id in (select customer_id from Orders where 1=1)" and everything_else to contain "select * from Customers". That is, I want it to be as greedy as possible, for the second (.*) and selfless for the first. Or put another way, I want the everything_else to contain no where's.
However, I get everything_else = "select * from Customers where id in (select customer_id from Orders" and where_part = "1=1)"
How can I remedy this without resorting to messy non-regexp string parsing?