I've been working on this for a few hours now and can't find any help on it. Basically, I'm trying to split a SQL string into its various parts (fields, from, where, having, groupBy, orderBy). I refuse to believe that I'm the first person ever to try this, so I'd like to ask the StackOverflow community for some advice. :)
To understand what I need, assume the following SQL string:
select * from table1 inner join table2 on table1.id = table2.id
where field1 = 'sam' having table1.field3 > 0
group by table1.field4 order by table1.field5
I created a regular expression to group the parts accordingly:
select\s+(?<fields>.+)\s+from\s+(?<from>.+)\s+where\s+(?<where>.+)\s+having\s+(?<having>.+)\s+group\sby\s+(?<groupby>.+)\s+order\sby\s+(?<orderby>.+)
This gives me the following results:
fields => *
from => table1 inner join table2 on table1.id = table2.id
where => field1 = 'sam'
having => table1.field3 > 0
groupby => table1.field4
orderby => table1.field5
The problem I'm facing is that if any clause after the 'from' clause is missing from the SQL string, the regular expression doesn't match at all.
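For anyone who wants to reproduce the failure, here is a minimal sketch. It assumes Python's `re` module, which spells named groups `(?P<name>...)` rather than `(?<name>...)`; the pattern is otherwise the one above. The names `STRICT`, `full_sql`, and `no_having_sql` are made up for this illustration.

```python
import re

# The pattern from the question, with Python's (?P<name>...) syntax
# for named groups. Every clause is mandatory here.
STRICT = re.compile(
    r"select\s+(?P<fields>.+)\s+from\s+(?P<from>.+)"
    r"\s+where\s+(?P<where>.+)"
    r"\s+having\s+(?P<having>.+)"
    r"\s+group\sby\s+(?P<groupby>.+)"
    r"\s+order\sby\s+(?P<orderby>.+)"
)

# The full example statement, joined onto one line.
full_sql = (
    "select * from table1 inner join table2 on table1.id = table2.id "
    "where field1 = 'sam' having table1.field3 > 0 "
    "group by table1.field4 order by table1.field5"
)

# The same statement with the 'having' clause left out.
no_having_sql = (
    "select * from table1 inner join table2 on table1.id = table2.id "
    "where field1 = 'sam' "
    "group by table1.field4 order by table1.field5"
)

full_match = STRICT.match(full_sql)
partial_match = STRICT.match(no_having_sql)

print(full_match.groupdict())  # all six parts are captured
print(partial_match)           # None: one missing clause breaks the whole match
```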
To fix that, I've tried putting each optional part in its own (...)? group, but that doesn't work: it simply puts all the optional parts (where, having, groupBy, and orderBy) into the 'from' group.
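The exact pattern of that attempt isn't shown, so the following is only one plausible form of it, again sketched in Python with `(?P<name>...)` groups. The name `OPTIONAL` and the non-capturing `(?:...)?` wrappers are assumptions made for this illustration. It reproduces the behaviour described: the greedy `.+` in the 'from' group runs to the end of the string, and every optional group is then free to match nothing.

```python
import re

# Hypothetical version of the attempt: each clause after 'from'
# is wrapped in an optional non-capturing group.
OPTIONAL = re.compile(
    r"select\s+(?P<fields>.+)\s+from\s+(?P<from>.+)"
    r"(?:\s+where\s+(?P<where>.+))?"
    r"(?:\s+having\s+(?P<having>.+))?"
    r"(?:\s+group\sby\s+(?P<groupby>.+))?"
    r"(?:\s+order\sby\s+(?P<orderby>.+))?"
)

full_sql = (
    "select * from table1 inner join table2 on table1.id = table2.id "
    "where field1 = 'sam' having table1.field3 > 0 "
    "group by table1.field4 order by table1.field5"
)

match = OPTIONAL.match(full_sql)

# The 'from' group has swallowed every later clause...
print(match.group("from"))
# ...so the optional groups did not participate in the match at all.
print(match.group("where"), match.group("having"),
      match.group("groupby"), match.group("orderby"))
```

Since the optional groups are allowed to match empty, the regex engine never has a reason to backtrack out of the greedy 'from' capture.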
Any ideas?