If the source of the SQL statement strings are other coders, you could simply insist that the parts that need changing are simply marked by special escape conventions, e.g., write $TABLE instead of the table name, or $TABLEPREFIX where one is needed. Then finding the places that need patching can be accomplished with a substring search and replacement.
If you really have arbitrary SQL strings and cannot get them nicely marked, you need to somehow parse the SQL string as you have observed. The XML solution certainly is one possible way.
Another way is to use a program transformation system. Such a tool can parse a string for a language instance, build ASTs, carry out analysis and transformation on ASTs, and then spit a revised string.
The DMS Software Reengineering Toolkit is such a system. It has PLSQL front end parser. And it can use pattern-directed transformations to accomplish the rewrites you appear to need. For your example involving select items:
domain PLSQL.
rule use_explicit_column(e: expression):select_item -> select_item
"\e" -> "\e \column\(\e\)".
To read the rule, you need to understand that the stuff inside quote marks represents abstract trees in some computer langauge which we want to manipulate. What the "domain PLSQL" phrase says is, "use the PLSQL parser" to process the quoted string content, which is how it knows. (DMS has lots of langauge parsers to choose from). The terms
"expression" and "select_item" are grammatical constructs from the language of interest, e.g., PLSQL in this case. See the railroad diagrams in your PLSQL reference manual.
The backslash represents escape/meta information rather than target langauge syntax.
What the rule says is, transform those parsed elements which are select_items
that are composed solely of an expression \e, by converting it into a select_item consisting of the same expression \e and the corresponding column ( \column(\e) ) presumably based on position in the select item list for the specific table. You'd have to implement a column function that can determine the corresponding name from the position of the select item. In this example, I've chosen to define the column function to accept the expression of interest as argument; the expression is actually passed as the matched tree, and thus the column function can determine where it is in the select_items list by walking up the abstract syntax tree.
This rule handles just the select items. You'd add more rules to handle the other various cases of interest to you.
What the transformation system does for you is:
- parse the language fragment of interest
- build an AST
- let you pattern match for places of interest (by doing AST pattern matching)
but using the surface syntax of the target langauge
- replace matched patterns by other patterns
- compute aritrary replacements (as ASTs)
- regenerate source text from the modified ASTs.
While writing the rules isn't always trivial, it is what is necessary if your problem
is stated as posed.
The XML suggested solution is another way to build such ASTs. It doesn't have the nice pattern matching properties although you may be able to get a lot out of XSLT. What I don't know is if the XML has the parse tree in complete detail; the DMS parser does provide this by design as it is needed if you want to do arbitrary analysis and transformation.