I've been wondering about how hard it would be to write some Python code to search a string for the index of a substring of the form ${
expr}
, for example, where expr is meant to be a Python expression or something resembling one. Given such a thing, one could easily imagine going on to check the expression's syntax with compile()
, evaluate it against a particular scope with eval()
, and perhaps even substitute the result into the original string. People must do very similar things all the time.
I could imagine solving such a problem using a third-party parser generator [oof], or by hand-coding some sort of state machine [eek], or perhaps by convincing Python's own parser to do the heavy lifting somehow [hmm]. Maybe there's a third-party templating library somewhere that can be made to do exactly this. Maybe restricting the syntax of expr is likely to be a worthwhile compromise in terms of simplicity or execution time or cutting down on external dependencies -- for example, maybe all I really need is something that matches any expr that has balanced curly braces.
What's your sense?
Update:
Thanks very much for your responses so far! Looking back at what I wrote yesterday, I'm not sure I was sufficiently clear about what I'm asking. Template substitution is indeed an interesting problem, and probably much more useful to many more people than the expression extraction subproblem I'm wondering about, but I brought it up only as a simple example of how the answer to my question might be useful in real life. Some other potential applications might include passing the extracted expressions to a syntax highlighter; passing the result to a real Python parser and looking at or monkeying with the parse tree; or using the sequence of extracted expressions to build up a larger Python program, perhaps in conjunction with some information taken from the surrounding text.
The ${
expr}
syntax I mentioned is also intended as an example, and in fact I wonder if I shouldn't have used $(
expr)
as my example instead, because it makes the potential drawbacks of the obvious approach, along the lines of re.finditer(r'$\{([^}]+)\}', s)
, a bit easier to see. Python expressions can (and often do) contain the )
(or }
) character. It seems possible that handling any of those cases might be much more trouble than it's worth, but I'm not convinced of that yet. Please feel free to try to make this case!
Prior to posting this question, I spent quite a bit of time looking at Python template engines hoping that one might expose the sort of low-level functionality I'm asking about -- namely, something that can find expressions in a variety of contexts and tell me where they are rather than being limited to finding expressions embedded using a single hard-coded syntax, always evaluating them, and always substituting the results back into the original string. I haven't figured out how to use any of them to solve my problem yet, but I do very much appreciate the suggestions regarding more to look at (can't believe I missed that wonderful list on the wiki!). The API documentation for these things tends to be pretty high-level, and I'm not too familiar with the internals of any of them, so I'm sure I could use help looking at those and figuring out how to get them to do this sort of thing.
Thanks for your patience!