Hello,
I'm looking for a regular expression that will replace strings in an input source code with some constant string value such as "string", and that will also take into account escaping the string-start character that is denoted by a double string-start character (e.g. "he said ""hello""").
To clarify, I will provide some examples of input and expected output:
input: print("hello world, how are you?")
output: print("string")
input: print("hello" + "world")
output: print("string" + "string")
# here's the tricky part:
input: print("He told her ""how you doin?"", and she said ""I'm fine, thanks""")
output: print("string")
I'm working in Python, but I guess this is language agnostic.
EDIT: According to one of the answers, this requirement may not be fit for a regular expression. I'm not sure that's true but I'm not an expert. If I try to phrase my requirement with words, what I'm looking for is to find sets of characters that are between double quotes, wherein even groups of adjacent double quotes should be disregarded, and that sounds to me like it can be figured by a DFA.
Thanks.