This question has two parts: one for "single line match" and one for "multi line region matching". I also have a semi-working solution, but I am looking for something more robust and elegant.
- Single Line Match: I would like to duplicate each line of an input file so that the second copy is a regex modification of the first. E.g.:
File.txt
YY BANANA, YYZ, ABC YHZ YY1
YY APPLE , YYZ, ABC YHZ YY1
YY ORANGE, YYZ, ABC YHZ YY1
YZ GRAPE , YZZ, ABC YHZ YZ1
Would BECOME:
YY BANANA, YYZ, ABC YHZ YY1
XY BANANA, XYZ, ABC YHZ XY1
YY APPLE , YYZ, ABC YHZ YY1
XY APPLE , XYZ, ABC YHZ XY1
YY ORANGE, YYZ, ABC YHZ YY1
XY ORANGE, XYZ, ABC YHZ XY1
YZ GRAPE , YZZ, ABC YHZ YZ1
XZ GRAPE , XZZ, ABC YHZ XZ1
Keep in mind that the real file is large, and that the example of YY -> XY and YZ -> XZ is exactly correct. In other words, in my file YY, YH, YZ, Y1, Y2, Y3 are the symbols that I would like to change to XY, XH, XZ, X1, X2, X3.
I have done something in Perl that is very raw (I am linking to it as a starting point to show what I was thinking), but the script I wrote is neither elegant nor general, and it requires multiple passes over the file.
My raw stab, in Perl: http://www.quantprinciple.com/invest/index.php/docs/tipsandtricks/perl-sed-awk/conditional-duplicate/
Usage of my raw stab:
MatchDuplicate.pl INPUT.txt YY XY > INPUT2.txt
MatchDuplicate.pl INPUT2.txt YH XH > INPUT3.txt
MatchDuplicate.pl INPUT3.txt Y1 X1 > INPUT4.txt
MatchDuplicate.pl INPUT4.txt Y2 X2 > INPUT5.txt
INPUT5.txt is used...
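To make the single-pass goal concrete, here is a rough sketch in Python rather than Perl. It is not a port of the linked script. The names `REPLACEMENTS`, `build_pattern` and `duplicate_lines` are made up for illustration, the mapping holds only the YY and YZ pairs that reproduce the sample output above, and it assumes a line is duplicated only when at least one symbol actually changes.

```python
import re

# Assumed mapping, limited to the pairs visible in the sample output.
# Other symbols (Y1 -> X1, ...) can be added; note that adding YH -> XH
# would also rewrite the YHZ field, which the sample leaves unchanged.
REPLACEMENTS = {"YY": "XY", "YZ": "XZ"}


def build_pattern(replacements):
    """Compile one alternation regex covering every symbol in the mapping."""
    # Longest keys first, so a longer symbol wins over a shorter prefix of it.
    keys = sorted(replacements, key=len, reverse=True)
    return re.compile("|".join(re.escape(key) for key in keys))


def duplicate_lines(lines, replacements):
    """Yield each line, followed by its rewritten copy if anything changed."""
    pattern = build_pattern(replacements)
    for line in lines:
        yield line
        # subn returns the new string and the number of substitutions made.
        changed, count = pattern.subn(lambda m: replacements[m.group(0)], line)
        if count:
            yield changed


if __name__ == "__main__":
    sample = [
        "YY BANANA, YYZ, ABC YHZ YY1\n",
        "YZ GRAPE , YZZ, ABC YHZ YZ1\n",
    ]
    for out_line in duplicate_lines(sample, REPLACEMENTS):
        print(out_line, end="")
```

Because `duplicate_lines` is a generator, it can be fed an open file handle and will read one line at a time, so a large input does not need to fit in memory and all symbols are handled in the same pass.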
- Multi Line Match: Exactly the same as above, but each "record" of the input spans multiple lines:
File.txt
< some starting marker...startRecord:>
data
data
YY data
YY BANANA, YYZ, ABC YHZ YY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
YY data
YY APPLE , YYZ, ABC YHZ YY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
YY data
YY ORANGE, YYZ, ABC YHZ YY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
YZ data
YZ GRAPE , YZZ, ABC YHZ YZ1
<some ending record marker>
Would BECOME:
< some starting marker...startRecord:>
data
data
YY data
YY BANANA, YYZ, ABC YHZ YY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
XY data
XY BANANA, XYZ, ABC YHZ XY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
YY data
YY APPLE , YYZ, ABC YHZ YY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
XY data
XY APPLE , XYZ, ABC YHZ XY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
YY data
YY ORANGE, YYZ, ABC YHZ YY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
XY data
XY ORANGE, XYZ, ABC YHZ XY1
<some ending record marker>
< some starting marker...startRecord:>
data
data
YZ data
YZ GRAPE , YZZ, ABC YHZ YZ1
<some ending record marker>
< some starting marker...startRecord:>
data
data
XZ data
XZ GRAPE , XZZ, ABC YHZ XZ1
<some ending record marker>
My Raw Stab: http://www.quantprinciple.com/invest/index.php/docs/tipsandtricks/perl-sed-awk/multi-line-conditional-duplicate/
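For the multi-line case, the same idea can be sketched by buffering one record at a time. This is again Python and only an illustration. `START_MARKER` and `END_MARKER` are hypothetical stand-ins for the real markers (which are elided above), they are assumed to sit on separate lines, and a record is assumed to be duplicated only if some line in it changes.

```python
import re

# Assumed mapping, limited to the pairs visible in the sample output.
REPLACEMENTS = {"YY": "XY", "YZ": "XZ"}

# Hypothetical marker substrings; replace with the real record delimiters.
START_MARKER = "startRecord:"
END_MARKER = "ending record marker"


def duplicate_records(lines, replacements, start=START_MARKER, end=END_MARKER):
    """Yield each record, followed by a rewritten copy if anything changed."""
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))

    def swap(match):
        return replacements[match.group(0)]

    record = None  # None while outside a record, else the buffered lines
    for line in lines:
        if record is None:
            if start in line:
                record = [line]  # a new record begins
            else:
                yield line  # lines outside any record pass through
            continue
        record.append(line)
        if end in line:
            yield from record  # the original record first
            rewritten = [pattern.subn(swap, item) for item in record]
            if any(count for _, count in rewritten):
                yield from (text for text, _ in rewritten)
            record = None
    if record is not None:
        yield from record  # unterminated record: emit unchanged
```

Only one record is held in memory at a time, so this should still work on a large file as long as individual records are small.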