I want to transform a line that looks like this:

any text #any text# ===#text#text#text#===#

into:

any text #any text# ===#texttexttext===#

As you can see above, I want to remove every # between the opening ===# and the closing ===#. There can be any number of # characters to remove.

Can I do this with sed?

+1  A: 

sed uses POSIX Basic Regular Expressions (BRE) by default, with a few GNU extensions in GNU sed. BRE lacks many features that "newer" regex engines have, such as lookaround, which would be very handy in solving this.

I'd say you'd have to first match ===#\(.\+\)===# (note that BRE uses backslashes to denote capturing groups, that GNU sed also accepts the backslashed quantifier \+, and that lazy quantifiers are not supported). Then remove any # found in the captured group (a literal search/replace would be enough), and then put the result back into the string. But I'm not a Unix guy, so I don't know if/how that could be done in sed.
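
One way the idea above might be sketched in sed: since sed cannot run a second substitution on just a captured group, this sketch swaps in a loop that deletes one # per pass until none is left between the two markers. The function name is hypothetical, and the sketch assumes each line holds at most one ===#...===# pair.

```shell
# Hypothetical helper name; reads lines on stdin.
# Assumption: at most one ===#...===# pair per line.
# Each pass removes the first "#" that follows the opening ===#
# and still has a closing ===# somewhere after it; "t again"
# repeats the substitution until it no longer matches.
remove_hashes_between_markers() {
  sed -e ':again' \
      -e 's/\(===#[^#]*\)#\(.*===#\)/\1\2/' \
      -e 't again'
}

printf '%s\n' 'any text #any text# ===#text#text#text#===#' |
  remove_hashes_between_markers
```

For the sample line this should print any text #any text# ===#texttexttext===#. Text before the opening ===# is never touched, because every match has to start at a literal ===#.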

Tim Pietzcker
That's exactly what I want to do.
picknick
+2  A: 

Give this a try:

sed 'h;s/[^=]*=*=#\(.*\)/\1/;s/\([^=]\)#/\1/g;x;s/\([^=]*=\+#\).*/\1/;G;s/\n//g' inputfile

It splits the line in two at the first "=#", then deletes all "#" that aren't preceded by an "=", then recombines the lines.
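
The same one-liner, written one command per line with comments, may make the split/delete/recombine steps easier to follow. The wrapper function name is hypothetical and it reads stdin rather than inputfile. Note that \+ is a GNU sed extension, so this assumes GNU sed.

```shell
# Hypothetical wrapper; the sed commands are the same as in the one-liner.
strip_inner_hashes() {
  sed '
    # h: save a copy of the whole line in the hold space
    h
    # keep only the text after the first run of "=" that ends in "=#"
    s/[^=]*=*=#\(.*\)/\1/
    # delete every "#" that directly follows a non-"=" character
    s/\([^=]\)#/\1/g
    # x: swap, so the original line is back in the pattern space
    x
    # keep only the text up to and including that first "=#"
    s/\([^=]*=\+#\).*/\1/
    # G: append a newline plus the cleaned tail, then drop the newline
    G
    s/\n//g'
}

printf '%s\n' 'any text #any text# ===#text#text#text#===#' | strip_inner_hashes
```

One limitation worth knowing: the delete step consumes the character in front of each #, so in a run such as ## only the first # is removed.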

Let me know if there are specific cases where it fails.

Edit:

This version, which is increasingly fragile, works for your new example as well as the original:

sed 'h;s/[^=]*=[^=]*=*=#\(.*\)$/\1/;s/\([^=]\)#/\1/g;x;s/\([^=]*=[^=]*=\+#\).*/\1/;G;s/\n//g' inputfile
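
A quick way to check this version against both sample lines from the thread, assuming GNU sed (for \+). The wrapper name is hypothetical and it reads stdin rather than inputfile.

```shell
# Hypothetical wrapper around the edited command.
strip_inner_hashes_v2() {
  sed 'h;s/[^=]*=[^=]*=*=#\(.*\)$/\1/;s/\([^=]\)#/\1/g;x;s/\([^=]*=[^=]*=\+#\).*/\1/;G;s/\n//g'
}

# First line: the original example. Second line: the example with an earlier "=#".
printf '%s\n' \
  'any text #any text# ===#text#text#text#===#' \
  'any text=#any text# ===#text#text#text#===#' | strip_inner_hashes_v2
```

Both lines should come out with the text before ===# unchanged and the # characters between the markers removed.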
Dennis Williamson
works great on my example. I realize now that it has to allow =# to occur before the ===#, like this:

any text=#any text# ===#text#text#text#===#

That now becomes:

any text=#any text ===#texttexttext===#

but it should become:

any text=#any text# ===#texttexttext===#

Is it possible to fix this?
picknick
@nimo9367: See my edit.
Dennis Williamson
works great! thank you very much
picknick