I have a string in a file a.txt

{moslate}alho{/moslate}otra{moslate}a{/moslate}

I need to get the string otra using sed.

With this regex

sed 's|{moslate}.*{/moslate}||g' a.txt

I get no output at all, but when I add a ? to the regex

sed 's|{moslate}.*?{/moslate}||g' a.txt

(I've read somewhere that this makes the regex non-greedy), I get no match at all; that is, I get the following output:

{moslate}alho{/moslate}otra{moslate}a{/moslate}

How can I get the required output using sed?

+3  A: 

If you know that the string between moslates will not contain curly braces, you could do this:

sed 's/{moslate}[^{}]*{\/moslate}//g'
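
As a quick check, here is a minimal sketch that runs the command above on the sample line from the question. The variable names `input` and `result` are only for illustration, and the line is fed in through printf instead of being read from a.txt.

```shell
# Sample line from the question, supplied inline instead of reading a.txt.
input='{moslate}alho{/moslate}otra{moslate}a{/moslate}'

# [^{}]* cannot cross a curly brace, so each match stops at the
# nearest {/moslate} instead of running on to the last one.
result=$(printf '%s\n' "$input" | sed 's/{moslate}[^{}]*{\/moslate}//g')

printf '%s\n' "$result"   # prints: otra
```

The slash in the closing tag has to be escaped here only because / is also the delimiter of the s command.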
Rob Davis
+3  A: 

sed doesn't support non-greedy matching, so you'll need to make the '.*' term less greedy by making it pickier about what it will accept. I don't have a corpus of the kind of things you're looking for, but I'm going to assume that you don't want to find anything with embedded curly brackets. If so, then you could use:

sed 's|{moslate}[^{]*{/moslate}||g' a.txt

which will work in the case you give, but will fail if these things nest.
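
To make the difference concrete, the following sketch compares the two patterns from the question with the pickier one, and then shows the nesting failure. It assumes sed's default basic regular expressions, where an unescaped ? is just a literal question mark; the variable names and the nested sample line are made up for illustration.

```shell
input='{moslate}alho{/moslate}otra{moslate}a{/moslate}'

# Greedy: .* runs from the first {moslate} to the last {/moslate},
# so the whole line is deleted.
greedy=$(printf '%s\n' "$input" | sed 's|{moslate}.*{/moslate}||g')

# In a basic regular expression ? is a literal question mark, so the
# pattern needs a "?" before {/moslate}; nothing matches and the line
# comes back unchanged.
literal_q=$(printf '%s\n' "$input" | sed 's|{moslate}.*?{/moslate}||g')

# Pickier: [^{]* stops at the next opening brace.
picky=$(printf '%s\n' "$input" | sed 's|{moslate}[^{]*{/moslate}||g')

# Hypothetical nested input: only the inner pair is removed.
nested='{moslate}a{moslate}b{/moslate}c{/moslate}x'
nested_out=$(printf '%s\n' "$nested" | sed 's|{moslate}[^{]*{/moslate}||g')

printf 'greedy:    [%s]\n' "$greedy"      # prints: greedy:    []
printf 'literal ?: [%s]\n' "$literal_q"   # prints the input line unchanged
printf 'picky:     [%s]\n' "$picky"       # prints: picky:     [otra]
printf 'nested:    [%s]\n' "$nested_out"  # prints: nested:    [{moslate}ac{/moslate}x]
```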

swestrup
+1  A: 

"need to get" - Based on the context, it would seem that by "get" you mean "remove". However, I would normally interpret "get" to mean "retrieve" or "keep". What your sed command says is "delete everything". What would your desired output look like?

Assuming that you mean "retrieve" or "keep", try this:

sed -n 's|.*{/moslate}\([^{]*\){moslate}.*|\1|p' a.txt

which will retrieve "otra" or whatever is in the position that "otra" occupies in that string (i.e. between two sets of "moslate" tags).

The resulting output:

otra
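
A small sketch of how this behaves on other lines may help. The extra sample lines (`plain`, `three`) are invented for illustration. With -n and the p flag, a line is printed only when the substitution succeeds, and because the leading .* is greedy, a line with several gaps yields only the last one.

```shell
tagged='{moslate}alho{/moslate}otra{moslate}a{/moslate}'
plain='no tags here'
three='{moslate}a{/moslate}X{moslate}b{/moslate}Y{moslate}c{/moslate}'

# The untagged line never matches, so -n suppresses it entirely.
kept=$(printf '%s\n%s\n' "$tagged" "$plain" |
  sed -n 's|.*{/moslate}\([^{]*\){moslate}.*|\1|p')

# With two gaps (X and Y), the greedy leading .* pushes the match to
# the last gap, so only Y is captured.
last_gap=$(printf '%s\n' "$three" |
  sed -n 's|.*{/moslate}\([^{]*\){moslate}.*|\1|p')

printf '%s\n' "$kept"       # prints: otra
printf '%s\n' "$last_gap"   # prints: Y
```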

If you want to remove "otra":

sed 's/otra//' a.txt

Output:

{moslate}alho{/moslate}{moslate}a{/moslate}

If you want to remove whatever is in the position that "otra" occupies in that string (i.e. between two sets of "moslate" tags):

sed -n 's|\(.*{/moslate}\)[^{]*\({moslate}.*\)|\1\2|p' a.txt

Output:

{moslate}alho{/moslate}{moslate}a{/moslate}
Dennis Williamson