File1:
<a>hello</b> <c>foo</d>
<a>world</b> <c>bar</d>
is an example of the file this would work on. How can one remove all strings that match <c>*</d> using sed?
The following command removes all text from <c> to </d>, inclusive:
sed -e 's/<c>.*<\/d>//'
The part inside s/...// is a regular expression, not a shell-style wildcard, so anything that is valid in a regular expression can go there. Note that .* is greedy: on a line with several <c>...</d> pairs, it matches from the first <c> to the last </d>.
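To illustrate the difference between a greedy match and a narrower one, here is a small sketch. The input line with two <c>...</d> pairs is made up for the demonstration and is not from the question's file. The pattern [^<]* assumes the text between <c> and </d> never contains a '<' character.

```shell
# Hypothetical input line with two <c>...</d> pairs (not from the original file).
line='<c>x</d> keep <c>y</d>'

# Greedy: .* runs from the first <c> to the last </d>, so "keep" is removed too.
greedy=$(printf '%s\n' "$line" | sed -e 's/<c>.*<\/d>//')

# Narrow: [^<]* stops at the next '<', and the g flag repeats the
# substitution for every <c>...</d> pair on the line.
narrow=$(printf '%s\n' "$line" | sed -e 's/<c>[^<]*<\/d>//g')

printf 'greedy: [%s]\n' "$greedy"
printf 'narrow: [%s]\n' "$narrow"
```

With the two-line file from the question, both forms give the same result, since each line has only one <c>...</d> pair. In both cases the space in front of <c> is left behind.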
If all your data looks like the example, you can instead split each line on " <c>" and print only the first field:
# gawk 'BEGIN{FS=" <c>"}{print $1}' file
<a>hello</b>
<a>world</b>
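The field-splitting approach keeps only what comes before the first " <c>", so it also drops anything that follows </d>. A minimal sketch of that behaviour, using plain awk with -F rather than gawk with a BEGIN block, and a made-up input line with trailing text:

```shell
# Hypothetical input: text follows the closing </d> (not in the original file).
line='<a>hello</b> <c>foo</d> tail'

# Split on the literal separator " <c>" and keep the first field only.
out=$(printf '%s\n' "$line" | awk -F ' <c>' '{print $1}')

printf '[%s]\n' "$out"
```

If text after </d> has to survive, the sed substitution is the safer choice.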