Hello, I've been searching for a solution to this problem for quite some time, but I can't figure it out on my own.
I have a bunch of HTML code blocks, and I want to search for a specific string contained in one of the inner tags. If there's a match, I want to return a value from its parent tag (the rel attribute in the example below). Here's an example:
<li rel="Returns this value">
<some other tags and elements here />
<a class="link"><span>This match</span></a>
</li>
Searching for the string "This match" should return "Returns this value".
Is this possible in awk? If not, what is the easiest way to accomplish this? I don't mind any solution, but awk or a similar command-line tool would be preferred. I'm running an Ubuntu server with root access, so if needed I could rely on other languages such as Ruby, Python, Perl, or PHP.
So far I've been able to search for the string between the span tags and return its contents. That could be done much more easily with a simple sed command, so it isn't much use yet. Still, it may be a useful starting point that can be improved to do what I need, so here goes:
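awk 'BEGIN { RS = ""; FS = "</span>" }  # blank-line-separated records, split at closing span tags
# Only look at records that mention li
/li/ {
    for (i = 1; i <= NF; i++) {
        if ($i ~ /span/) {
            # Strip everything up to and including the last "span>" in this field
            gsub(/.*span>/, "", $i)
            print $i
        }
    }
}'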
When run on the example above, it returns "This match".
Thanks a lot for any suggestions.
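
For reference, here is a minimal sketch of one way the awk attempt might be extended to print the rel value instead of the span contents. It assumes each tag sits on its own line, as in the example, and that the wanted value is always the rel attribute of the enclosing <li>. The names find_rel, needle, and sample.html are made up for illustration. For arbitrary real-world HTML, a proper HTML parser would likely be more robust than line matching.

```shell
#!/bin/sh
# Hypothetical sample input, copied from the example in the question.
cat > sample.html <<'EOF'
<li rel="Returns this value">
<some other tags and elements here />
<a class="link"><span>This match</span></a>
</li>
EOF

# find_rel is a hypothetical helper: find_rel NEEDLE FILE
# Assumes one tag per line and a rel attribute on the opening <li>.
find_rel() {
    awk -v needle="$1" '
        # Remember the rel attribute of the most recently opened <li>.
        /<li[ \t]/ && match($0, /rel="[^"]*"/) {
            rel = substr($0, RSTART + 5, RLENGTH - 6)
        }
        # Plain substring test (index), so the needle is not treated as a regex.
        index($0, "<span>" needle "</span>") && rel != "" { print rel }
        # Forget the value once the <li> block closes.
        /<\/li>/ { rel = "" }
    ' "$2"
}

find_rel "This match" sample.html   # prints: Returns this value
```

The script keeps the last seen rel value in a variable and only prints it when the exact span text appears before the closing </li>, so a search string that does not occur produces no output.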