This seems like the hardest problem I have had yet, but maybe I am making it harder than it needs to be. I need to remove an unknown number of nested span elements that may or may not appear at the beginning of a sentence. Together, the spans contain one or more words wrapped in parentheses. So in the sentence:
(cryptography, slang) An internet firewall.
the "(cryptography, slang)" part looks like this in the HTML:
<span class="ib-brac"><span class="qualifier-brac">(</span></span><span class="ib-content"><span class="qualifier-content">cryptography<span class="ib-comma"><span class="qualifier-comma">,</span></span> <a href="/wiki/Appendix:Glossary#slang" title="Appendix:Glossary">slang</a></span></span><span class="ib-brac"><span class="qualifier-brac">)</span></span>
I was thinking a good solution would be to use a regex and Nokogiri to check whether the opening '(' exists and, if it does, remove all the spans until the closing ')' is reached, but I have no idea how to do this. The solution I am using now always removes the first five spans, so it does not account for a variable number of spans:
if definition.inner_html =~ /^<span class/
  definition.search("span")[0..4].each do |span|
    span.remove
  end
end