I want to use this regular expression in Python:
<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>
(from http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/)
import re

def removeHtmlTags(page):
    # XXXX is a placeholder for the regular expression above
    p = re.compile(r'XXXX')
    return p.sub('', page)
It seems that I cannot simply paste this regular expression in place of XXXX in the function above.
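For reference, here is a minimal sketch of one way the substitution might be made to work. It assumes the failure is a SyntaxError caused by the pattern containing both ' and ", which would end a single-quoted string literal early. A triple-quoted raw string avoids that without escaping anything. The sample input in the usage line is made up for illustration.

```python
import re

# Assumption: the pattern only fails because it mixes ' and ".
# A triple-quoted raw string lets both quote characters appear unescaped.
TAG_RE = re.compile(r'''<(?:"[^"]*"['"]*|'[^']*'['"]*|[^'">])+>''')

def removeHtmlTags(page):
    # Replace every match of the tag pattern with an empty string.
    return TAG_RE.sub('', page)

# Hypothetical sample input, including a '>' inside a quoted attribute value.
print(removeHtmlTags('<p>Hello <a href="x>y">world</a></p>'))  # prints: Hello world
```

Compiling the pattern once at module level also avoids recompiling it on every call, although the re module caches recently used patterns anyway.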