I have the following Python code using Genshi (simplified):
with open(pathToHTMLFile, 'r') as f:
template = MarkupTemplate(f.read())
finalPage = template.generate().render('html', doctype = 'html')
The source HTML file contains entities such as ©
, ™
and ®
. Genshi replaces these with their UTF-8 character, which causes problems with the viewer (the output is used as a stand-alone file, not a response to a web request) that eventually sees the resulting HTML. Is there any way to prevent Genshi from parsing these entities? The more common ones like &
are passed through just fine.