Validation issues aside, it's useful to be able to remove these characters (which don't display reliably anyway) without necessarily escaping anything else. To this end I added the following function to `lib/helpers.py`:
import re

__sgml_invalid = re.compile(r'[\x82-\x8c\x91-\x9c\x9f]', re.UNICODE)

def sgmlsafe(text):
    # Replacement for each invalid code point, keyed by its ordinal
    lookup = {
        130: "‚",  # Single Low-9 Quotation Mark
        131: "ƒ",  # Latin Small Letter F With Hook
        132: "„",  # Double Low-9 Quotation Mark
        133: "…",  # Horizontal Ellipsis
        134: "†",  # Dagger
        135: "‡",  # Double Dagger
        136: "ˆ",  # Modifier Letter Circumflex Accent
        137: "‰",  # Per Mille Sign
        138: "Š",  # Latin Capital Letter S With Caron
        139: "‹",  # Single Left-Pointing Angle Quotation Mark
        140: "Œ",  # Latin Capital Ligature OE
        145: "‘",  # Left Single Quotation Mark
        146: "’",  # Right Single Quotation Mark
        147: "“",  # Left Double Quotation Mark
        148: "”",  # Right Double Quotation Mark
        149: "•",  # Bullet
        150: "–",  # En Dash
        151: "—",  # Em Dash
        152: "˜",  # Small Tilde
        153: "™",  # Trade Mark Sign
        154: "š",  # Latin Small Letter S With Caron
        155: "›",  # Single Right-Pointing Angle Quotation Mark
        156: "œ",  # Latin Small Ligature OE
        159: "Ÿ",  # Latin Capital Letter Y With Diaeresis
    }
    # Swap every matched character for its entry in the table
    return __sgml_invalid.sub(lambda x: lookup[ord(x.group())], text)
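As a rough cross-check of the table above, the same substitutions can be derived from Python's `cp1252` codec instead of being typed out by hand, since each of these code points corresponds to a Windows-1252 byte. This is only a sketch and assumes Python 3; the names `cp1252_fix` and `_C1_RANGE` are made up for illustration and are not part of helpers.py.

```python
import re

# Same character ranges as the regex in sgmlsafe (hypothetical name).
_C1_RANGE = re.compile(r'[\x82-\x8c\x91-\x9c\x9f]')

def cp1252_fix(text):
    """Replace stray C1 code points with the character cp1252 assigns to that byte."""
    # ord() gives the code point (130..159); reinterpret it as a single
    # Windows-1252 byte and decode it to the intended character.
    return _C1_RANGE.sub(
        lambda m: bytes([ord(m.group())]).decode('cp1252'),
        text,
    )

sample = 'He said \x93hello\x94 \x96 twice\x85'
print(cp1252_fix(sample))  # He said “hello” – twice…
```

Text that contains none of the listed code points passes through unchanged, so the filter should be safe to apply to content that is already clean.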
And you can make this available as a filter by editing environment.py:
config['pylons.app_globals'].mako_lookup = TemplateLookup(
    ...
    imports=[....,'from appname.lib.helpers import sgmlsafe',...]
)
It should then be available to your templates:
${c.content|n,sgmlsafe}