Is there a way to add an alias for an encoding in Python? There are sites on the web that use the encoding 'windows-1251' but declare their charset as win-1251, so I would like win-1251 to be an alias for windows-1251.
+2
A:
>>> import encodings
>>> encodings.aliases.aliases['win_1251'] = 'cp1251'
>>> print '\xcc\xce\xd1K\xc2\xc0'.decode('win-1251')
MOCKBA
Personally, though, I would consider this monkey-patching and would use my own conversion table instead. But I can't give any good arguments for that position. :)
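For what it's worth, the conversion-table idea could look roughly like the sketch below. The names CHARSET_FIXES and decode_with_fixes are made up for illustration, and the table holds only the one label from the question; the byte-string literal assumes Python 2.6 or later.

```python
# Hypothetical table: nonstandard charset labels seen in the wild,
# mapped to names Python already recognises.
CHARSET_FIXES = {
    "win-1251": "windows-1251",
}

def decode_with_fixes(data, charset):
    """Decode bytes, translating known-bad charset labels first."""
    label = charset.strip().lower()
    # Fall back to the label itself when it is not in the table.
    return data.decode(CHARSET_FIXES.get(label, label))

text = decode_with_fixes(b"\xcc\xce\xd1K\xc2\xc0", "win-1251")
```

This leaves the codec registry untouched, at the cost of having to route every decode through the helper.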
Lennart Regebro
2009-06-30 15:04:20
Alex did provide a good argument for that position above. :-) I think the official way is too much work, and I would still simply provide my own conversion list, but that is not always feasible.
Lennart Regebro
2009-07-01 20:36:52
+3
A:
The encodings module is not well documented, so I'd instead use codecs, which is:
import codecs

def encalias(oldname, newname):
    old = codecs.lookup(oldname)
    new = codecs.CodecInfo(old.encode, old.decode,
                           streamreader=old.streamreader,
                           streamwriter=old.streamwriter,
                           incrementalencoder=old.incrementalencoder,
                           incrementaldecoder=old.incrementaldecoder,
                           name=newname)
    def searcher(aname):
        if aname == newname:
            return new
        else:
            return None
    codecs.register(searcher)
This is Python 2.6 -- the interface is different in earlier versions.
If you don't mind relying on a specific version's undocumented internals, @Lennart's aliasing approach is OK, too, of course - and indeed simpler than this;-). But I suspect (as he appears to) that this one is more maintainable.
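As a usage sketch of the same search-function idea on Python 3: register_alias and normalize below are hypothetical names, and the normalisation step is my assumption about how to cope with interpreters that hand the search function "win_1251" instead of "win-1251" (newer ones convert hyphens to underscores before calling it). For brevity it returns the target's own CodecInfo instead of building a renamed copy.

```python
import codecs

def register_alias(alias, target):
    """Make `alias` resolve to the same codec as `target` (hypothetical helper)."""
    info = codecs.lookup(target)

    def normalize(name):
        # Assumption: compare names in a form that treats hyphens, spaces
        # and underscores alike, so both old and new interpreters match.
        return name.lower().replace("-", "_").replace(" ", "_")

    wanted = normalize(alias)

    def searcher(name):
        # Search functions return a CodecInfo on a match, None otherwise.
        return info if normalize(name) == wanted else None

    codecs.register(searcher)

register_alias("win-1251", "windows-1251")
text = b"\xcc\xce\xd1K\xc2\xc0".decode("win-1251")
```

After the call, "win-1251" can be passed anywhere an encoding name is accepted, since the lookup goes through the codec registry.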
Alex Martelli
2009-06-30 15:24:13
Great point, Alex! Do not use a module that does not have great documentation.
Masi
2009-06-30 15:27:22