views:

134

answers:

6

Are there short Unicode u"\N{...}" names for Latin1 characters in Python ? \N{A umlaut} etc. would be nice,
\N{LATIN SMALL LETTER A WITH DIAERESIS} etc. is just too long to type every time.
(Added:) I use an English keyboard, but occasionally need German letters, as in "Löwenbräu Weißbier".
Yes one can cut-paste them singly, L cutpaste ö wenbr cutpaste ä ... but that breaks the flow; I was hoping for a keyboard-only way.

+2  A: 

Sorry, no, there's no such thing. In string literals, anyway... you could perhaps piggyback on another encoding scheme, such as HTML:

>>> import HTMLParser
>>> HTMLParser.HTMLParser().unescape(u'a ä b c')
u'a \xe4 b'

But I don't think this'd be worth it.

Hardly anyone even uses the \N notation in any case... for the occasional character the \xnn notation is acceptable; for more involved usage you're better off just typing ä directly and making sure a # coding= is defined in the script as per PEP263. (If you don't have a keyboard layout that can type those diacriticals directly, get one. eg. eurokb on Windows, or using the Compose key on Linux.)

bobince
agree, wouldn't be worth it -- was hoping for \N{auml}.Is there a "compose" on Macs ?
Denis
Not exactly like Compose but apparently you should be able to type `ä` with `Option`+`U` then `A`. http://tlt.its.psu.edu/suggestions/international/accents/codemac.html
bobince
A: 

Have you thought about writing your own converter? It wouldn't be hard to write something that would go through a file and replace \N{A umlaut} with \N{LATIN SMALL LETTER A WITH DIAERESIS} and all the rest.

Mark Ransom
+4  A: 

If you want to do the right thing please use UTF-8 in your python source code. This will keep the code much more readable.

Python is able to real UTF-8 source files, all you have to do is to add an additional line after the first one:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

By the way, starting with Python 3.0 UTF-8 is the default encoding so you will not need this line anymore. See PEP3120

Sorin Sbarnea
ok, but (clarification added) I was hoping for \N{aumlaut} or the like, fast to type and clear too
Denis
+1  A: 

You can put an actual "ä" character in your string. For this you have to declare the encoding of the source code at the top

#!/usr/bin/env python
# encoding: utf-8

x = u"ä"
Roberto Bonvallet
A: 

You can use the Unicode notation \uXXXX do describe that character:

u"\u00E4"
Gumbo
A: 

On Windows, you can use the charmap.exe utility to look up the keyboard shortcut for common letters you're using such as:

ALT-0223 = ß
ALT-0228 = ä
ALT-0246 = ö

Then use Unicode and save in UTF-8:

# -*- coding: UTF-8 -*-
phrase = u'Löwenbräu Weißbier'

or use a converter as someone else mentioned and make up your own shortcuts:

# -*- coding: UTF-8 -*-

def german(s):
    s = s.replace(u'SS',u'ß')
    s = s.replace(u'a:',u'ä')
    s = s.replace(u'o:',u'ö')
    return s

phrase = german(u'Lo:wenbra:u WeiSSbier')
print phrase
Mark Tolonen