Hello.

Today I upgraded to Bazaar 2.0.2 and started receiving this message (I'm on Snow Leopard, by the way):

bzr: warning: unknown locale: UTF-8
  Could not determine what text encoding to use.
  This error usually means your Python interpreter
  doesn't support the locale set by $LANG (en_US.UTF-8)
  Continuing with ascii encoding.

This is very strange, since my LANG is actually empty. A similar thing happens when I try to tinker with the locale module:

Python 2.5.4 (r254:67916, Nov 30 2009, 14:09:22) 
[GCC 4.3.4] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/sbo/runtimes/lib/python2.5/locale.py", line 443, in getdefaultlocale
    return _parse_localename(localename)
  File "/Users/sbo/runtimes/lib/python2.5/locale.py", line 375, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: UTF-8

Exporting LANG does not help:

sbo@dhcp-045:~ $ export LANG=en_US.UTF-8
sbo@dhcp-045:~ $ bzr
bzr: warning: unknown locale: UTF-8
  Could not determine what text encoding to use.
  This error usually means your Python interpreter
  doesn't support the locale set by $LANG (en_US.UTF-8)
  Continuing with ascii encoding.

However, exporting LC_ALL as well solved the problem:

sbo@dhcp-045:~ $ export LANG=en_US.UTF-8
sbo@dhcp-045:~ $ export LC_ALL=en_US.UTF-8

Python 2.5.4 (r254:67916, Nov 30 2009, 14:09:22) 
[GCC 4.3.4] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'UTF8')

Could you please explain what's going on here, for better googlability?

+2  A: 

It's a Mac OS X problem. To see your locale settings, run locale in a terminal. locale -a should list all the locales defined on your system (the values you may use as the argument to LC_ALL).

Notice that LC_ALL and other LC_* variables take precedence over LANG when defined.
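
To make that precedence concrete, here is a minimal Python sketch. The function name effective_ctype_setting is made up for illustration, and the lookup order (LC_ALL, then LC_CTYPE, then LANG) is the POSIX rule for the character-type category, not code taken from bzr or from Python's locale module. It shows why exporting a valid LANG alone changes nothing while a stray LC_CTYPE is still set.

```python
def effective_ctype_setting(env):
    """Return (variable_name, value) for the first non-empty locale variable.

    Lookup order for the character-type category under POSIX:
    LC_ALL, then LC_CTYPE, then LANG.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty are both treated as "not defined"
            return name, value
    return None, None


# LANG is valid, but a stray LC_CTYPE still wins.
print(effective_ctype_setting({"LANG": "en_US.UTF-8", "LC_CTYPE": "UTF-8"}))

# Exporting LC_ALL overrides both of the others.
print(effective_ctype_setting(
    {"LANG": "en_US.UTF-8", "LC_CTYPE": "UTF-8", "LC_ALL": "en_US.UTF-8"}))
```

Passing os.environ instead of a hand-written dict reports which variable is in effect in the current shell.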

kaizer.se
More specifically, this is a problem with the environment, not just with Mac OS X. Linux and other UNIX clones are prone to the same problems if you customize your environment and inadvertently leave things out. Sometimes the problems appear immediately, and other times not until you really need things to work. Yet another symptom of having too many ways to do the same thing...
jathanism
@synack: This is something that has come up before with OS X, which is why *I don't* think it's because Stefano changed his environment.
kaizer.se
@kaizer.se: Why do you think this is a problem, and why with OS X? Having an LC_CTYPE or LC_ALL set explains the behavior seen by the OP and is working as documented. The example I gave fails in exactly the same way on a current Debian Linux system, except that the newer bash on that system actually warns you when exporting LC_CTYPE with the invalid value "UTF-8".
Ned Deily
Hey, I realize you can make this fail on POSIX. However, what matters is that it seems to come up repeatedly on OS X because of default configuration issues: http://stackoverflow.com/questions/1629699/locale-getlocale-problems-on-osx and http://article.gmane.org/gmane.comp.python.apple/16205
kaizer.se
It certainly can be confusing and there is the issue of the python.org pythons using the 10.4 SDK/libs where locale was kind of broken (something I forgot about when writing my original response). Things are cleaner starting with 10.5. Also, python 3.1 has some cleanup in the locale area, e.g. you won't see mac-roman anymore. The OP, though, is clearly not using a python.org python.
Ned Deily
+3  A: 

If there was no LANG environment variable set, chances are you had either an LC_CTYPE (the key variable) or LC_ALL (which overrides if set) environment variable set to UTF-8, which is not a valid OS X locale. It's easy enough to reproduce with the Apple-supplied /usr/bin/python or with a custom python, as in your case, that was built with the 10.6 SDK (probably also the 10.5 SDK). You won't be able to reproduce it that way with a python.org python; they are currently built with the 10.4 SDK where the locale APIs behave differently.

$ unset LANG
$ env | grep LC_
$ export LC_CTYPE="UTF-8"
$ /usr/bin/python  # Apple-supplied python
Python 2.6.1 (r261:67515, Jul  7 2009, 23:51:51) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale ; locale.getdefaultlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/locale.py", line 459, in getdefaultlocale
    return _parse_localename(localename)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/locale.py", line 391, in _parse_localename
    raise ValueError, 'unknown locale: %s' % localename
ValueError: unknown locale: UTF-8
^D
$ /usr/local/bin/python2.6   # python.org python
Python 2.6.4 (r264:75821M, Oct 27 2009, 19:48:32) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale ; locale.getdefaultlocale()
(None, 'mac-roman')
>>>
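
The ValueError in the first session comes from the step that splits a locale name into a language code and an encoding. The sketch below is a simplified illustration of that step, written for a current Python 3. The function name parse_locale_name is hypothetical, and the real locale module first normalizes the name through an alias table, which is skipped here.

```python
def parse_locale_name(name):
    """Split 'language_TERRITORY.encoding' into (language code, encoding).

    Simplified illustration only: no alias normalization, no '@modifier'
    handling. A bare 'UTF-8' has no language part, so it is rejected.
    """
    if name == "C":
        return None, None
    if "." in name:
        code, encoding = name.split(".", 1)
        return code, encoding
    raise ValueError("unknown locale: %s" % name)


print(parse_locale_name("en_US.UTF-8"))

try:
    parse_locale_name("UTF-8")
except ValueError as exc:
    print(exc)
```

Because the real module normalizes names before splitting them, the session in the question reports the encoding as 'UTF8' rather than 'UTF-8'.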

EDIT:

There may be another piece to the puzzle. A quick look at the bzr 2.0.1 I have installed indicates that the message you cite should only show up if locale.getpreferredencoding() raises a locale.Error. One way that can happen is if the python _locale.so C extension can't be loaded and that can happen if there are permission problems on it. For example, MacPorts currently is known to have problems setting permissions if you have a customized umask; I've been burned by that issue myself. Check the permissions of _locale.so in the python lib/python2.5/lib-dynload directory and ensure it is 755. The full path for MacPorts should be:

/opt/local/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/
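
Both checks can also be done from Python. This is a rough sketch, written for a current Python 3, with helper names (permission_bits, locale_extension_loads) invented for illustration. It only reports what it finds and does not change any permissions.

```python
import os
import stat


def permission_bits(path):
    """Return the permission bits of path as an octal string, e.g. '0o755'."""
    return oct(stat.S_IMODE(os.stat(path).st_mode))


def locale_extension_loads():
    """Report whether this interpreter can import the _locale C module."""
    try:
        import _locale  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


print(locale_extension_loads())
```

Calling permission_bits on the _locale.so file inside the lib-dynload directory above should give '0o755' if the permissions are correct.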
Ned Deily
The python I am using was installed by hand; it's not the standard python installation bundled with OS X. Also, it appears I have no _locale.so or _locale.dylib... uh?
Stefano Borini
There should be a _locale.so in lib-dynload. If not, your python was not built correctly on OS X and locale.py will fall back to some default behaviors.
Ned Deily