Okay, it turns out there seems to be none!
As Yuji pointed out, the underlying encoding of filenames is UTF-8, no matter what. Therefore, one needs to handle two scenarios:
- Arguments that are typed in, character for character, by the user.
- Arguments that are tab-completed or the output of commands like ls, as they do not convert any characters.
The second case is simply covered by the assumption of UTF-8.
The first case, however, is problematic:
- On Mac OS 10.6, $LANG contains the IANA name of the encoding in use, like de_DE.IANA_NAME.
- Prior to Snow Leopard, this is not the case for charsets other than UTF-8!
I didn't test each and every charset I could think of, but none of the European ones were included. Instead, $LANG contained only the language locale (de_DE in my case)!
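To make the two shapes of $LANG concrete, here is a minimal sketch in plain C (which also compiles as part of an Objective-C tool). It assumes the usual POSIX locale layout language[_territory][.codeset][@modifier]; lang_codeset is a hypothetical helper name, not an existing API.

```c
#include <stddef.h>
#include <string.h>

/* Copies the codeset part of a locale string such as "de_DE.UTF-8" into out.
   Returns 1 if a codeset was found, 0 if lang is NULL or names no codeset
   (for example a bare "de_DE"), or if the codeset does not fit into out. */
static int lang_codeset(const char *lang, char *out, size_t outsize)
{
    const char *start;
    const char *end;
    size_t len;

    if (lang == NULL || outsize == 0)
        return 0;

    start = strchr(lang, '.');
    if (start == NULL)
        return 0;               /* only a language locale, no encoding */
    start++;

    /* An optional "@modifier" may follow the codeset. */
    end = strchr(start, '@');
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= outsize)
        return 0;

    memcpy(out, start, len);
    out[len] = '\0';
    return 1;
}
```

You would typically call it with the result of getenv("LANG"), which may be NULL; the function treats that like a missing codeset.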
Since the results of calling +[NSString stringWithCString:encoding:] with an incorrect encoding are undefined, you cannot safely assume that it will return nil in that case* (if, e.g., the input is ASCII-only, it might work perfectly fine!).
What adds to the overall mess is that $LANG is not guaranteed to be around anyway: there's a checkbox in Terminal.app's preferences that lets a user not set $LANG at all (not to speak of X11.app, which doesn't seem to handle any non-ASCII input...).
So what's left:
1. Check whether $LANG is set. If it's not, go to step 4.
2. Check whether $LANG contains information on the encoding. If it doesn't, go to step 4.
3. Check whether the encoding you find there is UTF-8. If it is, go to step 6, else...
4. If argc is greater than 2 and [[NSString stringWithCString: argv[1] encoding: NSUTF8StringEncoding] isEqualToString: yourForceUTFArgumentFlag] (argv[0] is the program name, so the first real argument is argv[1]), print that you are forcing UTF-8 now and go to step 6. If not:
5. Assume you don't know anything, issue a warning that your user should set the Terminal encoding to UTF-8 and may consider passing yourForceUTFArgumentFlag as the first argument, and exit().
6. Assume UTF-8 and do what you have to...
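The procedure above can be sketched roughly as follows, again in plain C so it can be dropped into an Objective-C tool. Several things here are assumptions for illustration: the flag spelling --force-utf8 stands in for yourForceUTFArgumentFlag, the names decide_arg_encoding and enum arg_encoding are made up, and accepting both "UTF-8" and "UTF8" (case-insensitively) as the codeset is a guess at what terminals may set. The flag is read from argv[1], since argv[0] holds the program name.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

enum arg_encoding { ARGS_UTF8, ARGS_FORCED_UTF8, ARGS_UNKNOWN };

/* Hypothetical spelling of yourForceUTFArgumentFlag. */
#define FORCE_UTF8_FLAG "--force-utf8"

/* lang is the value of $LANG (NULL if unset); argc/argv come from main(). */
static enum arg_encoding decide_arg_encoding(const char *lang,
                                             int argc, char **argv)
{
    /* Is $LANG set, does it name a codeset, and is that codeset UTF-8? */
    if (lang != NULL) {
        const char *dot = strchr(lang, '.');
        if (dot != NULL) {
            const char *codeset = dot + 1;
            size_t len = strcspn(codeset, "@");   /* ignore "@modifier" */
            if ((len == 5 && strncasecmp(codeset, "UTF-8", 5) == 0) ||
                (len == 4 && strncasecmp(codeset, "UTF8", 4) == 0))
                return ARGS_UTF8;
        }
    }

    /* Otherwise: did the user force UTF-8 via the first real argument? */
    if (argc > 2 && strcmp(argv[1], FORCE_UTF8_FLAG) == 0)
        return ARGS_FORCED_UTF8;

    /* Nothing is known: the caller should print a warning and exit. */
    return ARGS_UNKNOWN;
}
```

A caller would pass getenv("LANG") as the first parameter, print the "forcing UTF-8" notice on ARGS_FORCED_UTF8, and warn and exit on ARGS_UNKNOWN. Taking the value as a parameter instead of reading the environment inside the function keeps the decision easy to test.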
Sounds shitty? That's because it is, but I can't think of any saner way of doing it.
* One further note, though: if you are using UTF-8 as the encoding, stringWithCString:encoding: returns nil whenever it encounters non-ASCII characters in a C string that is not encoded in UTF-8.
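To illustrate why mis-encoded non-ASCII input tends to be detectable under the UTF-8 assumption, here is a minimal sketch of a UTF-8 well-formedness check in plain C. It is not Foundation's implementation, and is_valid_utf8 is a hypothetical helper name. It only shows that a byte such as 0xFC (ü in ISO-8859-1) cannot stand on its own in valid UTF-8.

```c
#include <stddef.h>

/* Returns 1 if the NUL-terminated string s is well-formed UTF-8, else 0. */
static int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        int extra;            /* continuation bytes expected after the lead */
        unsigned long cp;     /* code point being decoded */
        unsigned long min;    /* smallest code point valid for this length */

        if (*p < 0x80) {      /* plain ASCII */
            p++;
            continue;
        } else if ((*p & 0xE0) == 0xC0) {
            extra = 1; cp = *p & 0x1F; min = 0x80;
        } else if ((*p & 0xF0) == 0xE0) {
            extra = 2; cp = *p & 0x0F; min = 0x800;
        } else if ((*p & 0xF8) == 0xF0) {
            extra = 3; cp = *p & 0x07; min = 0x10000;
        } else {
            return 0;         /* stray continuation byte or invalid lead byte */
        }
        p++;

        while (extra-- > 0) {
            /* A NUL here fails this test too, so we never read past the end. */
            if ((*p & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (cp < min)                      return 0;  /* overlong encoding */
        if (cp > 0x10FFFF)                 return 0;  /* beyond Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)  return 0;  /* UTF-16 surrogates */
    }
    return 1;
}
```

ASCII-only input passes this check whatever the terminal encoding was, which matches the observation that such arguments work fine in any case.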