I am working with Apple's ScriptingBridge framework, and have generated a header file for iTunes that contains several enums like this:

typedef enum {
    iTunesESrcLibrary = 'kLib',
    iTunesESrcIPod = 'kPod',
    iTunesESrcAudioCD = 'kACD',
    iTunesESrcMP3CD = 'kMCD',
    iTunesESrcDevice = 'kDev',
    iTunesESrcRadioTuner = 'kTun',
    iTunesESrcSharedLibrary = 'kShd',
    iTunesESrcUnknown = 'kUnk'
} iTunesESrc;

My understanding was that enum values had to be integer-like, but this definition seems to violate that rule. Furthermore, it seems as though treating these enum values as integers (in an NSPredicate, for example) doesn't do the right thing.

I added the enum declaration above to a C file with an empty main function, and it compiled using i686-apple-darwin9-gcc-4.0.1. So, while these kinds of enums may not conform to the C standard (as Parappa points out below), they are at least being compiled to some type by gcc.

So, what is that type, and how can I use it, for instance, in a format string?

A: 

In C, enums do have to be integer-like. You also can't declare strings with single quotes. My guess is that your code sample isn't C. :)

Parappa
That's a good point; I couldn't find anything about these sorts of enums in the C standards. But I modified the question to mention that this code does compile as C with i686-apple-darwin9-gcc-4.0.1, and it's the type that gcc's generating that I'm interested in learning more about.
Evan DiBiase
Actually, single quotes will compile to a char array, and the result will return a pointer to the char array, something that is pretty much an int (give or take a few bits depending on architecture). Crappy code though.
Bill K
Single quotes indicate characters (possibly in groups) rather than strings. It *is* valid C.
Steve Fallows
@BillK - No, those will be character constants (not arrays or pointers), at least in an integer context. Nothing crappy about it.
Steve Fallows
Too much guessing and speaking with authority from a position of ignorance. This technique is old-school C; I'd seen it used before the Mac.
Andy Dent
If it takes ANY guessing, it's a crappy technique. Document it or use something simpler. Period. Waste of time otherwise.
Bill K
Just to clarify here, "int foo = 'asdf';" is valid C. That is what I did not understand, and that is why the code sample in this question is legal. It's called an "integer character constant".
Parappa
+6  A: 

The single quotes indicate characters, rather than strings, in C. So each of the enums will have a 32-bit value consisting of the character codes for the four characters. The actual value will depend on the character encoding, but I am assuming 8-bit characters. Note there is no appended \0.

You can use the enums for normal comparison/assignment purposes. As with any enum, the underlying type is an integer.

I've used this technique in embedded systems many times to create 4 character 'names' that were human readable in hex dump/debugger contexts.
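
As an illustration, here is a minimal sketch (mine, not from the answer) showing ordinary comparison and a hex dump of one of these constants. The packed value is implementation-defined; gcc typically puts the first character in the high-order byte and warns about the multi-character constant.

#include <stdio.h>

typedef enum {
    iTunesESrcLibrary = 'kLib',
    iTunesESrcIPod    = 'kPod'
} iTunesESrc;

int main(void)
{
    iTunesESrc src = iTunesESrcLibrary;

    /* Ordinary integer comparison and assignment, as with any enum. */
    if (src == iTunesESrcLibrary)
        printf("library source\n");

    /* Dump the value in hex: with gcc this prints 0x6b4c6962, whose four
       bytes are the character codes for 'k', 'L', 'i', 'b' - which is what
       makes these constants readable in a hex dump or debugger. */
    printf("src = 0x%08x\n", (unsigned int)src);

    return 0;
}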

Steve Fallows
A: 

That is an Apple extension to C; it basically translates those enums to:

typedef enum {
    iTunesESrcLibrary = 'k' << 24 | 'L' << 16 | 'i' << 8 | 'b',
    ...
} iTunesESrc;

EDIT: Sorry, apparently it's valid C. I've only seen them in Mac code, so wrongly assumed that it was Apple-specific.

codelogic
no, it's valid ISO-C!
Christoph
It's not merely an Apple extension. I've used it in several C compilers, and the ARM (sec. 2.5.2) indicates it's valid C++. I don't have a C spec handy. But you're right about the value interpretation, typically; officially it's implementation-dependent.
Steve Fallows
Thanks, I wasn't aware; gcc on Linux does warn about it, though.
codelogic
@codelogic: I think the warning is due to the fact that the actual integer value isn't guaranteed to be consistent between compilers/systems. As long as you don't depend on the fact that 'abc' might have the value 6382179, you are safe as far as I know.
Christoph
The warning might be because the value is implementation-dependent, and hence non-portable.
Steve Fallows
Yup, that's right. It's documented in the "-Wno-multichar" option in gcc.
codelogic
It is very definitely BYTE ORDER dependent: there are bugs still floating around on OS X because of QuickTime constants.
Andy Dent
Why is this voted down? Just because codelogic doesn't mention the dependence on endianness?
Christoph
+9  A: 

C99, TC3 reads:

6.4.4.4 §2:

An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'. [...]

6.4.4.4 §10:

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

In most implementations, it's safe to use integer character constants of up to 4 one-byte characters. The actual value might differ between different systems (endianness?) though.


This is actually already defined in the ANSI-C89 standard, section 3.1.3.4:

An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x' or 'ab'. [...]

An integer character constant has type int. The value of an integer character constant containing a single character that maps into a member of the basic execution character set is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character, or containing a character or escape sequence not represented in the basic execution character set, is implementation-defined. In particular, in an implementation in which type char has the same range of values as signed char, the high-order bit position of a single-character integer character constant is treated as a sign bit.
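
To see these rules in action, here is a small sketch (mine, not part of the quoted standard). The sizes are fixed by the standard; the value printed for the multi-character constant is implementation-defined, and gcc warns about it unless -Wno-multichar is given.

#include <stdio.h>

int main(void)
{
    /* 6.4.4.4 §10: an integer character constant has type int. */
    printf("sizeof 'a'  = %zu (same as sizeof(int))\n", sizeof 'a');
    printf("sizeof 'ab' = %zu\n", sizeof 'ab');

    /* The value of a single-character constant is well defined... */
    printf("'a'  = %d\n", 'a');

    /* ...but the value of a multi-character constant is
       implementation-defined; gcc happens to produce 0x6162 here. */
    printf("'ab' = %#x\n", (unsigned)'ab');

    return 0;
}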

Christoph
Good - something official. :)
Steve Fallows
This was a common, but non-portable, non-standard extension before C99, too.
dmckee
@dmckee: multi-byte character constants were already part of ANSI-C89 (see section 3.1.3.4), so it has always been standard!
Christoph
Um. Ah. Learn something every day. Yeah. Mea culpa.
dmckee
+2  A: 

As already stated, those are integers declared using character constants.

When an integer is declared using a character constant of more than one character, it is sensitive to the byte order of the machine for which the constant was developed. As all the original Mac APIs were developed on PowerPC or earlier machines, the codes are backwards with respect to Intel little-endian machines.

If you are only building for Intel you can just reverse the order by hand.

If you are building a Universal binary you need to use a flipping function such as CFSwapInt32BigToHost.

Failure to correct those codes will leave you with code that only works on PowerPC machines, despite the lack of compiler errors.
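
For illustration, a minimal sketch of the kind of flip described above, assuming a four-char code arrives as raw big-endian bytes (from a file, say). CFSwapInt32BigToHost is declared in CoreFoundation's CFByteOrder.h (link against the CoreFoundation framework) and is a no-op on PowerPC, so the same code works in a Universal binary.

#include <CoreFoundation/CFByteOrder.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Four bytes as written by a big-endian (PowerPC-era) program:
       the character codes for 'k', 'L', 'i', 'b'. */
    const unsigned char raw[4] = { 'k', 'L', 'i', 'b' };

    uint32_t bigEndianValue;
    memcpy(&bigEndianValue, raw, sizeof bigEndianValue);

    /* Swap to host byte order: a no-op on PowerPC, a byte swap on Intel,
       so the comparison below behaves the same in a Universal binary. */
    uint32_t code = CFSwapInt32BigToHost(bigEndianValue);

    if (code == 'kLib')
        printf("matched iTunesESrcLibrary\n");

    return 0;
}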

Andy Dent
A: 

I've been looking for an explanation of how this works for a while. I saw it used in a game engine I'm developing on, which uses four-character codes to map keyboard input to enumerated actions specified in code. It is really sweet to deal with: to define input actions, you just create the enum with four-character constants. The engine takes care of converting the configuration file's four-character string to an integer for simple, easy-to-use input mapping. The only drawback I can think of is the 4-character limit, but this definitely beats the pants off of any other way of converting an enum to and from a string, such as corresponding ordered string arrays and obfuscated macros, if 4 characters are enough.
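
A rough sketch of that kind of config-to-enum mapping (the action names and the packing helper here are hypothetical, not from the engine in question; the helper mirrors how gcc happens to pack multi-character constants, which is implementation-defined in general):

#include <stdint.h>
#include <stdio.h>

/* Hypothetical input actions, defined with four-character constants. */
typedef enum {
    ActionJump = 'JUMP',
    ActionFire = 'FIRE'
} InputAction;

/* Hypothetical helper: pack the first four characters of a string the same
   way gcc packs a multi-character constant (first character in the
   high-order byte). */
static uint32_t packFourCC(const char *s)
{
    return ((uint32_t)(unsigned char)s[0] << 24) |
           ((uint32_t)(unsigned char)s[1] << 16) |
           ((uint32_t)(unsigned char)s[2] <<  8) |
            (uint32_t)(unsigned char)s[3];
}

int main(void)
{
    /* A token as it might be read from a configuration file. */
    const char *configToken = "JUMP";

    if (packFourCC(configToken) == ActionJump)
        printf("mapped \"%s\" to ActionJump\n", configToken);

    return 0;
}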