char arr[]= "\xeb\x2a";

BTW, are these the same:

"\xeb\x2a" vs '\xeb\x2a'

+1  A: 

It's an escape sequence indicating that the digits that follow are a hexadecimal character code.

http://www.austincc.edu/rickster/COSC1320/handouts/escchar.htm

badcodenotreat
It's probably best to actually provide the full answer here, rather than linking the OP to a place where they can find it - and that link doesn't have any more explanation than "hexadecimal escape character".
Jefromi
+1  A: 

You could have googled it. Useful website here.

And I quote:

x Unsigned hexadecimal integer

That way, your \xeb is 235 in decimal.

SeargX
+6  A: 

\x indicates a hexadecimal character escape. It's used to specify characters that aren't typeable (like a null '\x00').

And "\xeb\x2a" is a string literal (type is char[3]: two bytes plus the terminating null, and it decays to char * in most expressions), whereas '\xeb\x2a' is a character constant (type is int, not null-terminated; its value is implementation-defined, and with compilers such as gcc it is just another way to write 0xEB2A or 60202 or 0165452). Not the same :)
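
To make the size difference concrete, here is a minimal sketch. The function names are made up for illustration, and most compilers will emit a multi-character constant warning for the second function.

```c
#include <stddef.h>

/* Size of the string literal: two escaped bytes plus the terminating '\0'. */
size_t string_literal_size(void)
{
    return sizeof "\xeb\x2a";
}

/* Size of the character constant: in C it has type int, however many
   characters appear between the single quotes. */
size_t char_constant_size(void)
{
    return sizeof '\xeb\x2a';
}
```

Calling both from a small `main` and printing the results with `%zu` should show 3 for the string and `sizeof(int)` (commonly 4) for the character constant.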

Seth
"type is `char`, 2 bytes" - hm, I don't think that's generally going to fit into a `char`.
Jefromi
@Jefromi True enough, I suppose the type is more accurately described as a `char[2]`. Updated.
Seth
Is `'\xeb\x2a'` the same as `char[0]='\xeb'`,`char[1]='\x2a'`?
`'\xeb\x2a'` does **not** have a type char[2] - it's an `int` with a value that's implementation-defined.
Michael Burr
What about `'\xeb\x2a\xeb\x2a\xeb'` and `'\xeb\x2a\xeb'` then?
@user198729: multibyte character constants are a language extension. Nonetheless, character constants have a type of `int` in C, so providing more than an `int`'s worth of data makes little sense anyway.
Evan Teran
@Evan Teran: They're not a language extension; they are allowed in C, it's just that their value is implementation defined.
Charles Bailey
And to answer your question: `int x = '\x01\x02\x03\x04\x05';` yields a warning: `warning: character constant too long for its type` and has an implementation defined (perhaps undefined is more accurate?) value.
Evan Teran
@Charles Bailey, can you explain what different implementations there are for `'\xeb\x2a\xeb\x2a\xeb'` and `'\xeb\x2a\xeb'`?
@Evan Teran, so a character constant (`'xx..'`) has a limited length (<= `int`, or say 4 characters maximum, `'abcd'`), right?
@user198729: a character constant has a type of `int` no matter how many bytes you shove at it.
Evan Teran
@user198729: No, sorry, I really don't know. That the results are implementation defined makes using them inherently less portable. I've never had a need for them so I've never investigated what any implementations that I use actually specify.
Charles Bailey
@Charles Bailey: perhaps we are confusing terms. The standard speaks about multibyte characters with reference to character sets (as in things like UTF-8 and such). Which is not what is being talked about here. What I am referring to is a character constant which is written like so: `int x = '\x02\x03';` which I do not think is in the standard at all and thus would be a language extension. But I could be wrong too, do you have a reference for why it isn't?
Evan Teran
@user198729: I've found through **basic** tests that gcc tends to use the least significant bytes (at most 4) of a character constant written like you've done. So `int x = '\x01\x02\x03\x04\x05'; printf("%08x\n", x);` yields "02030405".
Evan Teran
@Evan Teran: Please see 6.4.4.4/10. "... The value of an integer character constant containing more than one character..."
Charles Bailey
@Charles Bailey: you are correct: 6.4.4.4p10 says "An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."
Evan Teran
4-byte character constants are extremely handy. (Pre-)Carbon Mac OS used them extensively. 'TEXT' is so much more readable than 0x54455854.
Seth
A: 

The \x means it's a hex character escape. So \xeb would mean character eb in hex, or 235 in decimal. See http://msdn.microsoft.com/en-us/library/6aw8xdf2.aspx for more information.
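
As a quick check of those numbers, the following sketch reads the bytes back as unsigned values. The function names are illustrative only. One caveat it tries to show: plain `char` may be signed on a given platform, so `'\xeb'` on its own can come out negative unless it is converted to `unsigned char`.

```c
/* First byte of the question's string, read as an unsigned value. */
int first_byte(void)
{
    const unsigned char bytes[] = "\xeb\x2a";
    return bytes[0];            /* 0xeb == 235 */
}

/* Second byte of the same string. */
int second_byte(void)
{
    const unsigned char bytes[] = "\xeb\x2a";
    return bytes[1];            /* 0x2a == 42 */
}

/* Converting to unsigned char gives 235 whether plain char is
   signed or unsigned on the target. */
int single_escape_as_unsigned(void)
{
    return (unsigned char)'\xeb';
}
```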

As for the second, no, they are not the same. The double-quotes, ", means it's a string of characters, a null-terminated character array, whereas a single quote, ', means it's a single character, the byte that character represents.

Slokun
The single-quoted expression is not a single byte character.
Jefromi
+2  A: 

When you say:

BTW, are these the same:

"\xeb\x2a" vs '\xeb\x2a'

They are in fact not. The first creates a character string literal, terminated with a zero byte, containing the two characters whose hex representation you provide. The second creates an integer constant.

anon
Can you elaborate a little about `'\xeb\x2a'`?
'\xeb' is a single character (=a byte) so '\xeb\x2a' is two bytes = 16bit = short int
Martin Beckett
@Martin Beckett: No, a character constant which contains more than one character is still an `int`, not a `short int`. Its value is implementation defined.
Charles Bailey
@user198729 What Charles said. In C, anything between single quotes is converted to an integer. So in ASCII 'A' is converted to 65. If you say 'AB', then that might get converted to (65 << 8) + 66. Or it might not - the conversion is implementation defined.
anon
@Charles Bailey - sorry, I was trying to say that a two digit \x was a byte and so two of them was a 16bit value (ie a short int). But yes in C even just '0' is an 'integer'.
Martin Beckett
@Martin Beckett: No need to be sorry, I was just trying to correct what looked to me like a point of fact.
Charles Bailey
+1  A: 

\x allows you to specify the character by its hexadecimal code.

This allows you to specify characters that are not normally printable (some of which have predefined escape sequences of their own, such as '\n' = newline, '\t' = tab and '\b' = backspace).
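
One practical wrinkle with \x is that the escape keeps consuming hex digits for as long as it finds them, so `"\x2ab"` is read as the single (out of range) escape `\x2ab` and not as `\x2a` followed by `b`. A common workaround, sketched below with made-up names, is to split the literal and let the compiler concatenate the pieces.

```c
#include <string.h>

/* Two adjacent literals: the escape ends at the first closing quote,
   and the compiler then joins them into the bytes 0x2a, 'b', '\0'. */
static const char split_literal[] = "\x2a" "b";

/* Number of characters before the terminating '\0'. */
size_t split_literal_length(void)
{
    return strlen(split_literal);
}
```

Here `split_literal_length()` should report 2, with `split_literal[0]` equal to 0x2a.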

Loopo
+4  A: 

As others have said, the \x is an escape sequence that starts a "hexadecimal-escape-sequence".

Some further details from the C99 standard:

When used inside a set of single-quotes (') the characters are part of an "integer character constant" which is (6.4.4.4/2 "Character constants"):

a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'.

and

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.

So the sequence in your example of '\xeb\x2a' is an implementation defined value. It's likely to be the int value 0xeb2a or 0x2aeb depending on whether the target platform is big-endian or little-endian, but you'd have to look at your compiler's documentation to know for certain.
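
If you would rather test than read the manual, a rough sketch like the one below reports which layout a given compiler picked. The enum and function names are invented for this example, and the result only describes the compiler it was built with. gcc, for instance, documents that it builds the value one character at a time with the first character ending up in the higher byte.

```c
/* Possible layouts of the two-character constant from the question. */
enum layout { FIRST_CHAR_HIGH, FIRST_CHAR_LOW, SOMETHING_ELSE };

enum layout classify_multichar(void)
{
    /* Mask to 16 bits so that possible sign extension does not matter. */
    unsigned int low16 = (unsigned int)'\xeb\x2a' & 0xffffu;

    if (low16 == 0xeb2au) return FIRST_CHAR_HIGH;   /* value looks like 0xeb2a */
    if (low16 == 0x2aebu) return FIRST_CHAR_LOW;    /* value looks like 0x2aeb */
    return SOMETHING_ELSE;
}
```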

When used inside a set of double-quotes (") the characters specified by the hex-escape-sequence are part of a null-terminated string literal.

From the C99 standard 6.4.5/3 "String literals":

The same considerations apply to each element of the sequence in a character string literal or a wide string literal as if it were in an integer character constant or a wide character constant, except that the single-quote ' is representable either by itself or by the escape sequence \', but the double-quote " shall be represented by the escape sequence \".


Additional info:

In my opinion, you should avoid using 'multi-character' constants. There are only a few situations where they provide any value over using a regular, old int constant. For example, '\xeb\x2a' could more portably be specified as 0xeb2a or 0x2aeb depending on what value you really wanted.

One area that I've found multi-character constants to be of some use is to come up with clever enum values that can be recognized in a debugger or memory dump:

enum CommandId {
    CMD_ID_READ  = 'read',
    CMD_ID_WRITE = 'writ',
    CMD_ID_DEL   = 'del ',
    CMD_ID_FOO   = 'foo '
};

There are few portability problems with the above (other than platforms that have small ints or warnings that might be spewed). Whether the characters end up in the enum values in little- or big-endian form, the code will still work (unless you're doing something else unholy with the enum values). If the characters end up in the value using an endianness that wasn't what you expected, it might make the values less easy to read in a debugger, but the 'correctness' isn't affected.
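
If the byte order of the tag does matter (say, because it is written to a file), one possible alternative is to pack the characters explicitly. This is only a sketch: `FOURCC` and the `PACKED_` names are hypothetical, it assumes an `int` of at least 32 bits so the values fit in an enum, and the numeric values noted in the comments assume an ASCII execution character set.

```c
#include <stdint.h>

/* Hypothetical helper: pack four characters, first one in the high byte. */
#define FOURCC(a, b, c, d)                          \
    (((uint32_t)(unsigned char)(a) << 24) |         \
     ((uint32_t)(unsigned char)(b) << 16) |         \
     ((uint32_t)(unsigned char)(c) << 8)  |         \
      (uint32_t)(unsigned char)(d))

enum PackedCommandId {
    PACKED_CMD_READ  = FOURCC('r', 'e', 'a', 'd'),  /* 0x72656164 in ASCII */
    PACKED_CMD_WRITE = FOURCC('w', 'r', 'i', 't')   /* 0x77726974 in ASCII */
};
```

Unlike `'read'`, the value here is fixed by the shifts and does not depend on how the compiler chooses to evaluate multi-character constants.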

Michael Burr
Is `'\xeb\x2a'` the same as `char[0]='\xeb',char[1]='\x2a'`?
No, as the answer states, `'\xeb\x2a'` is an `int` with an implementation-defined value. The value is almost certainly either 0xeb2a or 0x2aeb; which one will almost certainly depend on the endianness of the platform. I suppose some compilers might transform `'\xeb\x2a'` to 0xffffeb2a if they decide to do sign extension. I have no idea how likely that might be.
Michael Burr