char arr[]= "\xeb\x2a";
BTW,are these the same:
"\xeb\x2a"
vs '\xeb\x2a'
char arr[]= "\xeb\x2a";
BTW,are these the same:
"\xeb\x2a"
vs '\xeb\x2a'
It's a special character that indicates the string is actually a hexadecimal number.
http://www.austincc.edu/rickster/COSC1320/handouts/escchar.htm
\x
indicates a hexadecimal character escape. It's used to specify characters that aren't typeable (like a null '\x00'
).
And "\xeb\x2a"
is a literal string (type is char *
, 3 bytes, null-terminated), and '\xeb\x2a'
is a character constant (type is int
, 2 bytes, not null-terminated, and is just another way to write 0xEB2A or 60202 or 0165452). Not the same :)
The \x
means it's a hex character escape. So \xeb
would mean character eb
in hex, or 235
in decimal. See http://msdn.microsoft.com/en-us/library/6aw8xdf2.aspx for ore information.
As for the second, no, they are not the same. The double-quotes, "
, means it's a string of characters, a null-terminated character array, whereas a single quote, '
, means it's a single character, the byte that character represents.
When you say:
BTW,are these the same:
"\xeb\x2a" vs '\xeb\x2a'
They are in fact not. The first creates a character string literal, terminated with a zero byte, containing the two characters who's hex representation you provide. The second creates an integer constant.
\x allows you to specify the character by its hexadecimal code.
This allows you to specify characters that are normally not printable (some of which have special escape sequences predefined such as '\n'=newline and '\t'=tab '\b'=bell)
As other have said, the \x
is an escape sequence that starts a "hexadecimal-escape-sequence".
Some further details from the C99 standard:
When used inside a set of single-quotes ('
) the characters are part of an "integer character constant" which is (6.4.4.4/2 "Character constants"):
a sequence of one or more multibyte characters enclosed in single-quotes, as in
'x'
.
and
An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g.,
'ab'
), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.
So the sequence in your example of '\xeb\x2a'
is an implementation defined value. It's likely to be the int value 0xeb2a
or 0x2aeb
depending on whether the target platform is big-endian or little-endian, but you'd have to look at your compiler's documentation to know for certain.
When used inside a set of double-quotes ("
) the characters specified by the hex-escape-sequence are part of a null-terminated string literal.
From the C99 standard 6.4.5/3 "String literals":
The same considerations apply to each element of the sequence in a character string literal or a wide string literal as if it were in an integer character constant or a wide character constant, except that the single-quote
'
is representable either by itself or by the escape sequence\'
, but the double-quote"
shall be represented by the escape sequence\"
.
Additional info:
In my opinion, you should avoid avoid using 'multi-character' constants. There are only a few situations where they provide any value over using an regular, old int
constant. For example, '\xeb\x2a'
could be more portably be specified as 0xeb2a
or 0x2aeb
depending on what value you really wanted.
One area that I've found multi-character constants to be of some use is to come up with clever enum values that can be recognized in a debugger or memory dump:
enum CommandId {
CMD_ID_READ = 'read',
CMD_ID_WRITE = 'writ',
CMD_ID_DEL = 'del ',
CMD_ID_FOO = 'foo '
};
There are few portability problems with the above (other than platforms that have small ints or warnings that might be spewed). Whether the characters end up in the enum values in little- or big-endian form, the code will still work (unless you're doing some else unholy with the enum values). If the characters end up in the value using an endianness that wasn't what you expected, it might make the values less easy to read in a debugger, but the 'correctness' isn't affected.