I have a program that reads in a character array. I need the value of the string in memory to be equal to hex 0x01020304, whose bytes are all non-ASCII characters. So the question is, how do I get non-ASCII characters into a string variable at runtime?
Use an escape sequence. Make sure you put the characters in the correct order.
"\x01\x02\x03\x04"
Edit: If you need to put the sequence into an existing char array, simply assign the bytes individually.
char s[4];
// ... later ...
s[0] = 0x01;
s[1] = 0x02;
s[2] = 0x03;
s[3] = 0x04;
Do not attempt to assign the number by casting s to (int32_t *); the char array isn't guaranteed to have the correct alignment (and the access would violate strict aliasing).
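A minimal self-contained sketch of both approaches, printing the bytes to verify (the names lit and s are just for illustration):

#include <stdio.h>

int main(void)
{
    const char *lit = "\x01\x02\x03\x04";  /* hex escapes, one byte each */
    char s[4];
    s[0] = 0x01; s[1] = 0x02; s[2] = 0x03; s[3] = 0x04;

    /* print as unsigned char so bytes above 0x7F would not sign-extend */
    for (int i = 0; i < 4; i++)
        printf("%02x %02x\n", (unsigned char)lit[i], (unsigned char)s[i]);
    return 0;
}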
Well, are you sure you need a string literal?
These are all pretty similar:
const char* blah = "test";
char blah[] = "test";
char blah[] = { 't','e','s','t',0 };
You could certainly use the third form for your needs quite easily.
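For instance, a sketch of the third form with the byte values from the question (reusing the name blah from the examples above):

#include <stdio.h>

int main(void)
{
    /* third form: brace initialization with the raw byte values,
       plus an explicit terminating zero */
    char blah[] = { 0x01, 0x02, 0x03, 0x04, 0 };
    printf("%u bytes\n", (unsigned)sizeof blah);  /* prints "5 bytes" */
    return 0;
}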
Probably the easiest, in C, is to use the hex escape notation: "\x01\x02\x03\x04". (Without the x, the values are read as octal, which isn't nearly as popular or understandable nowadays.)
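For comparison, the same bytes written with octal escapes (a sketch; the array name is illustrative):

const char oct_form[] = "\001\002\003\004";  /* octal escapes: same four bytes */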
Alternatively,
char x[] = {1, 2, 3, 4, 0};
should work (notice that the null termination has to be included when initializing like this).
I need the value of the string in memory to be equal to hex 0x01020304 which are all non-ASCII characters.
Beware: how 4 contiguous bytes are laid out in memory depends on whether your system is big-endian or little-endian. If you care about the value of the 32-bit field, just putting the bytes into a string literal won't work portably.
For example:
You could try, as avakar suggests:
char cString[5] = "\x01\x02\x03\x04";
or even just do
cString[0] = 0x01;
cString[1] = 0x02;
...
but if you expect the actual physical layout in memory to make sense:
// assuming unsigned int is 32 bits
unsigned int* cStringAlias = reinterpret_cast<unsigned int*>(&cString[0]);
std::cout << std::hex << *cStringAlias << std::endl;
Be careful, the output will differ depending on whether the most significant byte is placed in the 0th location or the 3rd location.
The output could be
0x01020304
or
0x04030201
For more, read about endianness.
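A small self-contained demo of the effect, using memcpy (as a later answer suggests) to sidestep the alignment problem; the names are illustrative:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    const char bytes[4] = { 0x01, 0x02, 0x03, 0x04 };
    uint32_t v;
    memcpy(&v, bytes, 4);  /* well-defined, unlike the pointer cast */
    /* prints 4030201 on a little-endian machine, 1020304 on a big-endian one */
    printf("%x\n", v);
    return 0;
}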
Save the source in UTF-8 and treat all strings as UTF-8 (or use something like StringFromUTF()).
Every time you don't work in a universal code page (yes, UTF-8 is not really a code page...), you are asking for trouble.
You may want to try using std::hex:
#include <iostream>

int temp;
char sentMessage[10];
for (int i = 0; i < 10; ++i)
{
    // read each value as a hexadecimal number, then narrow it into the buffer
    std::cin >> std::hex >> temp;
    sentMessage[i] = static_cast<char>(temp);
}
You would then type in the hexadecimal value of each character, e.g. 01 11 7F AA.
You can use std::wcin and std::wcout for Unicode support on the console. They are part of standard C++ (declared in <iostream>), though how well the console actually handles wide characters is platform-dependent.
When writing C code, you can use memcpy() to copy binary data:
memcpy(dest + offset, src, 4);
If src is a string, you presumably get it in the right order. If it's an integer (say, uint32_t) and you need a specific endianness, you might need to reverse the order of the bytes before doing memcpy():
uint32_t src;
...
swap((unsigned char *) &src, 0, 3);
swap((unsigned char *) &src, 1, 2);
where swap() is defined by you. You must do this only if the machine endianness doesn't match the desired output endianness.
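The answer leaves swap() up to you; a minimal sketch of such a helper (the name and signature are only illustrative):

/* exchange the bytes at positions i and j */
static void swap(unsigned char *p, int i, int j)
{
    unsigned char t = p[i];
    p[i] = p[j];
    p[j] = t;
}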
You can discover the endianness by looking at certain defines set by the compiler or C library. At least on glibc (Linux), endian.h provides such definitions, and byteswap.h also provides byte-swapping functions.
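For instance, on glibc a sketch using the conversion helpers from endian.h (htobe32() converts host byte order to big-endian; put_be32 is an illustrative name):

#include <endian.h>   /* glibc: htobe32(), be32toh(), ... */
#include <stdint.h>
#include <string.h>

/* store src at dest in big-endian byte order, regardless of host order */
void put_be32(unsigned char *dest, uint32_t src)
{
    uint32_t be = htobe32(src);  /* swaps on little-endian hosts, no-op on big-endian */
    memcpy(dest, &be, 4);
}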