I know it is possible to assign an unsigned char to an unsigned short, but I would like to have more control over how the bits are actually assigned to the unsigned short.

unsigned char UC_8;
unsigned short US_16;

UC_8 = 0xff;
US_16 = (unsigned char) UC_8;

The bits from UC_8 are now placed in the lower bits of US_16. I need more control over the conversion since the application I'm currently working on is safety related. Is it possible to control the conversion with bit operators, so that I can specify where the 8 bits from the unsigned char should be placed in the bigger 16-bit unsigned short variable?

My guess is that it would be possible with masking combined with some other bit-operator, maybe left/right shifting.

UC_8 = 0xff;
US_16 = (US_16 & 0x00ff) ?? UC_8; // Maybe masking?

I have tried different combinations but have not come up with a smart solution. I'm using ANSI C and, as said earlier, need more control over how the bits are actually set in the larger variable.

EDIT: My problem or concern comes from a CRC-generating function. It will and should always return an unsigned short, since it will sometimes calculate a 16-bit CRC. But sometimes it should calculate an 8-bit CRC instead and place those 8 bits in the eight LSBs of the 16-bit return variable. The eight MSBs should then contain only zeros.

I would like to say something like:

US_16(7 downto 0) = UC_8; 
US_16(15 downto 8) = 0x00;

If I just typecast it, can I guarantee that the bits always will be placed on the lower bits in the larger variable? (On all different architectures)

+3  A: 

What do you mean, "control"?

The C standard unambiguously defines the unsigned binary format in terms of bit positions and significance. Certain bits of a 16-bit variable are "low", by numerical definition, and they will hold the pattern from the 8-bit variable, the other bits being set to zero. There is no ambiguity, no wiggle room, and nothing else to control.
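
As a minimal sketch of what that guarantee means for the CRC case in the question: the helper name `widen_crc8` is hypothetical (it does not appear in the original post), and the code is written in C89 style. The explicit mask is redundant when `unsigned char` is 8 bits wide; it only documents the intent.

```c
/* Hypothetical helper: return an 8-bit CRC in the low 8 bits of an
 * unsigned short, with every higher bit zero. */
unsigned short widen_crc8(unsigned char crc8)
{
    unsigned short result;

    /* The conversion preserves the numerical value, so the bits land
     * in positions 7..0 regardless of byte order in memory. */
    result = crc8;

    /* Defensive mask: only has an effect if unsigned char is wider
     * than 8 bits. */
    result &= 0x00ffu;

    return result;
}
```

Calling `widen_crc8(0xff)` yields the value `0x00ff`, the same as writing `US_16 = UC_8;` directly.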

Potatoswatter
Is this true for all architectures? I mean, some systems are big-endian and others are little-endian. If I assign a smaller variable to a larger one, will ANSI C guarantee that the smaller variable is placed in the lower bits, on all architectures?
Jesper Melin
@Jesper: Yes, when I say the standard says so, I mean the standard says so. See C99 §6.2.6.2, "Integer types." Endianness is irrelevant; it is a property of addressable memory access, not registers.
Potatoswatter
@Jesper: yes, C guarantees that unsigned integer operations obey the mathematical rules of modular arithmetic. Endianness only comes into play if you try to view the same fragment of memory as both (say) a short and two bytes. To put it another way, the smaller value will always end up in the lower *bits* of the larger variable, but there is no guarantee about which *bytes* it will occupy.
Gilles
Thanks, wonder how it looks in the C89 standard, need to check that out.
Jesper Melin
@Jesper Melin: The assignment operator deals in numerical values, not representations. In your example code, `UC_8` has the numerical value `0xff` (255 decimal); after a simple assignment `US_16 = UC_8;`, `US_16` will have the same value (exactly as if you had written `US_16 = 0xff;` or `US_16 = 0x00ff;` or `US_16 = 255;`).
caf
@Caf: That's great to know, I believed the opposite. So right now I'm trying to find the C89 standard, so I have a reference that says this is true.
Jesper Melin
@Jesper Melin: I'm not sure C89 is available through official channels anymore, since it has been superseded. If you get a C99 reference, the relevant parts are sections 6.5.16.1 and 6.3.1.3.
caf
Is this also true if I assign a larger variable to a smaller one? I wish to divide a 32-bit unsigned long into 8-bit unsigned chars. Currently I assign the unsigned long to the unsigned char variable, then right-shift the unsigned long by 8 bits and repeat the procedure.
Jesper Melin
@Jesper: As caf said, assignment treats the values as numbers. If you assign a number greater than 255 to an unsigned char, the high bits will be truncated. You need to shift and repeat as you describe. Once the data is in an array of `char`, there are no possible endian issues.
Potatoswatter
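
The shift-and-repeat procedure discussed in the comments above could be sketched as follows. The helper name `split_u32` and the least-significant-byte-first ordering are assumptions made for illustration, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store the low 32 bits of value in out[0..3],
 * least significant byte first. */
void split_u32(unsigned long value, unsigned char out[4])
{
    size_t i;

    assert(out != NULL);

    for (i = 0; i < 4; i++) {
        /* Mask before narrowing so the truncation is explicit. */
        out[i] = (unsigned char)(value & 0xffUL);
        value >>= 8;
    }
}
```

Because the bytes are extracted by value rather than by reinterpreting memory, the order in `out` is the same on big- and little-endian targets.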
A: 

Maybe shifting the bits will help you:

US_16 = (US_16 & 0x00ff) | ( UC_8 << 8 );

Result in bits will be:
C - UC_8 bits
S - US_16 bits
CCCC CCCC SSSS SSSS, where SSSS SSSS are the low 8 bits of US_16

But if UC_8 was 1 and US_16 was 0, then US_16 will be 256. Or did you mean this?

US_16 = (US_16 & 0xff00) | ( UC_8 & 0x00ff );
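
A minimal sketch of the two expressions above wrapped as functions, so the placement is visible at the call site. The names `set_low_byte` and `set_high_byte` are hypothetical, and the code assumes a 16-bit result is wanted even if `unsigned short` happens to be wider.

```c
/* Hypothetical helpers; the names are illustrative only. */

/* Replace bits 7..0 of word with byte, keep bits 15..8. */
unsigned short set_low_byte(unsigned short word, unsigned char byte)
{
    return (unsigned short)((word & 0xff00u) | (byte & 0x00ffu));
}

/* Replace bits 15..8 of word with byte, keep bits 7..0. */
unsigned short set_high_byte(unsigned short word, unsigned char byte)
{
    return (unsigned short)((word & 0x00ffu) | ((byte & 0x00ffu) << 8));
}
```

For the CRC case in the question, `set_low_byte(0, UC_8)` gives the 8-bit value in the eight LSBs with the eight MSBs zero.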
Miro
I tried your last example, and that gave me the desired result; at least the bits end up in the correct place :) But I need to test some more just to make sure that I understand what is actually happening.
Jesper Melin
A: 

If it is important to use ANSI C and not be restricted to a particular implementation, then you should not assume sizeof(short) == 2. And why bother to cast an unsigned char to an unsigned char (the same thing)? It is probably safe to assume char is 8 bits nowadays, even though that's not guaranteed.

#include <stdint.h> /* C99; the fixed-width types are not part of C89 */

uint8_t UC_8;
uint16_t US_16;
int nbits = ...# of bits to shift...;
US_16 = UC_8 << nbits;

Obviously, if you shift more than 15 bits, it may not be what you want. If you need to actually rearrange the bits, rather than just shift them to some position, you'll have to set them individually:

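int sourcebit = ...0 to 7...;
int destinationbit = ...0 to 15...;
// set destinationbit if sourcebit is set in UC_8
// (shifting the source bit down first avoids a negative shift count)
US_16 |= ((UC_8 >> sourcebit) & 1u) << destinationbit;
// clear destinationbit if sourcebit is set in UC_8
US_16 &= ~(((UC_8 >> sourcebit) & 1u) << destinationbit);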

note: just wrote, didn't test. probably not optimal. blah blah blah. but something like that will work.
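
One way to combine the set and clear steps into a single bit copy is sketched below. The function name `copy_bit` and its parameter names are hypothetical, and the range checks use `assert` only as an illustration of guarding the shift counts.

```c
#include <assert.h>

/* Hypothetical helper: copy bit src_bit (0..7) of byte into bit
 * dst_bit (0..15) of word and return the updated word. */
unsigned short copy_bit(unsigned short word, unsigned char byte,
                        unsigned int src_bit, unsigned int dst_bit)
{
    unsigned int bit;

    assert(src_bit < 8u);
    assert(dst_bit < 16u);

    /* Isolate the source bit as 0 or 1. */
    bit = ((unsigned int)byte >> src_bit) & 1u;

    /* Clear the destination bit, then OR in the source bit. */
    return (unsigned short)((word & ~(1u << dst_bit)) | (bit << dst_bit));
}
```

Shifting the source bit down to position 0 before shifting it up means both shift counts are always non-negative, whatever the relative order of the two positions.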

paul
I will try your example, thanks.
Jesper Melin
as to your question edit: yes, if you just straight up assign an 8 bit integer to a 16 bit integer, it will be in the 8 least significant bits. Whether that is big- or little-endian is another matter.
paul
I'm glad if that works, and is defined by the ANSI C (C89) standard. The project is safety related so all the things I implement need some kind of reference that is correct and not against any rules.
Jesper Melin
Thanks for some great answers. Sounds like there shouldn't be any problems solving this in a controlled manner.
Jesper Melin
A: 
US_16=~-1|UC_8;

Is this what you want?

Is that a joke? `~-1` is zero on a two's complement machine, but the result depends on the signed representation and need not be zero.
Potatoswatter
That also gave me the correct result. I have never used ~ before, how does your example solve my problem (just curious)?
Jesper Melin
@Potatoswatter: Okay, if that is the case then I cannot use this solution since the final target is unknown.
Jesper Melin
~-1 clears all bits and | fills the lower 8 bits; what is the problem?
`~-1` does not clear all bits on a ones' complement or sign-magnitude machine. `0` *is* guaranteed to clear all (non-padding) bits. Why would you represent all zeroes with something other than `0`? Then, what does `|` with 0 accomplish? Please tell me you're joking.
Potatoswatter