views: 3414

answers: 7

What is the advantage of using uint8_t over unsigned char in C?

I know that on almost every system uint8_t is just a typedef for unsigned char, so why use it?

+8  A: 

It documents your intent - you will be storing small numbers, rather than a character.

Also it looks nicer if you're using other typedefs such as uint16_t or int32_t.
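A minimal sketch of the intent point: fixed-width names make a struct of binary fields read consistently. The struct and field names here are hypothetical, not from the answer.

```c
#include <stdint.h>

struct pixel {
    uint8_t  red;    /* a small number 0-255, not a character */
    uint8_t  green;
    uint8_t  blue;
    uint16_t x;      /* fixed-width naming matches the other fields */
    uint16_t y;
};

/* Pack the three color components into one 24-bit RGB value. */
uint32_t pixel_rgb(const struct pixel *p) {
    return ((uint32_t)p->red << 16) | ((uint32_t)p->green << 8) | p->blue;
}
```

Had the color fields been declared `unsigned char`, a reader might wonder whether they hold text; `uint8_t` makes the "small number" intent explicit.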

Mark Ransom
The correct type is `uintN_t`, if we're talking about the C99 `stdint.h` types.
Chris Lutz
It wasn't clear in the original question if we were talking about a standard type or not. I'm sure there have been many variations of this naming convention over the years.
Mark Ransom
I've fixed my answer, thanks.
Mark Ransom
Explicitly using `unsigned char` or `signed char` documents the intent too, since unadorned `char` is what shows you're working with characters.
caf
@caf: If you're lucky enough to get beyond an unadorned `unsigned` to begin with - I still see people using that to let the platform pick whether it's `int` or `char` by default. But, I think, in this day and age `unsigned` (alone, or adorned) indicates the intent adequately; otherwise a simple process of elimination explains it :)
Tim Post
I thought an unadorned `unsigned` was `unsigned int` by definition?
Mark Ransom
+1  A: 

As you said, "almost every system".

`char` is probably one of the types least likely to change, but once you start using `uint16_t` and friends, `uint8_t` blends in better, and may even be part of a coding standard.

Justin Love
The correct type is `uintN_t`, if we're talking about the C99 `stdint.h` types.
Chris Lutz
+2  A: 

On almost every system I've met uint8_t == unsigned char, but this is not guaranteed by the C standard. If you are trying to write portable code and it matters exactly what size the memory is, use uint8_t. Otherwise use unsigned char.
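One place where "it matters exactly what size the memory is" comes up is serialization. A hedged sketch (the function name is illustrative, not from the answer): `uint8_t` guarantees exactly 8 bits per element, where `unsigned char` only guarantees at least 8.

```c
#include <stdint.h>

/* Split a 32-bit value into exactly four 8-bit bytes, big-endian. */
void put_be32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}
```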

atlpeg
The correct type is `uintN_t`, if we're talking about the C99 `stdint.h` types.
Chris Lutz
+12  A: 

Just to be pedantic, some systems may not have an 8-bit type. According to Wikipedia:

An implementation is required to define exact-width integer types for N = 8[2], 16, 32, or 64 if and only if it has any type that meets the requirements. It is not required to define them for any other N, even if it supports the appropriate types.

So `uint8_t` isn't guaranteed to exist, though it will on all platforms where 8 bits = 1 byte. Some embedded platforms may be different, but that's getting very rare. Some systems may define `char` to be 16 bits, in which case there probably won't be an 8-bit type of any kind.

Other than that (minor) issue, @Mark Ransom's answer is the best in my opinion. Use the one that most clearly shows what you're using the data for.

Also, I'm assuming you meant uint8_t (the standard typedef from C99 provided in the stdint.h header) rather than uint_8 (not part of any standard).
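For code that assumes 8-bit bytes, it can fail loudly at compile time on the exotic platforms described above. A minimal sketch (not from the answer), using `CHAR_BIT` from `<limits.h>`:

```c
#include <limits.h>
#include <stdint.h>

#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

/* On C99 implementations, merely using uint8_t has a similar effect:
 * if the type doesn't exist, this translation unit won't compile. */
uint8_t one_octet = 0xFF;
```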

Chris Lutz
DSPs with `CHAR_BIT > 8` are becoming *less* rare, not more.
caf
@caf, out of sheer curiosity - can you link to description of some? I know they exist because someone mentioned one (and linked to developer docs for it) in a comp.lang.c++.moderated discussion on whether C/C++ type guarantees are too weak, but I cannot find that thread anymore, and it's always handy to reference that in any similar discussions :)
Pavel Minaev
"Some systems may define char types to be 16 bits, in which case there probably won't be an 8-bit type of any kind." - and despite some incorrect objections from me, Pavel has demonstrated in his answer that if char is 16 bits, then even if the compiler does provide an 8 bit type, it *must not* call it `uint8_t` (or typedef it to that). This is because the 8bit type would have unused bits in the storage representation, which `uint8_t` must not have.
Steve Jessop
The SHARC architecture has 32-bit words. See http://en.wikipedia.org/wiki/Super_Harvard_Architecture_Single-Chip_Computer for details.
BruceCran
And TI's C5000 DSPs (which were in OMAP1 and OMAP2) are 16bit. I think for OMAP3 they went to C6000-series, with an 8bit char.
Steve Jessop
Oh yes, it was indeed SHARC. Thanks. Looks like a perfect platform for B (the one between BCPL and C) to me :)
Pavel Minaev
+2  A: 

There's little advantage. From a portability viewpoint, `char` cannot be smaller than 8 bits, and nothing can be smaller than `char`, so if a given C implementation has an unsigned 8-bit integer type, it's going to be `char`. Alternatively, it may not have one at all, at which point any typedef tricks are moot.

It could be used to better document your code, in the sense that it's clear that you require 8-bit bytes there and nothing else. But in practice it's a reasonable expectation virtually everywhere already (there are DSP platforms on which it's not true, but the chances of your code running there are slim, and you could just as well error out using a static assert at the top of your program on such a platform).
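A sketch of the static-assert idea mentioned above. C11 has `_Static_assert`; a pre-C11 equivalent (assumed here, since the answer predates C11) uses the negative-array-size trick:

```c
#include <limits.h>

/* Fails to compile unless char is exactly 8 bits: the array size
 * evaluates to -1 on any platform where CHAR_BIT != 8. */
typedef char assert_char_is_8_bits[(CHAR_BIT == 8) ? 1 : -1];
```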

Pavel Minaev
For the record, you could make an 8-bit type on any platform: `typedef struct { unsigned i :8; } uint8_t;` but you'd have to use it as `uint8_t x; x.i = ...` so it'd be a bit more cumbersome.
Chris Lutz
I think chars can go as low as 4 bits, below that and things fall apart a bit in the standard (there is a chance I'm wrong though).
Skizz
@Skizz - No, the standard requires `unsigned char` to be able to hold values between 0 and 255. If you can do that in 4 bits, my hat is off to you.
Chris Lutz
"it'd be a bit more cumbersome" - cumbersome in the sense that you'd have to walk (swim, catch a plane, etc) all the way over to where the compiler writer was, slap them in the back of the head, and make them add `uint8_t` to the implementation. I wonder, do compilers for DSPs with 16bit chars typically implement `uint8_t`, or not?
Steve Jessop
@Steve, no, they don't, since there really isn't any way for them to do that. The bitfield trick does indeed work, but bitfields are very limited (you can't have arrays of them, you can't have pointers to them, etc.). There's no requirement in C99 for an implementation to have `uint8_t` at all - it must have it if and only if it has a corresponding type. It is, however, required to provide `uint_least8_t`, which is _at least_ 8 bits wide (but can be larger).
Pavel Minaev
By the way, on a second thought, it is perhaps the most straightforward way to say "I really need 8 bits" - `#include <stdint.h>`, and use `uint8_t`. If the platform has it, it will give it to you. If the platform doesn't have it, your program will not compile, and the reason will be clear and straightforward.
Pavel Minaev
I like the logic that if `uint8_t` exists at all, it's going to be `unsigned char` anyway.
caf
"there really isn't any way for them to do that" - well, it depends how the compiler is coded. You know they're able to generate the code to do 8bit unsigned arithmetic, because of bitfields (probably normal arithmetic, plus some masking). Of course you'd have `sizeof(uint8_t) == sizeof(char)` even though `UCHAR_MAX != 255`, but that's OK, it's why types don't have to use all their storage bits. By "slap in the back of the head" I of course mean "make an impassioned but polite feature request". They're entitled to turn it down, but how confident are they that you won't resort to violence? ;-)
Steve Jessop
As for "straightforward" - it's certainly the least up-front coding effort, but as you say, for true portability you just have to use `uint_least8_t` and apply the modulo-256 overflow yourself. I'm guessing you can write it so that on any vaguely optimising compiler where `uint_least8_t` is 8 bits, all the extra ops are elided.
Steve Jessop
"Of course you'd have sizeof(uint8_t) == sizeof(char) even though UCHAR_MAX != 255, but that's OK, it's why types don't have to use all their storage bits." - it's not okay because `unsigned char` is specifically required to use all storage bits fully by both ISO C and C++. See 6.2.6.1/3 (and the corresponding footnote) for C99, and 3.9.1/1 for C++03.
Pavel Minaev
It is OK. `unsigned char` (which in this example is 16bit) uses all bits, but AFAIK `uint8_t` doesn't have to. Hence `uint8_t` can be smaller than `unsigned char` in range, although obviously not in storage size. So I don't see why it should be difficult for the compiler writer to support `uint8_t`. It might be monstrously inefficient, but that's a separate issue.
Steve Jessop
Still no cigar, sorry: "For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits ... If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation ... The typedef name intN_t designates a signed integer type with width N, __no padding bits__, and a two's complement representation."
Pavel Minaev
OK, you win :-). 7.18.1.1 conspicuously doesn't say that the unsigned versions have no padding bits. But it's implied by the requirement that if you provide uint8_t then you must provide int8_t, and the lemma: if uint8_t has padding bits, then int8_t has padding bits, since they're the same width and the same storage size.
Steve Jessop
Moral of the story: integer types are stupid, albeit fast. If you need arithmetic modulo any particular power of two, either write it yourself or use a POSIX-compliant implementation, where uint8_t is compulsory ;-)
Steve Jessop
If you just need arithmetic modulo, unsigned bitfield will do just fine (if inconvenient). It's when you need, say, an array of octets with no padding, that's when you're SOL. Moral of the story is not to code for DSPs, and stick to proper, honest-to-God 8-bit char architectures :)
Pavel Minaev
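The portable modulo-256 arithmetic discussed in the comments above can be sketched as follows (the function name is hypothetical): mask after each operation, and on platforms where `uint_least8_t` is exactly 8 bits an optimising compiler should elide the mask.

```c
#include <stdint.h>

/* Addition modulo 256, correct even where uint_least8_t is wider
 * than 8 bits (e.g. a DSP with 16-bit char). */
uint_least8_t add_mod256(uint_least8_t a, uint_least8_t b) {
    return (uint_least8_t)((a + b) & 0xFFu);
}
```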
+1  A: 

The whole point is to write implementation-independent code. `unsigned char` is not guaranteed to be an 8-bit type. `uint8_t` is.

AndreyT
...if it exists on a system, but that's going to be very rare. +1
Chris Lutz
A: 

That is really important, for example, when you are writing a network analyzer. Packet headers are defined by the protocol specification, not by the way a particular platform's C compiler works.
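As an illustration of the point (this hypothetical header only loosely resembles real wire formats, and is not from the answer), fixed-width types let the struct mirror the octet layout the specification defines:

```c
#include <stdint.h>

/* A protocol header defined in octets, per a (hypothetical) spec. */
struct proto_header {
    uint8_t  version;   /* exactly one octet on the wire */
    uint8_t  flags;
    uint16_t length;    /* two octets, network byte order */
    uint32_t sequence;  /* four octets */
};
```

In practice such structs also need byte-order conversion and care about compiler-inserted padding; parsing field-by-field out of a `uint8_t` buffer is the more portable approach, but the fixed-width types are what make either approach match the spec.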

VP