Is there any disadvantage to using char for small integers in C? Are there any advantages other than the occupancy/memory benefit?

In particular, is the processor likely to cope with integer arithmetic on a char better or worse than it would on a (long/short) int?

I know this will be processor/system/compiler specific, but I'm hoping for an answer in the general case, or, at least, the general case for 32-bit Windows and Solaris, being the systems I'm currently working on. I'm also assuming that things like overflow/wraparound issues have already been dealt with.

Update: Visual Studio 6.0 doesn't actually ship with the stdint.h header that Christoph suggests below. A little benchmarking on Windows (VS 6.0, debug build, 32-bit) with a handful of stacked loops shows int and long giving similar performance, both about twice as fast as char. Running the same test on Linux with gcc gives the same picture: int and long perform alike, and both beat char, though the difference is less pronounced.
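
For reference, a minimal sketch of the kind of stacked-loop test described above - the loop bounds and the volatile accumulators here are illustrative choices, not the exact code I ran:

#include <stdio.h>
#include <time.h>

#define OUTER 100000L

int main(void)
{
    /* volatile keeps the compiler from optimizing the loops away */
    volatile char cacc = 0;
    volatile int  iacc = 0;
    long i;
    char c;
    int n;
    clock_t t0;

    t0 = clock();
    for (i = 0; i < OUTER; i++)
        for (c = 0; c < 100; c++)
            cacc = (char)(cacc + c);
    printf("char: %.3fs\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    t0 = clock();
    for (i = 0; i < OUTER; i++)
        for (n = 0; n < 100; n++)
            iacc = iacc + n;
    printf("int:  %.3fs\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    return 0;
}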

As a side note, I've not spent much time looking, but the first implementation of stdint.h for VS 6.0 I found (via Wikipedia) defines uint_fast8_t as unsigned char, despite that apparently being the slower type, at least in my tests. Thus the moral of the story, as Christoph rightly suggested: always benchmark!

+6  A: 

Well, the first issue is that it's not defined by the C standard whether plain char is signed or unsigned - so the only range you can portably rely on is 0 to 127.
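
A quick way to check what a given implementation chose (a small self-contained sketch; CHAR_MIN and CHAR_MAX come from the standard limits.h):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* whether plain char is signed is implementation-defined;
       CHAR_MIN == 0 means it is unsigned on this implementation */
    printf("CHAR_MIN = %d, CHAR_MAX = %d\n", CHAR_MIN, CHAR_MAX);
    return 0;
}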

Other than that, in general int is supposed to be the type corresponding to the native word size of the architecture (but of course this isn't enforced by anything). This would tend to be the type with the best arithmetic performance, but that's about all you can say.

Note that operands narrower than int are widened either to int or unsigned int during expression evaluation anyway.
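
You can see the promotions at work with sizeof, since the operand of sizeof isn't evaluated but still has a type (a small sketch):

#include <stdio.h>

int main(void)
{
    char a = 1, b = 2;
    /* a and b are promoted to int before the addition,
       so the expression a + b has type int, not char */
    printf("sizeof(char)  = %u\n", (unsigned)sizeof(char));
    printf("sizeof(a + b) = %u\n", (unsigned)sizeof(a + b));
    return 0;
}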

caf
Yep, I've been told that an int has the size of a processor register, so if you use a char the register won't be full, but basic operations will take the same time whether the register is fully used or not.
Aif
In addition, int8_t and uint8_t would be better types to use than char.
Mike Weller
@Aif: As far as I know, it takes even *more* time to do the up/down conversion.
Carl Smotricz
@Carl: Got any data on this?
int3
I came across these 2: <http://www.eventhelix.com/realtimemantra/basics/optimizingcandcppcode.htm#Prefer int over char and short> and <http://en.wikibooks.org/wiki/Optimizing_C%2B%2B/Writing_efficient_code/Performance_improving_features> but I don't consider either authoritative. Sorry, no hard proof; it looks more anecdotal than anything.
Carl Smotricz
A: 

The main con I would see is that your code would be using a type that means one thing for values that mean something else -- i.e., there's a semantic problem, which could become a maintenance problem. If you did it, I'd probably recommend typedefing it:

typedef char REALLYSHORT;

That way, A) it's clearer what you're doing, and B) you can change it easily (in just the one place) if you run into trouble.

Do you have a really good reason not to use int?

T.J. Crowder
The semantic point is already handled in the only place I've seen this implemented. And no, I have no good reason whatsoever; I'm just curious as to whether it makes a difference.
me_and
+2  A: 

Another con I can think of is that (as far as I know) "modern" processors do all their math in "full" integers, generally 32 bits. So dealing with a char usually means pulling a single byte out of memory, zero- or sign-filling it on the way into a register, doing something with it, and then squeezing only the least significant byte of the result back into memory. If the char isn't aligned on a handy boundary, that memory access takes even more work to accomplish.

Using char for int is really only useful when you have a lot of numbers (i.e. a large array) and you need to conserve space.
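
To put rough numbers on that (a hypothetical illustration - the array size is made up):

#include <stdio.h>

#define N 1000000L

int main(void)
{
    /* a million small values: about 1 MB stored as char versus
       about 4 MB as a 32-bit int - the scale where char pays off */
    printf("as char: %lu bytes\n", (unsigned long)(N * sizeof(char)));
    printf("as int:  %lu bytes\n", (unsigned long)(N * sizeof(int)));
    return 0;
}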

Carl Smotricz
+1  A: 

Internally, processors generally perform arithmetic on machine words. This means that when calculations on other types are performed, although the calculation itself takes the same length of time, extra work may have to be done (depending on the available instruction set) to read inputs and to coerce calculation results into the target type (e.g. sign-extending/zero-filling, shifting/masking to avoid unaligned memory accesses, etc.).

This is why C defines types and operations as it does - the size of int is not mandated by the standard, allowing compiler authors to make it correspond to a machine word, and expression evaluation is defined to promote smaller integer types to int, greatly reducing the number of points at which results must be coerced to some target type.

Valid reasons to use char for storing integer values are when space really matters that much (not as often as you might think), and when describing some external data format or protocol that you are marshalling data to or from. Expect uses of char to incur a slight loss of performance, especially on hardware such as the Cell SPU, where only machine-word-sized memory accesses are available, so accessing a char in memory requires several shifts and masks.
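
As an illustration of the external-format case (a sketch - read_be16 is a made-up helper name, not from any particular library): byte-sized types describe the wire data faithfully, while the arithmetic itself still happens at int width thanks to the promotions.

/* hypothetical helper: extract a 16-bit big-endian field from a
   protocol buffer; the unsigned char operands are promoted to int
   before the shift and OR, so the arithmetic is word-sized */
unsigned short read_be16(const unsigned char *buf)
{
    return (unsigned short)((buf[0] << 8) | buf[1]);
}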

moonshadow
+3  A: 

Arithmetic on chars will almost certainly be performed using the same registers as arithmetic on ints. For example:

char c1 = 1;
char c2 = c1 + 2;

The addition compiles to the following with VC++:

00401030   movsx       eax,byte ptr [ebp-4]     ; sign-extend c1 into 32-bit eax
00401034   add         eax,2                    ; plain 32-bit addition
00401037   mov         byte ptr [ebp-0Ch],al    ; store the low byte back into c2

where eax is a 32-bit register.

There is therefore no advantage to using chars over ints when it comes to arithmetic performance.

anon
Yes, but isn't access to a byte slower because of alignment issues? The fact that the access is a single line of ASM doesn't mean there's not more microcode being executed. Do you have any insight on this?
Carl Smotricz
Guilty as charged on perhaps too much simplification on "filling in with 0's" and "squeezing into a byte". But I do believe something like this is happening, at *some* level.
Carl Smotricz
I have no idea about the bus access times (which I think is the real issue) for modern processors, I'm afraid - I was answering the question about arithmetic performance.
anon
+11  A: 

C99 added so-called 'fastest' minimum-width integer types to solve this problem. For the range you're interested in, the types would be int_fast8_t and uint_fast8_t, which can be found in stdint.h.

Keep in mind that there might be no performance gain (the increase in memory consumption might even slow things down); as always, benchmark! Don't optimize prematurely or based solely on potentially flawed assumptions about what should work.
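
For instance, you can check what widths your implementation actually picked (a small sketch; the results are implementation-specific):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* the "fast" types have at least 8 bits, but their actual
       width is whatever the implementation considers fastest */
    printf("sizeof(int_fast8_t)  = %u\n", (unsigned)sizeof(int_fast8_t));
    printf("sizeof(uint_fast8_t) = %u\n", (unsigned)sizeof(uint_fast8_t));
    return 0;
}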

Christoph
That's very useful, I didn't know about those, thank you!
me_and
good to see an answer that cuts to the standard, good work
Matt Joiner
If you care about portability (like I do, where I share code between an embedded target and a PC), don't use fast types to actually "store" anything (class members, struct members, etc.). Use fast types for iteration, for array access, or as local cache variables.
MaR
@MaR: using fixed-sized integer types is not enough to guarantee that structures can be serialized and deserialized across architectures because of padding and byte order; as you'll therefore have to serialize structure members sequentially anyway, I don't think there's anything wrong with using fast types for member variables
Christoph
@Christoph: well, it's not only about serialization, but true - my statement was too strong. It should rather be "be careful about it", as the increased size can hit you hard in certain scenarios.
MaR
Benchmarking is certainly the moral of this story; I've just updated the question with the results of some benchmarking I've done. Notably, the stdint.h implementation I found for Visual Studio defines uint_fast8_t as something that is far from the fastest type in my tests.
me_and