typedef unsigned char uChar;
typedef signed char sChar;

typedef unsigned short uShort;
typedef signed short sShort;

typedef unsigned int uInt;
typedef signed int sInt;

typedef unsigned long uLong;
typedef signed long sLong;

I keep a list of typedefs so that when I define variables I can be exact. For instance, if I only need the numbers 0-5, I'd use uChar. But I'm working in C++, making an engine, and I was reading that booleans in .NET take up X bytes and that, due to memory alignment, it'd be quicker to use ints.

Is there a reason to use int rather than uChar, due to memory alignment, performance, or the like?

+12  A: 

This is the kind of premature optimization that rarely matters much. I'd choose a data structure and get on with it. Once you have a complete system that has a problem, profile it to find out where the issue is. Your chances of guessing and hitting the poor performance nail on the head are small indeed.

duffymo
Yeah, what Duffymo says. It's highly unlikely this will even make it into the top ten performance problems you have for the project.
jeffamaphone
Amen brother . . .
Cliff
Funny that mine is the only answer voted down. Seems like a unanimous opinion, though.
duffymo
I don't like this answer - consistently picking the less efficient datatype over a project is a huge premature pessimization.
kotlinski
@kotlinski: This implies the existence of the word "pessimal". Apparently it exists, but I'd say it's far from common :)
Merlyn Morgan-Graham
@Merlyn: This is very related to the question at hand - http://stackoverflow.com/questions/312003/what-is-the-most-ridiculous-pessimization-youve-seen
kotlinski
If applications are hamstrung by choices like this, no wonder C++ has declined in popularity. I would imagine that an optimization like this would be handled by the compiler.
duffymo
+9  A: 
  • Those typedefs aren't more exact than what they name. They're more terse, but nonstandard.
  • If you want to be more exact, use #include <stdint.h> to get int8_t, uint32_t, etc.
  • If you need to worry about memory alignment, you will find out through other means.
  • If you need to store a large number of Booleans, look into std::bitset and std::vector<bool>.
  • If you need to store one Boolean, use bool!
Potatoswatter
+4  A: 

You really don't want to waste time on stuff that's not in the critical path. Once you have things working, then you should profile and see where the problems are. Then you can speed up the trouble spots.

A system that's fast but doesn't work is worthless. A slow system that works is useful to some people and will become useful to more as it gets faster.

Also remember that an unoptimized but appropriate algorithm will beat a super-optimized poor algorithm nearly every time.

Paul Rubel
+1  A: 

Mock up a prototype and run a profiler on it before you even begin to think about micro-optimizing for performance. Remember: if it is a constant factor (or even a small change to a coefficient), Big-O treats it the same.

In my experience, using unsigned types breaks a lot of common approaches to error checking and means you run into integer wrapping errors (and bugs) almost immediately, while at the same time making it harder to reason about the solution.

Also, implicit casts make bugs much more likely when using unsigned types.

For example:

#include <iostream>
#include <cstdint>
#include <stdexcept>

void SomeFunction(uint32_t value)
{
  if(value < 0)
  {
    // Unreachable: an unsigned value is never negative, so this
    // error check silently does nothing. What do we do instead?
    throw std::runtime_error("value must be non-negative");
  }
}

uint32_t SomeOtherFunction()
{
  // The sum, 4000000000, fits in a uint32_t but not in a signed int;
  // assigning it back to an int below wraps it negative on typical platforms.
  return (uint32_t)2000000000 + (uint32_t)2000000000;
}

int main(int argc, char* argv[])
{
  int someValue = -1;
  SomeFunction(someValue);          // -1 implicitly converts to 4294967295
  someValue = SomeOtherFunction();
  std::cout << someValue;
}

-294967296

Merlyn Morgan-Graham
+1  A: 

Because the C standard explicitly defines the effects of exceeding the bounds of unsigned types, compilers may have to add extra code to make them behave as indicated. Consequently, it's possible for one data type to be faster for things in memory, and for another type to be faster for things kept in registers. For example, consider the code:

  uInt16 var1;
  Int32 var2;

  var1++;
  var2 = var1;

The ARM processor I use has only 32-bit instructions for register operations, but can do 8-, 16-, and 32-bit loads and stores. If var1 is in memory, it can be operated upon just as nicely as if it were a 32-bit integer, but if it's in a register the compiler will have to add an instruction to clear the upper word before copying it to var2. If var1 were a signed 16-bit integer, loading it from memory would be slower than if it were unsigned (because of the necessary sign extension), but if it were kept in a register the compiler would not be required to worry about the upper bits.

supercat
+1  A: 

Yes, using an int instead of a char can give you a noticeable performance freebie; that is why C's int is intended to match the native register size of the processor.

It is a good idea to use unsigned ints wherever possible, and only on rare, specific occasions use something else. Avoid using anything smaller than an int unless you have a really good reason; if you are using smaller-than-int types in the hope of a free performance win, you need to change that habit the other way. The same goes for signedness: use unsigned everything unless you have a really good reason to use signed.

Basically, disassemble (or compile to asm) and look at what your favorite and other compilers generate: notice the unaligned addressing caused by chars, the masking of the upper bits, the sign extension for signed chars, and so on. These are sometimes free and sometimes not, depending on where a given byte is coming from and going to, and on the platform. Try at a minimum x86 and ARM, perhaps MIPS, with gcc 3.x, 4.x, and llvm.

In particular, notice how a single char mixed into a line of int declarations may cause the ints that follow to be unaligned, which is fine for x86 from an addressing standpoint but still costs performance (even with a cache). Put your aligned variables first and the unaligned ones last. Platforms that cannot, or prefer not to, do unaligned accesses will waste the extra bytes as padding, so you are not necessarily saving memory either.

The premature optimization here is trying to tune to the variable length. Use simple habits instead: use unsigned ints for everything unless you have a specific reason not to, and put your larger, aligned variables and structures first in a list of declarations, with the unaligned stuff last (shorts, then chars).

Multiplies (and divides) make this habit ugly; avoiding multiplies and divides in code is the best habit to have. If you have to use one, be quite knowledgeable about its implementation. It is much better to multiply two chars than two ints, for example (if the numbers allow it), so if you happen to know your ints really hold 7-bit or 5-bit quantities, cast them down for the multiply and let a hardware multiply happen instead of a soft multiply (this can become a dormant bug if those variable sizes change!). Even though many processors have a hardware multiply, it is rare that it can actually be used directly: unless you help the compiler, it has to make a library call to check for overflow among other things, and may end up doing a soft multiply as a result, which is very costly. Divides are bad because most processors do not include a divide, and if they do you may fall into the same trap. With a multiply, an N bit * N bit product is 2*N bits wide, which is where the multiply problem comes in; with a divide the numbers stay the same size or get smaller. In both cases the ISAs don't always provide enough bits to cover the overflow, and a library call is required to work around the processor's hardware limitations.

Floating point is a similar story; just be careful with it. Don't use it unless absolutely necessary. Most folks don't remember offhand that

float a;
float b;
...
b = a * 1.0;

C assumes double precision unless otherwise specified, so the multiply above requires a to be converted to double, the multiply done in double, and the result converted back to single. Some FPUs can do the precision conversion in the same instruction at the cost of extra clocks; some cannot. Precision conversion is where the majority of floating point processor errata live (or did). So either use doubles for everything, or be careful in your coding to avoid these pitfalls:

float a;
float b;
...
b = a * 1.0F;

Also, most ISAs do not have an FPU, so avoid floating point math even more than you avoid fixed point multiplies and divides. Assume most FPUs have bugs. It is difficult to write good floating point code; the programmer often throws away a fair amount of the precision simply by not knowing how to use it and write code for it.

A few simple habits and your code runs noticeably faster and cleaner as a freebie. The compiler also doesn't have to work as hard, so you fall into fewer compiler bugs.

EDIT adding a floating point precision example:

float fun1 ( float a )
{
    return(a*7.1);
}

float fun2 ( float a )
{
    return(a*7.1F);
}

the first function contains:

    mulsd   .LC0(%rip), %xmm0

using a 64-bit floating point constant:

.LC0:
    .long   1717986918
    .long   1075603046

and the second function contains the desired single precision multiply:

    mulss   .LC1(%rip), %xmm0

with a single precision constant:

.LC1:
    .long   1088631603
And a similar example of the masking cost for chars, this time on ARM:
char fun1 ( char a )
{
    return(a+7);
}
int fun2 ( int a )
{
    return(a+7);
}
fun1:
    add r0, r0, #7
    and r0, r0, #255
    bx  lr
fun2:
    add r0, r0, #7
    bx  lr
dwelch
Wouldn't the compiler optimize the float multiplication and division? I'm not writing in ASM here, it's C++. Also, do you have any sources to back this up? It's not that I don't trust you, but..
Jookia
It shouldn't optimize it away: in the same way that your program said you want this variable to be a char instead of an int, the compiler must honor that. Likewise, a constant like 1.0 is a double and the compiler should honor that, promoting the whole expression to double. It is the same as doing a math operation mixing an int and a char: the char is converted to an int, then the math occurs, then the result is converted to match the type of the result variable.
dwelch
I was talking about writing in C/C++.
dwelch