Suppose we have a loop that runs 100 times. Does using unsigned char instead of int for its counter make a difference? What about using i += 1U instead of i++? Or do compilers take care of that for us?
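For concreteness, the alternatives being asked about look something like this (just a sketch; the loop bodies are placeholders):

int main() {
    // Variant 1: plain int counter with post-increment.
    for (int i = 0; i < 100; i++) { /* ... loop body ... */ }

    // Variant 2: unsigned char counter.
    for (unsigned char i = 0; i < 100; i++) { /* ... loop body ... */ }

    // Variant 3: adding an unsigned literal instead of using ++.
    for (int i = 0; i < 100; i += 1U) { /* ... loop body ... */ }
}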
This will usually not make any difference. Try it and measure if you really care, but first be sure that this is actually a bottleneck.
As far as I know, performance figures change from compiler to compiler, and they might vary from OS to OS as well. A better approach is to write a small program that measures the performance of the operations you are interested in.
Programming Pearls covers a lot of detail about the performance of primitive operations, and it also shows how to write small programs to measure them.
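For example, a minimal measurement program along these lines (this assumes C++11's <chrono>; the repeat count is arbitrary and the volatile sink is only there to stop the compiler deleting the loops) might look like:

#include <chrono>
#include <cstdio>

volatile unsigned sink;  // volatile so the compiler cannot optimize the loops away

int main() {
    using clock = std::chrono::steady_clock;
    const int repeats = 1000000;  // just big enough to measure

    auto t0 = clock::now();
    for (int r = 0; r < repeats; ++r)
        for (int i = 0; i < 100; i++)            // int counter
            sink = i;
    auto t1 = clock::now();
    for (int r = 0; r < repeats; ++r)
        for (unsigned char i = 0; i < 100; i++)  // unsigned char counter
            sink = i;
    auto t2 = clock::now();

    using us = std::chrono::microseconds;
    std::printf("int:           %lld us\n",
                (long long)std::chrono::duration_cast<us>(t1 - t0).count());
    std::printf("unsigned char: %lld us\n",
                (long long)std::chrono::duration_cast<us>(t2 - t1).count());
}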
Seriously, leave these sorts of micro-optimisations up to the compiler. Even if there is a difference, it will only become obvious if you're doing it a bazillion times a second. For the vast majority of cases, it won't matter one bit as most programs spend 99.9% of their time waiting around for something to happen.
You're also more likely to get a better return on your investment if you concentrate on the macro stuff like algorithm selection.
Most compilers will recognize that you are using the variable as a loop counter and will keep it in a register.
On i386 CPUs the compiler may use an 8-bit register, because those are accessible directly with no performance penalty (16-bit values, by contrast, require an operand-size override prefix, which adds one byte to the code segment).
However, an 8-bit register is as fast as a 32-bit one, so the only benefit of using an 8-bit register is that one more is left over for another 8-bit variable (only one more, because the upper 16 bits of the register cannot be accessed directly).
To summarize: let the compiler optimize. It will do so where possible, and in this case it will make almost no difference anyway.
Other CPUs (PowerPC, ARM, etc.) behave differently.
If you care about micro-optimizations like this one, you have to know your compiler by heart. Try it both ways and compare the generated assembly, as sketched below.
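For instance (a sketch, not a definitive harness; the file name and the COUNTER alias are made up for illustration), you can compile each variant to assembly and diff the results. With GCC or Clang use -S; MSVC's equivalent is /FA:

// counter.cpp (hypothetical file name). Build it twice, once per counter
// type, then diff the generated assembly:
//   g++ -O2 -S counter.cpp -o counter_int.s
//   (edit COUNTER, then)
//   g++ -O2 -S counter.cpp -o counter_uchar.s
//   diff counter_int.s counter_uchar.s
using COUNTER = int;     // change to: using COUNTER = unsigned char;

volatile unsigned sink;  // keeps the loop from being optimized away entirely

int main() {
    for (COUNTER i = 0; i < 100; ++i)
        sink = i;
}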
A good rule of thumb is to choose a variable's type according to its semantics (char is for ASCII characters, int for integers, and so on) rather than its size (except when aligning your structures).
Two little things, though:
- On a 32-bit architecture, a processor register is 32 bits wide, so accessing a char (8 bits) can take two operations while accessing an int takes only one.
- If you do not intend to use the value before the increment, prefer pre-increment to post-increment. This is especially important in C++, where post-increment can result in unwanted constructor calls (see the sketch below).
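To illustrate the second point, here is a contrived Counter class (purely illustrative) showing why post-increment can cost an extra copy for class types:

struct Counter {
    int value = 0;
    Counter& operator++() {    // pre-increment: modify in place, no copy
        ++value;
        return *this;
    }
    Counter operator++(int) {  // post-increment: must preserve the old value
        Counter old = *this;   // the extra constructor call in question
        ++value;
        return old;
    }
};

int main() {
    Counter c;
    ++c;  // no temporary
    c++;  // constructs and discards a temporary Counter
}

For a plain int the compiler generates identical code for ++i and i++ when the result is unused; it is for class types (heavyweight iterators, for example) that ++i can genuinely save a copy.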
In a simple case, int vs. unsigned char gives the same code for me:
for ( unsigned char i = 0; i < 100; ++i ) {
01081001 mov esi,64h
01081006 jmp main+10h (1081010h)
01081008 lea esp,[esp]
0108100F nop
std::cout << 1 << std::endl;
01081010 mov eax,dword ptr [__imp_std::endl (108204Ch)]
01081015 mov ecx,dword ptr [__imp_std::cout (1082050h)]
0108101B push eax
0108101C push 1
0108101E call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (1082044h)]
01081024 mov ecx,eax
01081026 call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (1082048h)]
0108102C dec esi
0108102D jne main+10h (1081010h)
}
return 0;
You should profile your code and then optimize the slow parts. Do not start with premature optimization.
As the difference is going to be (at most) a fraction of a clock cycle per increment, you would literally need to run the entire loop about 1,000,000 times before you even approach the threshold of human perception, and that assumes running the loop and then instantaneously alerting the user that it has finished.
Assuming the loop does anything else, or that the single notification involves any form of IO (which by definition it does), you would have to run that million-iteration outer loop another thousand times before your user noticed anything.
Seriously! You are worrying about the wrong things.
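To put rough numbers on that (back-of-envelope only; the 3 GHz clock and the one-extra-cycle penalty are assumptions, not measurements):

#include <cstdio>

int main() {
    // Assumed numbers: a 3 GHz clock and (generously) one extra cycle
    // per iteration for the "slower" counter type.
    constexpr double extra_seconds = 100.0 * 1.0 / 3.0e9;  // per 100-iteration run
    std::printf("extra time per run:   %g s\n", extra_seconds);        // ~3.3e-08
    std::printf("after a million runs: %g s\n", extra_seconds * 1e6);  // ~0.033
}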
Using i += 1U will make no difference; the compiler will treat it identically to i++. The case of int versus unsigned char is more complex.
The best answer is simply not to worry about it in the first place unless a) your code absolutely must run faster and b) you have identified the loop as a major bottleneck. Remember Knuth's Law: Premature optimization is the root of all evil.
The next-best answer is that the compiler will probably optimize away any difference. Even if it wouldn't, 100 repetitions is peanuts. The loop index operations are totally insignificant.
But in the interests of "full disclosure", as it were, the answer to your int/uchar question is that there is almost certainly no performance difference. Almost. Many of the relevant factors are left unspecified by the C standards, so it is theoretically possible that a C environment could exist where a uchar would be faster.
A char is defined by the standard to be the smallest addressable unit of memory. That's probably (virtually always) going to be one byte of eight bits. An int will be at least as large as a char (or the char obviously couldn't be the smallest addressable unit of memory), and it must be able to hold any value from -32,767 to +32,767. (It is not required to hold -32,768, though virtually all implementations do.) This means that an int must be at least 16 bits. In practice, an int is often a machine word, which has the advantage of being fast. On your machine and mine, it's almost certainly 32 or 64 bits.
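If you want to see what those guarantees come out to on your own machine, a trivial check (using the standard <climits> macros) is:

#include <climits>
#include <cstdio>

int main() {
    std::printf("CHAR_BIT    = %d\n", CHAR_BIT);            // bits per char, at least 8
    std::printf("sizeof(int) = %zu bytes\n", sizeof(int));  // at least 16 bits' worth
    std::printf("INT_MIN     = %d\n", INT_MIN);
    std::printf("INT_MAX     = %d\n", INT_MAX);
}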
Now, suppose we're on a machine with 8-bit words. Suppose that there is no special hardware support for 16-bit integers. (This may or may not correspond to any actual machine.) The char will be 8 bits and the int likely 16. Operations on ints are likely expensive, so a char might be faster.
Suppose we have 32-bit words and 32-bit ints, your code contains four (8-bit) char variables, and your compiler packs local variables as tightly as physically possible, speed be damned. If one of those chars is your loop counter, it may be packed into a word with other variables, and every operation on it may be costly because it must be unpacked and re-packed. (This assumes a rather dumb compiler.) An int counter, however, could not be packed with any other variable; in this case, the int might run slightly faster.
Remember, both of these examples are contrived. I don't know of any real systems that work those ways, but a system could. (Note: If I have presented a case incompatible with the specifications, please tell me so that I can correct it.)
So, in the real world, how does one decide what type to use? From the C Language FAQ:
If you might need large values (above 32,767 or below -32,767), use long. Otherwise, if space is very important (i.e. if there are large arrays or many structures), use short. Otherwise, use int. If well-defined overflow characteristics are important and negative values are not, or if you want to steer clear of sign-extension problems when manipulating bits or bytes, use one of the corresponding unsigned types. (Beware when mixing signed and unsigned values in expressions, though; see question 3.19.)
Although character types (especially unsigned char) can be used as "tiny" integers, doing so is sometimes more trouble than it's worth. The compiler will have to emit extra code to convert between char and int (making the executable larger), and unexpected sign extension can be troublesome. (Using unsigned char can help; see question 12.1 for a related problem.)
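The sign-extension trouble mentioned there looks like this in practice (the -1 result assumes a platform where plain char is signed, which is common but not guaranteed):

#include <cstdio>

int main() {
    char c = '\xFF';           // bit pattern 0xFF
    unsigned char u = '\xFF';

    // When widened to int, a signed char sign-extends; an unsigned char
    // zero-extends. Only the latter reliably gives 255.
    std::printf("char:          %d\n", (int)c);  // likely -1 (if char is signed)
    std::printf("unsigned char: %d\n", (int)u);  // 255
}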
The rest of that page, and indeed the FAQ as a whole, is well worth the read.
In this case? Use an int and don't worry about it.