views: 306

answers: 2

I'm writing a programming language, and when I came across this question, my immediate thought was that languages should optimize booleans into bit flags for the programmer. This would keep all the benefits of speed and efficient memory usage while removing the maintenance burden and the possibility of errors that comes with the more complicated bit manipulation.

The one case where you might not want this optimization is when only a single set of booleans is being stored. Basically, with one copy you have 8 bits of flags + 8 bit masks * 8 bits per mask = 72 bits, instead of 8 booleans * 8 bits per boolean = 64 bits. But as soon as you have even two copies of the booleans, it becomes 2 copies * 8 bits of flags + 8 bit masks * 8 bits per mask = 80 bits versus 2 copies * 8 booleans * 8 bits per boolean = 128 bits. The few cases where plain booleans would be more storage-optimal seem easy to detect, so the optimization could simply be skipped there.
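To make the savings concrete, here's a rough C sketch of the two layouts (the struct and field names are made up purely for illustration):

#include <stdio.h>
#include <stdbool.h>

/* Unpacked: one boolean per field, typically one byte each. */
typedef struct {
    bool a, b, c, d, e, f, g, h;
} unpacked_t;

/* Packed: all eight booleans share a single byte of flags. */
typedef struct {
    unsigned char flags;
} packed_t;

int main(void) {
    printf("unpacked: %zu bytes\n", sizeof(unpacked_t));  /* usually 8 */
    printf("packed:   %zu bytes\n", sizeof(packed_t));    /* 1 */
    return 0;
}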

Is there any reason that languages don't support this optimization? I looked around and it doesn't seem that any languages do (I may just not be looking in the right places).

A: 

I've seen people do this in assembly language, where they pack booleans into words to save space, and then write lots of instructions to get/set the bits.
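Translated roughly into C, the manual version looks something like this (the mask names are made up for illustration):

#include <stdio.h>

/* Hand-rolled bit flags: one named mask per boolean. */
#define FLAG_VISIBLE 0x01
#define FLAG_ENABLED 0x02
#define FLAG_DIRTY   0x04

int main(void) {
    unsigned char flags = 0;

    flags |= FLAG_ENABLED;        /* set a bit   */
    flags &= ~FLAG_DIRTY;         /* clear a bit */
    if (flags & FLAG_ENABLED)     /* test a bit  */
        printf("enabled\n");
    return 0;
}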

Clearly there is a tradeoff in speed and memory between packing booleans and not packing them, and personally I'm shy of a compiler trying to decide that for me.

Mike Dunlavey
Compilers for languages like C# and Java make much larger decisions for you all the time, and they are right more often than not. I'm just wondering why this particular optimization is never made (even in C# and Java).
Imagist
@imagist: Maybe I'm a "majority of one", but I don't appreciate aggressive optimization by compilers. It's almost always misplaced. Really serious optimization is my job. What I want a compiler to do is generate "good code" but leave the real thinking to me.
Mike Dunlavey
@mike The problem with that thinking is that optimizing compilers these days are faster, more accurate, and more effective than a human could ever be.
Imagist
@imagist: The "problem with that thinking" (forgive me, I'm just a crusty old coder) is comparing what a compiler could ever do to _real_ performance optimization, is like comparing a lightning bug to lightning (Mark Twain). Check this: http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773
Mike Dunlavey
... I'm sure you could do an excellent job of making your compiler optimize, and I assume that if you had to do the same optimization manually, in a pinch, you would do it equally well (as I have). However, you or your compiler simply don't have the really big picture, and as that link shows, that's where the performance problems _really_ live.
Mike Dunlavey
... Have you ever seen what Fortran compilers do? Unless you tell them not to, they totally scramble the code trying to eliminate jumps, eliminate common subexpressions, optimize registers, etc. etc. Does it hurt? If you're debugging, it drives you crazy. Does it help? Only in hotspots. Put a function call in that tight loop, and all the time goes into that, and the loop is no longer a hotspot. And as software gets big, it's hardly anything _other_ than a big nest of function calls.
Mike Dunlavey
A: 

C does ...

#include <stdio.h>

typedef struct foo_s {
    unsigned char field1 :1;   /* 1-bit field */
    unsigned char field2 :1;   /* 1-bit field */
    unsigned char field3 :4;   /* 4-bit field */
    unsigned char field4 :2;   /* 2-bit field */
} foo_t;                       /* 1 + 1 + 4 + 2 = 8 bits total */

int main(void) {
    /* sizeof yields a size_t, so use %zu */
    printf("%zu\n", sizeof(foo_t));
    return 0;
}

when run, those four fields get packed into one byte:

$ gcc -Wall -o bittest bittest.c 
$ ./bittest 
1
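and the fields are read and written with plain member syntax; the compiler generates the masking and shifting for you. A quick sketch of lines that could be dropped into main() above:

    foo_t f = {0};
    f.field1 = 1;                 /* compiler emits the bit twiddling */
    f.field3 = 9;                 /* 4-bit field: holds values 0..15  */
    printf("%d %d\n", f.field1, f.field3);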
Sharkey