I agree with keraba that you need to use something like Huffman coding or perhaps the Lempel-Ziv-Welch algorithm. The problem with bit-packing the way you are talking about is that you have two options:
- Pick a constant n such that the largest integer can be represented.
- Allow n to vary from value to value.
The first option is relatively easy to implement, but is really going to waste a lot of space unless all integers are rather small.
The second option has the major disadvantage that you have to convey changes in n somehow in the output bitstream. For instance, each value will have to have a length associated with it. This means you are storing two integers (albeit smaller integers) for every input value. There's a good chance you'll increase the file size with this method.
The advantage of Huffman or LZW is that they create codebooks in such a way that the length of the codes can be derived from the output bitstream without actually storing the lengths. These techniques allow you to get very close to the Shannon limit.
I decided to give your original idea (constant n, remove unused bits and pack) a try for fun and here is the naive implementation I came up with:
#include <sys/types.h>
#include <stdio.h>
int pack(int64_t* input, int nin, void* output, int n)
{
int64_t inmask = 0;
unsigned char* pout = (unsigned char*)output;
int obit = 0;
int nout = 0;
*pout = 0;
for(int i=0; i<nin; i++)
{
inmask = (int64_t)1 << (n-1);
for(int k=0; k<n; k++)
{
if(obit>7)
{
obit = 0;
pout++;
*pout = 0;
}
*pout |= (((input[i] & inmask) >> (n-k-1)) << (7-obit));
inmask >>= 1;
obit++;
nout++;
}
}
return nout;
}
int unpack(void* input, int nbitsin, int64_t* output, int n)
{
unsigned char* pin = (unsigned char*)input;
int64_t* pout = output;
int nbits = nbitsin;
unsigned char inmask = 0x80;
int inbit = 0;
int nout = 0;
while(nbits > 0)
{
*pout = 0;
for(int i=0; i<n; i++)
{
if(inbit > 7)
{
pin++;
inbit = 0;
}
*pout |= ((int64_t)((*pin & (inmask >> inbit)) >> (7-inbit))) << (n-i-1);
inbit++;
}
pout++;
nbits -= n;
nout++;
}
return nout;
}
int main()
{
int64_t input[] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20};
int64_t output[21];
unsigned char compressed[21*8];
int n = 5;
int nbits = pack(input, 21, compressed, n);
int nout = unpack(compressed, nbits, output, n);
for(int i=0; i<=20; i++)
printf("input: %lld output: %lld\n", input[i], output[i]);
}
This is very inefficient because is steps one bit at a time, but that was the easiest way to implement it without dealing with issues of endianess. I have not tested this either with a wide range of values, just the ones in the test. Also, there is no bounds checking and it is assumed the output buffers are long enough. So what I am saying is that this code is probably only good for educational purposes to get you started.