I'm trying to understand a paper on lossless compression of floating point numbers and get stuck on one particular step where the authors map a signed integer from a certain range into a range of half the size, losing information that I would think is required. I have a feeling that the authors are using some standard technique which is so obvious to their audience that they don't bother to explain, but which is completely opaque to me.
The value that's being "folded" is the difference between two 23-bit positive integers (the mantissas of a predicted and an actual floating point value) which lies between 1 - 223 and 223 - 1. The authors move the numbers with the highest values (negative and positive) "inwards", so the resulting range is half the size and each number (except 0) maps to two possible values from the original range. This makes me wonder how the process should ever be reversed to determine the original value. In the authors` own words:
We compute the signed corrector that is the shortest modulo 223 and the number
k
that specifies the tightest interval (1-2k , 2k) into which this corrector falls. Next, this numberk
, which ranges between 0 to 22 is compressed [...]. Finally thek + 1
significant bits of the corrector are compressed.
The pseudocode for this is given as:
void comp mantissa(int expo, int a, int p) {
// c will be within [1-2^23 ... 2^23 -1]
int c = a - p;
// wrap c into [1-2^22 ... 2^22 ]
if (c <= -(1<<22)) c += 1<<23;
else if (c > (1<<22)) c -= 1<<23;
// find tightest [1-2^k ... 2^k ] containing c
int k = 0;
// loop could be replaced with faster code
int c1 = (c < 0 ? -c : c);
while (c1) { c1 = c1 >> 1; k++ }
// adjust k for case that c is exactly 2k
if (k && (c == 1<<(k-1))) k--;
// .. further code omitted for brevity
}
Ignoring the actual compression method, the output consists of c
and k
. What I don't get is: How can I restore the original c
from c
and k
when the "wrap c into" part above just maps half of the potential range onto the other half? I tried this on paper with 4 instead of 23 bits and I just don't get it.