I am not very good at C at all. I have tried to capture the essence of what the C code does in C#, using this Wikipedia article. However, my version is totally different and does not achieve the same compression that the C code does. Therefore, I'd like to port the following code from C to C#. However, I do not need it to read from or write to files.

I am not familiar with how file reading and writing works in C, so all of the file-related fluff confuses me. These two lines are also a little confusing: token = i - 1 | 0x80; and length = (token & ~0x80) + 1;

/******************************************************************************
 * LoadRLE / SaveRLE - Load and save binary data using RLE compression.
 *  Run-length tokens have a set MSB, while data tokens have a cleared
 *  MSB. The value of the token's remaining bits plus one indicates the
 *  length of the block. The minimum run length is three bytes, while
 *  the maximum is 128.
 *
 *  data - Array holding data to load or save.
 *  size - Size of the data array.
 *  file - The file pointer to use.
 *  return - Total number of bytes read from or written to data[].
 */
size_t SaveRLE (unsigned char data[], size_t size, FILE *file)
{
    unsigned char token;
    unsigned int i;
    size_t total = 0;

    while(size)
    {
        /*This loop identifies blocks of repeating data:*/
        i = 2;
        while(i < size && i < 128 &&
            data[i] == data[i - 1] && data[i - 1] == data[i - 2])
            i++;
        /*If repeating data was found, save it:*/
        if(i > 2){
            token = i - 1 | 0x80;
            if(!fwrite(&token, 1, 1, file))
                return total;
            if(!fwrite(data, 1, 1, file))
                return total;
            data += i, size -= i, total += i;
        }

        /*This loop identifies blocks of non-repeating data:*/
        i = 0;
        while(i < size && i < 128 && (i + 2 >= size ? 1 :
            data[i] != data[i + 1] || data[i + 1] != data[i + 2]))
            i++;
        /*If non-repeating data was found, save it:*/
        if(i){
            token = i - 1;
            if(!fwrite(&token, 1, 1, file))
                return total;
            if(fwrite(data, 1, i, file) != i)
                return total;
            data += i, size -= i, total += i;
        }
    }

    return total;
}

size_t LoadRLE (unsigned char data[], size_t size, FILE *file)
{
    unsigned char token;
    unsigned int length;
    size_t total = 0;

    while(size && fread(&token, 1, 1, file)){
        length = (token & ~0x80) + 1;
        if (length > size)
            return total;
        if(token & 0x80){
            if(!fread(&token, 1, 1, file))
                return total;
            memset(data, token, length);
        }else{
            if(fread(data, 1, length, file) != length)
                return total;
        }
        data += length, size -= length, total += length;
    }
    return total;
}

Any help is greatly appreciated.

+1  A: 
token = i - 1 | 0x80;

The | symbol is a bitwise OR, so it is combining i - 1 with 0x80 (hex for 128). I will leave the research into bitwise operations to you.

length = (token & ~0x80) + 1;

The & is a bitwise AND, while ~ is a bitwise NOT: it inverts every bit of the value that follows (flips the 1s and 0s). So that:

~11110000 = 00001111

Incidentally, all these operators are in C# and work in more or less the same way.
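
To make the two lines concrete, here is a minimal sketch of how a token is packed and unpacked. The helper names (make_run_token, make_literal_token, token_is_run, token_length) are made up for illustration and do not appear in the original code; the asserts only document the ranges stated in the code's header comment.

```c
#include <assert.h>

/* Hypothetical helpers for illustration; not part of the original code. */

/* Run token: the low 7 bits hold length - 1, the high bit (0x80) is set. */
unsigned char make_run_token(unsigned int length)
{
    assert(length >= 3 && length <= 128);
    return (unsigned char)((length - 1) | 0x80);
}

/* Literal (data) token: the same length - 1 value, high bit left clear. */
unsigned char make_literal_token(unsigned int length)
{
    assert(length >= 1 && length <= 128);
    return (unsigned char)(length - 1);
}

/* Non-zero when the high bit marks the token as a run. */
int token_is_run(unsigned char token)
{
    return (token & 0x80) != 0;
}

/* Mask off the high bit, then undo the "minus one" from encoding. */
unsigned int token_length(unsigned char token)
{
    return (token & ~0x80u) + 1;   /* same as (token & 0x7F) + 1 */
}
```

For example, a run of 5 bytes gives the token 0x84 (4 | 0x80), and token_length(0x84) gives back 5. In a C# port the same expressions should work, but the operands are promoted to int there, so the result needs an explicit cast back to byte.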

Quick Joe Smith
+3  A: 

For your file question, I would strongly suggest consulting the C standard library documentation.

fread, fwrite

token = i - 1 | 0x80; 

This computes i minus 1 first, then | does a bitwise OR with 0x80, which in this case sets the 8th (highest) bit in token.

length = (token & ~0x80) + 1;

token & ~0x80 takes the bitwise NOT of 0x80 (every bit set except the high bit) and does a bitwise AND with token (a result bit is set only when both input bits are set). In this case, it returns every bit but the 8th bit.

As for what this means in your case, look at some articles about RLE.
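
Since the question says file I/O is not needed, here is a rough sketch of the same scheme working on memory buffers instead of a FILE *, which may be an easier starting point for a port. The names rle_encode and rle_decode, the cap parameters, and the (size_t)-1 error return are my own assumptions, not part of the original code (which returns the byte count so far on failure). The literal loop also tests i + 2 >= size so that it never reads past the end of the input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory variant of SaveRLE.
   Returns bytes written to dst, or (size_t)-1 if dst is too small. */
size_t rle_encode(const unsigned char *src, size_t size,
                  unsigned char *dst, size_t cap)
{
    size_t out = 0;

    while (size) {
        /* Count identical bytes at the front (a run needs at least 3). */
        size_t i = 2;
        while (i < size && i < 128 &&
               src[i] == src[i - 1] && src[i - 1] == src[i - 2])
            i++;
        if (i > 2) {
            if (cap - out < 2) return (size_t)-1;
            dst[out++] = (unsigned char)((i - 1) | 0x80); /* run token */
            dst[out++] = src[0];                          /* repeated byte */
            src += i; size -= i;
        }

        /* Count literal bytes up to the start of the next run of three. */
        i = 0;
        while (i < size && i < 128 &&
               (i + 2 >= size ||
                src[i] != src[i + 1] || src[i + 1] != src[i + 2]))
            i++;
        if (i) {
            if (cap - out < i + 1) return (size_t)-1;
            dst[out++] = (unsigned char)(i - 1);          /* literal token */
            memcpy(dst + out, src, i);
            out += i;
            src += i; size -= i;
        }
    }
    return out;
}

/* Hypothetical in-memory variant of LoadRLE.
   Returns bytes written to dst, or (size_t)-1 on truncated input
   or if dst is too small. */
size_t rle_decode(const unsigned char *src, size_t src_size,
                  unsigned char *dst, size_t cap)
{
    size_t in = 0, out = 0;

    while (in < src_size) {
        unsigned char token = src[in++];
        size_t length = (size_t)(token & ~0x80) + 1;

        if (length > cap - out) return (size_t)-1;
        if (token & 0x80) {
            /* Run: one byte follows, repeated `length` times. */
            if (in >= src_size) return (size_t)-1;
            memset(dst + out, src[in++], length);
        } else {
            /* Literal block: `length` raw bytes follow. */
            if (length > src_size - in) return (size_t)-1;
            memcpy(dst + out, src + in, length);
            in += length;
        }
        out += length;
    }
    return out;
}
```

With this sketch, the 7 input bytes A A A A A B C encode to the 5 bytes 0x84 'A' 0x01 'B' 'C': one run token for five A's, then a literal token for the two remaining bytes. In the worst case (no runs at all) the output needs one extra token byte per 128 input bytes.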

Yann Ramin
Thank you for clearing some things up for me. Those links helped. However, I should have explained myself a little more. I am not confused by the bitwise operations. I am confused about the choice of 128 (0x80) and why it is ORed with i - 1.
Mike
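That is specific to their implementation of RLE (see http://en.wikipedia.org/wiki/Run-length_encoding). It looks like the high bit is used to signal a repetition token in this version, while the remaining seven bits hold the block length minus one, which is why the maximum length is 128.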
Yann Ramin
+1  A: 
adf88