I have a program that saves a text file using the stdio interface. It swaps the 4 most significant bits with the 4 least significant bits of every character, except for the characters CR and/or LF.

I'm trying to "decode" this stream using a C# program, but I'm unable to get the original bytes.

        StringBuilder sb = new StringBuilder();
        StreamReader sr = new StreamReader("XXX.dat", Encoding.ASCII);
        string sLine;

        while ((sLine = sr.ReadLine()) != null) {
            string s = "";
            byte[] bytes = Encoding.ASCII.GetBytes(sLine);

            for (int i = 0; i < sLine.Length; i++) {
                byte c = bytes[i];
                byte lb = (byte)((c & 0x0F) << 4), hb = (byte)((c & 0xF0) >> 4);
                byte ascii = (byte)((lb) | (hb));

                s += Encoding.ASCII.GetString(new byte[] { ascii });
            }
            sb.AppendLine(s);
        }
        sr.Close();

        return (sb);

I've tried changing the encoding to UTF8, but it didn't work. I've also used a BinaryReader created from the 'sr' StreamReader, but that didn't help either.

        StringBuilder sb = new StringBuilder();
        StreamReader sr = new StreamReader("XXX.shb", Encoding.ASCII);
        BinaryReader br = new BinaryReader(sr.BaseStream);
        string sLine;
        string s = "";

        while (sr.EndOfStream == false) {
            byte[] buffer = br.ReadBytes(1);
            byte c = buffer[0];
            byte lb = (byte)((c & 0x0F) << 4), hb = (byte)((c & 0xF0) >> 4);
            byte ascii = (byte)((lb) | (hb));

            s += Encoding.ASCII.GetString(new byte[] { ascii });
        }
        sr.Close();

        return (sb);

If the file starts with 0xF2 0xF2 ..., I read everything except the expected value (i.e. 0xF6 0xF6). Where is the error?

This C code actually does the job:

            ...
while (fgets(line, 2048, bfd) != NULL) {
    int cLen = strlen(xxx), lLen = strlen(line), i;

    // Decode line
    for (i = 0; i < lLen-1; i++) {
        unsigned char c = (unsigned char)line[i];
        line[i] = ((c & 0xF0) >> 4) | ((c & 0x0F) << 4);
    }

    xxx = realloc(xxx , cLen + lLen + 2);
    xxx = strcat(xxx , line);
    xxx = strcat(xxx , "\n");
}
fclose(bfd);

What is wrong with the C# code?

A: 

I guess you should use a BinaryReader and ReadBytes(), and only call Encoding.ASCII.GetString() on the byte sequence after you have swapped the bits.

In your example, you read the file as ASCII (meaning the bytes are converted to .NET's internal two-byte character representation on read, on the assumption that they are ASCII), and then convert it BACK to ASCII bytes. Any byte above 0x7F, such as 0xF2, is not valid ASCII, so its value is lost in that conversion before you ever swap the bits.

That round trip is unnecessary.
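A minimal sketch of that order of operations (swap on the raw bytes first, convert to text last) might look like the following. The class and method names are hypothetical, and it assumes the original text was plain ASCII, so an encoded byte can never collide with CR (0x0D) or LF (0x0A).

```csharp
using System.IO;
using System.Text;

static class NibbleDecoder
{
    // Swap the high and low 4 bits of a byte; CR and LF pass through unchanged,
    // matching the encoder described in the question.
    public static byte SwapNibbles(byte b)
    {
        if (b == 0x0D || b == 0x0A) return b;
        return (byte)(((b & 0x0F) << 4) | ((b & 0xF0) >> 4));
    }

    // Decode a raw buffer. The bytes are only turned into text after the swap,
    // so no encoding ever sees the still-encoded values above 0x7F.
    public static string Decode(byte[] raw)
    {
        byte[] decoded = new byte[raw.Length];
        for (int i = 0; i < raw.Length; i++)
            decoded[i] = SwapNibbles(raw[i]);
        return Encoding.ASCII.GetString(decoded);
    }

    // Hypothetical convenience wrapper: reads the whole file as bytes, no text reader involved.
    public static string DecodeFile(string path)
    {
        return Decode(File.ReadAllBytes(path));
    }
}
```

For instance, `NibbleDecoder.Decode(new byte[] { 0x84, 0x96 })` returns "Hi", since 0x84 becomes 0x48 ('H') and 0x96 becomes 0x69 ('i').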

jishi
Even with a BinaryReader I couldn't get the right result. :( (See edit)
Luca
A: 

Got it.

The problem is the BinaryReader construction:

StreamReader sr = new StreamReader("XXX.shb", Encoding.ASCII);
BinaryReader br = new BinaryReader(sr.BaseStream);

I think this constructs a BinaryReader on top of the StreamReader, which "translates" the characters coming from the file.

Using this code instead actually works well:

FileInfo fi = new FileInfo("XXX.shb");
BinaryReader br = new BinaryReader(fi.OpenRead());

I wonder whether it is possible to read this kind of data line by line with a text stream reader, since line endings are preserved during the "encoding" phase.

Luca
Line endings don't have to make sense when treating files as "dumb" bytes; I think that is why there is no support for "line reading" in a BinaryReader. But is that really a problem for you? If you still want to "decode" your whole file, it doesn't make much sense to do it line by line.
jishi
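If line-by-line access is still wanted, one possible approach is to decode the whole buffer first and only then hand the result to a text reader. The sketch below is an illustration under the same assumptions as the thread (ASCII text, CR and LF left untouched by the encoder); `NibbleLines` and `ReadDecodedLines` are made-up names.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class NibbleLines
{
    // Decode the raw bytes, then let a StreamReader split the decoded text into lines.
    public static IEnumerable<string> ReadDecodedLines(byte[] raw)
    {
        byte[] decoded = new byte[raw.Length];
        for (int i = 0; i < raw.Length; i++)
        {
            byte b = raw[i];
            bool isLineEnd = b == 0x0D || b == 0x0A;   // CR and LF were never encoded
            decoded[i] = isLineEnd ? b : (byte)(((b & 0x0F) << 4) | (b >> 4));
        }

        // The text reader only ever sees already-decoded bytes.
        using (var reader = new StreamReader(new MemoryStream(decoded), Encoding.ASCII))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                yield return line;
        }
    }
}
```

The raw buffer could come from `File.ReadAllBytes`, so the file itself is never opened through a text reader.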