ansaurus

Question

Answer 1

+4 A:

There's not much to tell. It reads a file one byte at a time, adjusts the value of each byte by an arbitrary value (specified via the -s flag), and writes out the adjusted bytes. It's the binary equivalent of ROT-13 encryption of a text file.

The rest of the details are specific to how Perl does those things. getopts() is a function (from the Getopt::Std module) that processes command-line switches. binmode() puts the filehandles in raw mode to bypass any of the magic that Perl normally does during I/O. The sysread() and syswrite() functions are used for low-level stream access. The pack() and unpack() functions are used to read and write binary data; Perl doesn't do native types.
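For readers who don't know Perl, here is a minimal sketch of the same idea in Python. It is not a translation of the original script: the function name `shift_stream` and the use of in-memory streams are my own assumptions for illustration. It reads one byte at a time, adds the offset modulo 256, and writes the result.

```python
import io

def shift_stream(src, dst, offset):
    """Copy src to dst, adding offset (mod 256) to every byte.

    shift_stream is a hypothetical helper name, not part of the original script.
    """
    while True:
        chunk = src.read(1)          # one byte at a time, like the script
        if not chunk:                # an empty bytes object means end of input
            break
        dst.write(bytes([(chunk[0] + offset) % 256]))

if __name__ == "__main__":
    src = io.BytesIO(b"\x01\x02\x03")
    dst = io.BytesIO()
    shift_stream(src, dst, 5)
    print(list(dst.getvalue()))      # [6, 7, 8]
```

To use it as a filter on real files, you would pass binary file objects (for example `sys.stdin.buffer` and `sys.stdout.buffer`) instead of the `BytesIO` objects.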

This would be trivial to re-implement in C. I'd recommend doing that (and binding to it from C# if need be) rather than porting to C# directly.

Michael Carman 2009-05-15 04:05:28

Thanks. That is helpful. I guess the part I don't understand is what type of shifting it does. Does it take a byte array like this: byte[] {1,2,3,4,5} and (shifted by one) produce this: byte[] {5,1,2,3,4}? Or does it shift the bits of each byte, turning: byte[] {00000001,00000010,00000011} into (shifting by one): byte[] {10000000,00000001,10000001}?

Andrew 2009-05-15 04:12:37

Calling this a "shift" is kind of a misnomer. It doesn't move bits or bytes. It applies an offset to the value of each byte. If your original data had byte values of 1, 2, 3 and you specified "-s 5" the result would be 6, 7, 8.

Michael Carman 2009-05-15 04:24:09

So it adds to the byte value? So with a shift of 1, 00000001 becomes 00000010, 00001000 becomes 00001001, and so on?

Andrew 2009-05-15 05:37:19

@Andrew: That's right. Note also that the values wrap around modulo 256, e.g. 0xFE + 0x04 = 0x02. This makes the transformation reversible.
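A quick way to convince yourself of the wrap-around and the reversibility is a small Python check. This is only a sketch; `shift_byte` is a made-up name, and it assumes the offset is applied modulo 256 as described above.

```python
def shift_byte(value, offset):
    """Add offset to a byte value, wrapping around at 256 (hypothetical helper)."""
    return (value + offset) % 256

# The wrap-around example from above: 0xFE + 0x04 wraps to 0x02.
assert shift_byte(0xFE, 0x04) == 0x02

# Shifting every possible byte by +4 and then by -4 restores the input.
original = bytes(range(256))
shifted = bytes(shift_byte(b, 4) for b in original)
restored = bytes(shift_byte(b, -4) for b in shifted)
print(restored == original)  # True
```

Because every byte value maps to a distinct byte value, nothing is lost, and applying the negated offset undoes the shift.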

Michael Carman 2009-05-15 14:32:28

Thanks - that's exactly the explanation I needed.

Andrew 2009-05-15 15:04:41

Answer 2

+1 A:

What the code does is this: Read each byte from standard input one by one (after switching it into raw mode so no translation occurs). The unpack gets the byte value of the character just read so that a '0' read turns into 0x30. The latin1 encoding is selected so that this conversion is consistent (e.g. see http://www.cs.tut.fi/~jkorpela/latin9.html).

Then the value specified on the command line with the -s option is added to this byte along with 512 to simulate a modulus operation. This way, -s 0, -s 256 etc are equivalent. I am not sure why this is needed because I would have assumed the following pack took care of that but I think they must have had good reason to put it in there.
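One possible reason for such a bias, shown here as a guess and not as a statement about the original author's intent: in languages where the remainder operator truncates toward zero (C and C# behave this way), a negative sum gives a negative remainder, which is not a valid byte value. Adding a multiple of 256 first avoids that. The Python sketch below emulates the truncating remainder with `math.fmod`; `trunc_rem` is a hypothetical helper.

```python
import math

def trunc_rem(a, n):
    """Remainder that truncates toward zero, as % does in C and C# (hypothetical helper)."""
    return int(math.fmod(a, n))

byte, offset = 0x01, -3

print(trunc_rem(byte + offset, 256))        # -2: not a valid byte value
print(trunc_rem(byte + offset + 512, 256))  # 254: the bias keeps the sum non-negative
```

Since 512 is a multiple of 256, the bias never changes the result for non-negative sums. It only helps for offsets down to -512, so it is a convenience and not a general fix.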

Then, write the raw byte out to standard output.

Here is what happens when you run it on a file containing the characters 012345 (I put the data in the DATA section):

E:\Test> byteshift.pl -s 1 | xxd
0000000: 3132 3334 3536 0b                        123456.

Each byte value is incremented by one.

E:\Test> byteshift.pl -s 257 | xxd
0000000: 3132 3334 3536 0b                        123456.

Remember 257 % 256 = 1. That is:

$byte += $opt_s;
$byte %= 256;

is equivalent to the single step used in the code.
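The same check can be reproduced outside Perl. The Python sketch below is an illustration under my own assumptions (the name `shift_bytes` and the use of a 256-entry translation table are not from the original script). It applies the offset to the sample data `012345` followed by a newline and confirms that offsets 1 and 257 give identical output.

```python
def shift_bytes(data, offset):
    """Shift every byte of data by offset modulo 256 using a lookup table.

    shift_bytes is a hypothetical helper, not taken from the original script.
    """
    table = bytes((i + offset) % 256 for i in range(256))
    return data.translate(table)

sample = b"012345\n"
print(shift_bytes(sample, 1))                              # b'123456\x0b'
print(shift_bytes(sample, 1) == shift_bytes(sample, 257))  # True
```

The trailing 0x0b matches the last byte in the xxd dumps above: it is the newline (0x0a) incremented by one.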

Much later: OK, I do not know C# but here is what I was able to piece together using online documentation. Someone who knows C# should fix this:

using System;
using System.IO;

class BinaryRW {
    static void Main(string[] args) {
        BinaryWriter binWriter = new BinaryWriter(
                Console.OpenStandardOutput()
                );
        BinaryReader binReader = new BinaryReader(
                Console.OpenStandardInput()
                );

        int delta;

        if ( args.Length < 1 
                || ! int.TryParse( args[0], out delta ) )
        {
            // Report on stderr so the message does not mix with binary output
            Console.Error.WriteLine(
                    "Provide an integer delta on the command line"
                    );
        } 
        else {       
            try  {
                while ( true ) {
                    int bin = binReader.ReadByte();
                    // Normalize so that a negative delta still yields 0..255
                    byte bout = (byte) ( ( bin + delta % 256 + 256 ) % 256 );
                    binWriter.Write( bout );
                }
            }

            catch(EndOfStreamException) { }

            catch(ObjectDisposedException) { }

            catch(IOException e) {
                Console.Error.WriteLine( e );
            }

            finally {
                binWriter.Close();
                binReader.Close();

            }
        }
    }
}

E:\Test> xxd bin
0000000: 3031 3233 3435 0d0a 0d0a                 012345....

E:\Test> b 0 < bin | xxd
0000000: 3031 3233 3435 0d0a 0d0a                 012345....

E:\Test> b 32 < bin | xxd
0000000: 5051 5253 5455 2d2a 2d2a                 PQRSTU-*-*

E:\Test> b 257 < bin | xxd
0000000: 3132 3334 3536 0e0b 0e0b                 123456....

Sinan Ünür 2009-05-15 13:29:50

I think the 512 is supposed to be a bias to force the value to wrap instead of saturating. I don't think it's necessary, though (at least not in Perl).

Michael Carman 2009-05-15 14:47:55

Thank you! That works perfectly. I'm not going to be using this from the command line, but for others that find this question, there is one bug in your code: You should add `args.Length < 1 || ` to the beginning of your if condition to avoid an "index out of bounds" exception when nothing is entered.

Andrew 2009-05-15 15:40:58

Thanks for catching that.

Sinan Ünür 2009-05-15 15:45:40

Why are you trapping delta < 0? That makes the transformation not (easily) reversible. It can be negative in the original code.

Michael Carman 2009-05-17 14:35:17

Just a mental error, I guess. I was focused on getting the syntax right so the program would compile.

Sinan Ünür 2009-05-18 10:50:10

Answer 3

+1 A:

Judging by the other answers the equivalent in C# would look something like this:

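// Requires: using System.IO;
using (Stream sIn = File.OpenRead(inPath))
using (Stream sOut = File.Create(outPath))
{
  int b = sIn.ReadByte();   // Stream.ReadByte returns -1 at end of stream
  while (b >= 0)
  {
    // Add the offset (1 here, or some other value) and wrap to 0..255
    sOut.WriteByte((byte)((b + 1) & 0xFF));
    b = sIn.ReadByte();
  }
}   // the using statements close both streams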

samjudson 2009-05-15 13:36:20

ReadByte returns the value of the byte, or -1 if the end of the stream is reached, so your comment makes no sense.

samjudson 2009-05-16 16:55:27

According to http://msdn.microsoft.com/en-us/library/system.io.binaryreader.readbyte.aspx the return value of ReadByte is of type System.Byte. According to http://msdn.microsoft.com/en-us/library/system.byte.aspx System.Byte "Represents an 8-bit unsigned integer." There is no mention of ReadByte returning -1 if the end of stream is reached. In fact, a simple test program based on what you wrote above crashed with System.IO.EndOfStreamException.

Sinan Ünür 2009-05-20 11:52:15

Well, I'm not calling BinaryReader.ReadByte, am I? I'm calling Stream.ReadByte. Check the docs: http://msdn.microsoft.com/en-us/library/system.io.stream.readbyte.aspx

samjudson 2009-05-20 12:31:52

D'uh! Sorry about that.

Sinan Ünür 2009-05-20 14:45:36


Help with byte shifting
