views:

421

answers:

1

For a project that I'm working on, I need to implement Burrows-Wheeler's MoveToFront transformation in O(n) space. For some reason, though, my code works on most values that I throw at it, but not all.

My implementation looks something like this:

public byte[] transform (byte[] input)
{
    if (input.length == 0)
         return input;
    IndexedByte[] bytes = new IndexedByte[input.length];
    for (int i = 0; i < input.length; i++)
    {
     bytes[i] = new IndexedByte(input[i],i);
    }
    for (int i = 0; i < input.length -1; i++)
    {
     bytes[i].next = bytes[i+1];
    }
    bytes[input.length - 1].next = bytes[0];
    Arrays.sort(bytes);

    byte[] newBytes = new byte[input.length];
    for (int i = 0; i < bytes.length; i++)
     newBytes[i] = bytes[i].b;

    int[] indexes = new int[input.length];
    for (int i = 0; i < indexes.length; i++)
     indexes[i] = (bytes[i].origIndex + (input.length - 1)) % input.length;
    int x = 0;
    String str = new String(input); 
    for (int i = 0; i < input.length; i++)
    {
     if (bytes[i].origIndex == 0)
     {
      x = i;
      break;
     }
    } 
      byte[] header = intToByteArray(x);
    byte[] result = new byte[indexes.length+header.length];
    for (int i = 0; i < header.length; i++)
     result[i] = header[i];
    for (int i = 0; i < indexes.length; i++)
     result[i+header.length] = input[indexes[i]];
    return result;
}

Any advice on what I'm doing wrong here? It seems that this does not work when a non-alphanumeric character is encountered (i.e. encoding itself, it seems that the /*, etc. will screw it up).

A: 

After running various tests on this code it looks as though it works correctly. The problems you are seeing are probably due to sign extension in the byteArrayToInt implementation. For example, the following code prints -128 rather than the expected 128:

System.out.println(byteArrayToInt(intToByteArray(128)));

Try changing the code to:

private int byteArrayToInt(byte[] b) {
    return (b[0] << 24) + 
          ((b[1] & 0xFF) << 16) + 
          ((b[2] & 0xFF) << 8) +
           (b[3] & 0xFF);
}

As an aside the MAXIMUM = 50000 limit within IndexedByte.compareTo is never reached. I got a java.lang.StackOverflowError with an input array of length 5214. I'd suggest changing this to be iterative rather than recursive (this should be fairly easy since you know the length of the input array and it will also prevent superfluous looping in the pathological case where all bytes in the input array are equal).

Matthew Murdoch