For a registration code I want to convert an Int64 to base 30 and back (30 so that only digits and uppercase characters are used, excluding easily confused ones such as 0, O, I, 1, etc.).

This is not too difficult using functions like:

const
  Base = 30;
  Base30CharSet = '23456789ABCDEFGHJKLMNPRSTVWXYZ';

function ConvertIntToBase30(ANumber: Int64): string;
begin
  if(ANumber = 0) then
    Result := Copy(Base30CharSet, 1, 1)
  else begin
    Result := '';
    while(ANumber <> 0) do begin
      Result := Copy(Base30CharSet, (ANumber mod Base)+1, 1) + Result;
      ANumber := ANumber div Base;
    end;
  end;
end;

function ConvertBase30ToInt(ANumber: string): Int64;
var
  i: integer;
begin
  Result := 0;
  for i := 1 to Length(ANumber) do begin
    Result := Result + (Pos(ANumber[i], Base30CharSet)-1);
    if(i < Length(ANumber)) then
      Result := Result * Base;
  end;
end;

The snag is that I am interested in the Int64's bits, so I could be dealing with a number like $FFFFFFFFFFFFFFFF = -1.

To work around this I thought I would store and remove the sign (abs()) and include the sign as an extra character appended to the base30 result. The problem then occurs at the lower limit of Int64, as calling abs(-9223372036854775808) results in an overflow.

Does anyone have a solution or better algorithm to solve this problem?

A: 

I think you are almost there by considering abs()...

But rather than using abs(), why not simply ignore the sign when processing the value of the Int64 itself? As far as I can tell, you are in fact already doing this, so only one minor addition is needed to the encoding routine:

  if aNumber < 0 then
    // negative
  else
    // positive;

The only problem then is the LOSS of sign information in the resulting Base30 string. So treat that as a separate problem to be solved using the new information gained from the aNumber < 0 test...

I see you have excluded all chars that could be confused for 0 or 1 but have also excluded 0 and 1 themselves. You could therefore use 0 and 1 to indicate positive or negative (or vice versa).

Depending on the purpose of these routines, the placement of the 0/1 in the result could be entirely arbitrary (if you wished to obfuscate things and make the placement of the 0/1 random rather than a consistent lead/trail character).

When encoding, simply drop a sign indicator into the result string at a random position. When decoding, treat the 0/1 character as the sign marker wherever it is encountered, and skip it for the purposes of decoding the value.

Of course, if obfuscation is not an issue then simply prefix or postfix the sign indicator consistently.

You could even simply choose to use '1' to indicate negative and the LACK of a '1' to indicate/assume positive (this would simplify the zero value case a little, I think).

Deltics
I had implemented a similar solution (appending the sign to the result), but the main problem I have is dealing with the bottom range value of Int64. Int64 having the range -9223372036854775808..9223372036854775807, if the value is -9223372036854775808 and sign is removed, Int64 is not big enough to store it (and hence work with it to convert it to base30).
avenmore
Ok, I haven't checked this theory but surely pre-processing the sign indicator and using that to determine whether to ADD or SUBTRACT when decoding would sort that out.
Deltics
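The add-or-subtract idea from the comment above can be sketched roughly as follows. This is an illustration, not code from the thread: SignedToBase30, Base30ToSigned and the choice of '1' as the negative marker are hypothetical. It assumes Delphi's mod and div truncate toward zero, so the value itself is never negated and Low(Int64) cannot overflow.

```pascal
uses
  SysUtils; // for EConvertError

const
  Base = 30;
  Base30CharSet: string = '23456789ABCDEFGHJKLMNPRSTVWXYZ';
  NegativeMarker = '1'; // hypothetical choice; '1' is not in Base30CharSet

function SignedToBase30(ANumber: Int64): string;
var
  Negative: Boolean;
  Digit: Integer;
begin
  Negative := ANumber < 0;
  Result := '';
  repeat
    // mod takes the sign of the dividend, so Digit is in -29..29.
    Digit := ANumber mod Base;
    if Digit < 0 then
      Digit := -Digit; // only the small digit is negated, never ANumber
    Result := Base30CharSet[Digit + 1] + Result;
    ANumber := ANumber div Base; // div truncates toward zero
  until ANumber = 0;
  if Negative then
    Result := NegativeMarker + Result;
end;

function Base30ToSigned(const AText: string): Int64;
var
  I, First, Digit: Integer;
  Negative: Boolean;
begin
  Negative := (AText <> '') and (AText[1] = NegativeMarker);
  if Negative then
    First := 2
  else
    First := 1;
  if First > Length(AText) then
    raise EConvertError.Create('No base30 digits found');
  Result := 0;
  for I := First to Length(AText) do
  begin
    Digit := Pos(AText[I], Base30CharSet) - 1;
    if Digit < 0 then
      raise EConvertError.Create('Invalid base30 character: ' + AText[I]);
    if Negative then
      Result := Result * Base - Digit // accumulate downward to reach Low(Int64)
    else
      Result := Result * Base + Digit;
  end;
end;
```

With this sketch, SignedToBase30(-1) gives '13' and Low(Int64) survives a round trip. Input that is too long to fit in an Int64 is not detected here.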
A: 

The easy answer is to turn range checking off, even just for the method that you're calling abs in.

If you don't care about an extra char or two you could split the int64 into words or dwords and string those together. I would be more tempted to go to base32 and use bit shifts for speed and ease of use. Then your encoding becomes

Base32CharSet[(ANumber shr 5) mod 32]

and a similar pos() based approach for the decode.

moz
I think base32 may be the way to go. But if ANumber < 32, (ANumber shr 5) is 0 and out of range for the char set. Please could you provide more detailed code for the encode function?
avenmore
Base32CharSet[(ANumber shr 5) mod 32 + 1] (sorry, you're right, the string runs from index 1..32 so the +1 is necessary)
moz
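As a more detailed take on the base32 suggestion, here is a minimal sketch that treats the Int64 purely as a 64-bit pattern and always emits 13 characters, so no sign handling is needed. It is not the answerer's code: BitsToBase32, Base32ToBits and the 32-character alphabet are assumptions made for illustration, and it assumes a compiler that has the UInt64 type.

```pascal
uses
  SysUtils; // for EConvertError

const
  // Assumed alphabet: 32 characters without 0, 1, I and O.
  Base32CharSet: string = '23456789ABCDEFGHJKLMNPQRSTUVWXYZ';
  Base32Digits = 13; // 13 * 5 = 65 bits, enough to cover all 64

function BitsToBase32(ANumber: Int64): string;
var
  Bits: UInt64;
  I: Integer;
begin
  Bits := UInt64(ANumber); // reinterpret the bit pattern, sign included
  SetLength(Result, Base32Digits);
  for I := Base32Digits downto 1 do
  begin
    // The low 5 bits select the character; +1 because strings are 1-based.
    Result[I] := Base32CharSet[Integer(Bits and $1F) + 1];
    Bits := Bits shr 5;
  end;
end;

function Base32ToBits(const AText: string): Int64;
var
  Bits: UInt64;
  I, Digit: Integer;
begin
  if Length(AText) <> Base32Digits then
    raise EConvertError.Create('Expected 13 base32 characters');
  Bits := 0;
  for I := 1 to Base32Digits do
  begin
    Digit := Pos(AText[I], Base32CharSet) - 1;
    if Digit < 0 then
      raise EConvertError.Create('Invalid base32 character: ' + AText[I]);
    // The leading character carries only the top 4 bits of the value.
    if (I = 1) and (Digit > 15) then
      raise EConvertError.Create('Value does not fit in 64 bits');
    Bits := (Bits shl 5) or UInt64(Digit);
  end;
  Result := Int64(Bits);
end;
```

Under these assumptions, BitsToBase32(-1) yields 'HZZZZZZZZZZZZ' and BitsToBase32(0) yields '2222222222222'. The fixed width costs a few characters for small values but keeps every code the same length.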
A: 

The way to deal with it is to use a character that indicates a negative number, so that you can decode it back. For a negative number, invert the bits (which clears the sign bit) before encoding; when decoding, invert them back and set the sign bit again. Working code is below:

function InvertIntOff(const ANumberL, ANumberH: Integer): Int64;
asm
  XOR EAX,$FFFFFFFF
  XOR EDX,$FFFFFFFF
end;


function InvertIntOn(const ANumberL, ANumberH: Integer): Int64;
asm
  XOR EAX,$FFFFFFFF
  XOR EDX,$FFFFFFFF
  OR  EDX,$80000000
end;

function ConvertIntToBase(ANumber: Int64): string;
const
  CBaseMap: array[0..31] of Char = (
    '2','3','4','5','6','7','8','9', //0-7
    'A','B','C','D','E','F','G','H', //8-15
    'J','K','L','M','N', //16-20
    'P','Q','R','S','T','U','V','W','X','Y','Z'); //21-31 (W before X, matching the decoder)
var
  I: Integer;
begin
  SetLength(Result, 15);
  I := 0;

  if ANumber < 0 then
  begin
    Inc(I);
    Result[I] := '1';
    ANumber := InvertIntOff(ANumber and $FFFFFFFF, (ANumber and $FFFFFFFF00000000) shr 32);
  end;

  while ANumber <> 0 do
  begin
    Inc(I);
    Result[I] := CBaseMap[ANumber and $1F];
    ANumber := ANumber shr 5;
  end;

  SetLength(Result, I);
end;

function ConvertBaseToInt(const ABase: string): Int64;
var
  I, Index: Integer;
  N: Int64;
begin
  Result := 0;
  if Length(ABase) > 0 then
  begin
    if ABase[1] = '1' then
      Index := 2
    else
      Index := 1;
    for I := Index to Length(ABase) do
    begin
      case ABase[I] of
        '2'..'9':
          N := Ord(ABase[I]) - Ord('2');
        'A'..'H':
          N := Ord(ABase[I]) - Ord('A') + 8;
        'J'..'N':
          N := Ord(ABase[I]) - Ord('J') + 16;
        'P'..'Z':
          N := Ord(ABase[I]) - Ord('P') + 21;
        else
          raise Exception.Create('error');
      end;
      if I > Index then
        Result := Result or (N shl ((I - Index) * 5))
      else
        Result := N;
    end;

    if ABase[1] = '1' then
      Result := InvertIntOn(Result and $FFFFFFFF, (Result and $FFFFFFFF00000000) shr 32);
  end;
end;

procedure TestBase32;
var
  S: string;
begin
  S := ConvertIntToBase(-1);
  ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? -1');

  S := ConvertIntToBase(-31);
  ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? -31');

  S := ConvertIntToBase(1);
  ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? 1');

  S := ConvertIntToBase(123456789);
  ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? 123456789');

  S := ConvertIntToBase(-123456789);
  ShowMessage(S + ' / ' + IntToStr(ConvertBaseToInt(S)) + ' ? -123456789');
end;
APZ28
It will take me some time to understand it completely, how ignoring range checking is used as an advantage, but it works perfectly. Thank you very much.
avenmore
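On why the inversion avoids the abs() overflow: for a negative Int64, inverting all bits gives -ANumber - 1, which is never negative and never larger than High(Int64). The asm helpers above assume the 32-bit register calling convention (parameters in EAX and EDX, Int64 result in EDX:EAX). A plain Pascal sketch that should have the same effect looks like this; FoldNegative and UnfoldNegative are hypothetical names, not part of the answer.

```pascal
// Hypothetical pure Pascal stand-ins for InvertIntOff / InvertIntOn.
function FoldNegative(ANumber: Int64): Int64;
begin
  // For ANumber < 0, "not ANumber" equals -ANumber - 1:
  //   -1 maps to 0 and Low(Int64) maps to High(Int64), so nothing overflows.
  Result := not ANumber;
end;

function UnfoldNegative(AValue: Int64): Int64;
begin
  // Bitwise not is its own inverse, which also restores the sign bit.
  Result := not AValue;
end;
```

Because these work on the whole Int64 at once, there is no need to split the value into two Integer halves, which is the step that can trip range checking.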