views:

1252

answers:

5

I have a long "binary string" like the output of PHP's pack function.

How can I convert this value to base62 (0-9a-zA-Z)? The built-in math functions overflow with such long inputs, and BCMath doesn't have a base_convert function, or anything that specific. I would also need a matching "unpack base62" function to reverse the conversion.

+1  A: 

Look at the comments on the PHP base_convert manual page; they mention some bugfixes and alternative functions for big numbers.

schnaader
+2  A: 

Unless you really, really have to have base62, why not go for:

base64_encode()
base64_decode()

Compared with base62, the only added characters are "+", "/" and the "=" padding, and it's a very well-known method to pack and unpack binary strings, with functions available in many other languages.
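As a quick illustration of that suggestion, here is a minimal round trip of a packed value through base64. The sample value and the variable names are only assumptions for the example.

```php
<?php
// Round-trip a packed binary string through base64 (sample value is arbitrary).
$binary  = pack('N', 1234);               // 4 bytes, big-endian: 00 00 04 D2
$encoded = base64_encode($binary);        // "AAAE0g=="
$decoded = base64_decode($encoded, true); // strict mode: returns false on invalid input

var_dump($decoded === $binary);           // bool(true)

// The output only ever contains A-Z, a-z, 0-9, "+", "/" and trailing "=" padding.
var_dump(preg_match('/^[A-Za-z0-9+\/]*={0,2}$/', $encoded) === 1); // bool(true)
```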

ruquay
+3  A: 

http://www.pgregg.com/projects/php/base_conversion/base_conversion.php

includes source code for converting from any base to any other base (including base 62) and also copes with arbitrary length and even fractional numbers.
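For reference, here is a rough sketch of how arbitrary-length conversion between base 256 (raw bytes) and base62 can be done in plain PHP with repeated long division. It is not the code from the link above. The function names, the 0-9a-zA-Z digit order, and the convention of writing each leading 0x00 byte as a "0" character are all assumptions made for this example.

```php
<?php
// Hypothetical helpers; alphabet order follows the question (0-9a-zA-Z).
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Convert a digit list (most significant first) between bases by long division.
// Intermediate values stay below $fromBase * $toBase, so nothing overflows.
function convert_digits(array $digits, int $fromBase, int $toBase): array
{
    $result = [];
    while (count($digits) > 0) {
        $quotient = [];
        $remainder = 0;
        foreach ($digits as $digit) {
            $acc = $remainder * $fromBase + $digit;
            $q = intdiv($acc, $toBase);
            $remainder = $acc % $toBase;
            if ($q !== 0 || count($quotient) > 0) { // drop leading zeros
                $quotient[] = $q;
            }
        }
        array_unshift($result, $remainder); // remainders come out least significant first
        $digits = $quotient;
    }
    return $result;
}

function base62_encode(string $binary): string
{
    // Leading 0x00 bytes have no numeric value; keep one '0' per byte (assumed convention).
    $leadingZeros = strspn($binary, "\0");
    $rest = (string) substr($binary, $leadingZeros);
    $bytes = $rest === '' ? [] : array_values(unpack('C*', $rest));
    $out = str_repeat('0', $leadingZeros);
    foreach (convert_digits($bytes, 256, 62) as $d) {
        $out .= BASE62_ALPHABET[$d];
    }
    return $out;
}

function base62_decode(string $text): string
{
    $leadingZeros = strspn($text, '0');
    $rest = (string) substr($text, $leadingZeros);
    $digits = [];
    for ($i = 0, $n = strlen($rest); $i < $n; $i++) {
        $value = strpos(BASE62_ALPHABET, $rest[$i]);
        if ($value === false) {
            throw new InvalidArgumentException('Invalid base62 character');
        }
        $digits[] = $value;
    }
    $bytes = convert_digits($digits, 62, 256);
    return str_repeat("\0", $leadingZeros) . ($bytes ? pack('C*', ...$bytes) : '');
}

var_dump(base62_encode(pack('N', 1234)));                     // string(4) "00jU"
var_dump(base62_decode('00jU') === pack('N', 1234));          // bool(true)
```

The long division is quadratic in the input length, which should be fine for keys and hashes but slow for very large blobs.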

http://www.technischedaten.de/pmwiki2/pmwiki.php?n=Php.BaseConvert
Alix Axel
+2  A: 

I wrote about using the BCMath functions for decimal/binary conversion here: http://www.exploringbinary.com/base-conversion-in-php-using-bcmath/ . You could easily modify it to convert to different bases.

Rick Regan
+5  A: 

I think there is a misunderstanding behind this question. Base conversion and encoding/decoding are different things. The output of base64_encode(...) is not one large base64 number. It's a series of discrete base64 values, one for each 6-bit chunk of the input.

base64_encode(1234) = "MTIzNA=="
base64_convert(1234) = "TS" //if the base64_convert function existed

base64 encoding breaks the input up into groups of 3 bytes (24 bits), then converts each sub-segment of 6 bits (2^6 = 64, which is the destination base) to the corresponding base64 character (values are "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", where A = 0, / = 63).

In our example, "1234" becomes MTIzNA==, because (in ASCII) "1234" is 00110001 00110010 00110011 00110100 in binary. This gets broken into 001100 (M) 010011 (T) 001000 (I) 110011 (z) 001101 (N) 00. Since the last chunk isn't complete, it gets padded with 0's and the value is 000000 (A). Because everything is done in groups of 3 input characters, there are 2 groups: "123" and "4". The last group only produces 2 output characters ("NA"), so it is padded with ='s to fill its 4-character output block, and the whole output becomes "MTIzNA==".
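The grouping described above can be reproduced by hand. The following is only an illustrative sketch (the function name manual_base64 is made up for this example); in real code you would simply call base64_encode().

```php
<?php
// Hand-rolled base64 that mirrors the 3-byte / 6-bit grouping step by step.
function manual_base64(string $input): string // hypothetical name
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    if ($input === '') {
        return '';
    }
    $out = '';
    foreach (str_split($input, 3) as $group) {
        // Write the group as a bit string, 8 bits per byte.
        $bits = '';
        foreach (str_split($group) as $char) {
            $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
        }
        // Cut into 6-bit chunks; the last chunk is zero-padded on the right.
        $encoded = '';
        foreach (str_split($bits, 6) as $chunk) {
            $encoded .= $alphabet[bindec(str_pad($chunk, 6, '0'))];
        }
        // Fill the 4-character output block with '='.
        $out .= str_pad($encoded, 4, '=');
    }
    return $out;
}

var_dump(manual_base64('1234'));                           // string(8) "MTIzNA=="
var_dump(manual_base64('1234') === base64_encode('1234')); // bool(true)
```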

Converting to base64, on the other hand, takes a single integer value and expresses it as a single base64 number. For our example, 1234 (decimal) is "TS" (base64), if we use the same string of base64 values as above. Working backward, and left-to-right: T = 19 (column 1), S = 18 (column 0), so (19 * 64^1) + (18 * 64^0) = 19 * 64 + 18 = 1234 (decimal).
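To make the difference concrete, here is a small sketch of what the imaginary base64_convert function could look like for ordinary PHP integers. It does not exist in PHP; the name is borrowed from the example above, and this version is limited to non-negative values that fit in a native int.

```php
<?php
// Hypothetical base64_convert(): treats its argument as ONE integer and
// writes it in positional base 64, using the same digit string as base64.
function base64_convert(int $number): string
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    if ($number < 0) {
        throw new InvalidArgumentException('Only non-negative integers are handled');
    }
    $digits = '';
    do {
        $digits = $alphabet[$number % 64] . $digits; // least significant digit first
        $number = intdiv($number, 64);
    } while ($number > 0);
    return $digits;
}

echo base64_convert(1234), "\n";  // TS        (conversion of the number 1234)
echo base64_encode('1234'), "\n"; // MTIzNA==  (encoding of the 4 bytes "1234")
```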

Jay Dansand