While looking into URL-safe base64 encoding, I've found that it is far from standardized. Despite PHP's copious number of built-in functions, there isn't one for URL-safe base64 encoding. On the manual page for base64_encode(), most of the comments suggest using that function, wrapped with strtr():

function base64_url_encode($input)
{
     return strtr(base64_encode($input), '+/=', '-_,');
}

The only Perl module I could find in this area is MIME::Base64::URLSafe (source), which performs the following replacement internally:

sub encode ($) {
    my $data = encode_base64($_[0], '');
    $data =~ tr|+/=|\-_|d;
    return $data;
}

Unlike the PHP function above, this Perl version drops the '=' (equals) character entirely, rather than replacing it with ',' (comma). Equals is only a padding character, so the Perl module restores it as needed on decode, but this difference makes the two implementations incompatible.
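To make the incompatibility concrete, here is a minimal Python sketch that mimics both conventions on the same input. The helper names php_style and perl_style are made up for illustration; they are not part of either library.

```python
import base64

def php_style(data: bytes) -> str:
    # Mimics strtr(base64_encode($input), '+/=', '-_,'): padding becomes ','
    std = base64.b64encode(data).decode("ascii")
    return std.translate(str.maketrans("+/=", "-_,"))

def perl_style(data: bytes) -> str:
    # Mimics tr|+/=|\-_|d: '+' and '/' are mapped, '=' is deleted
    std = base64.b64encode(data).decode("ascii")
    return std.translate(str.maketrans("+/", "-_", "="))

sample = b"\xfb\xff\xfe!"      # standard base64 is "+//+IQ=="
print(php_style(sample))       # -__-IQ,,
print(perl_style(sample))      # -__-IQ
```

A decoder written for one form will not accept the other without extra handling of the trailing characters.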

Finally, the Python function urlsafe_b64encode(s) keeps the '=' padding, which prompted someone to post this function; it shows up prominently in Google results for 'python base64 url safe':

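from base64 import urlsafe_b64encode, urlsafe_b64decode

def uri_b64encode(s):
    # s is bytes; drop the trailing '=' padding and return text
    return urlsafe_b64encode(s).decode('ascii').rstrip('=')

def uri_b64decode(s):
    # s is text; restore padding to a multiple of 4 before decoding
    return urlsafe_b64decode(s + '=' * (-len(s) % 4))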

Since there isn't a defined standard, what is the right way?

A: 

Why don't you try wrapping it in a urlencode()? Documentation here.

Fragsworth
That uses an unnecessary number of characters. Why not just urlencode the binary string in the first place?
recursive
+2  A: 

I'd suggest running the output of base64_encode through urlencode. For example:

function base64_encode_url( $str )
{
    return urlencode( base64_encode( $str ) );
}
Jon Benedicto
A: 

If you're asking about the correct way, I'd go with proper URL-encoding as opposed to arbitrary replacement of characters. First base64-encode your data, then further encode special characters like "=" with proper URL-encoding (i.e. %<code>).
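As a rough illustration of this approach, and of what it costs in length, here is a small Python sketch. The function names are hypothetical; quote() and unquote() are the standard library's percent-encoding helpers.

```python
import base64
from urllib.parse import quote, unquote

def b64_then_percent_encode(data: bytes) -> str:
    # Standard base64 first, then percent-encode every reserved character
    std = base64.b64encode(data).decode("ascii")
    return quote(std, safe="")

def percent_decode_then_b64(token: str) -> bytes:
    # Reverse the two steps in the opposite order
    return base64.b64decode(unquote(token))

sample = b"\xfb\xff\xfe!"                   # standard base64 is "+//+IQ=="
token = b64_then_percent_encode(sample)
print(token)                                # %2B%2F%2F%2BIQ%3D%3D
print(len(token))                           # 20, versus 8 for plain base64
```

Each '+', '/' or '=' grows from one character to three, so the overhead depends on how often those characters appear in the encoded data.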

Ates Goral
I'm down with using the already available functions, but using urlencode() can add a lot of extra length.
Drew Stephens
+6  A: 

There does appear to be a standard: RFC 3548, Section 4, Base 64 Encoding with URL and Filename Safe Alphabet:

This encoding is technically identical to the previous one, except for the 62:nd and 63:rd alphabet character, as indicated in table 2.

+ and / should be replaced by - (minus) and _ (understrike) respectively. Any incompatible libraries should be wrapped so they conform to RFC 3548.

Note that this requires that you URL encode the (pad) = characters, but I prefer that over URL encoding the + and / characters from the standard base64 alphabet.
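A minimal Python sketch of what I mean, assuming the standard library's urlsafe_b64encode (which uses the - and _ alphabet and keeps the padding). The two function names are my own, purely for illustration.

```python
import base64
from urllib.parse import quote, unquote

def rfc3548_url_token(data: bytes) -> str:
    # URL-safe alphabet: '+' -> '-', '/' -> '_', '=' padding kept
    token = base64.urlsafe_b64encode(data).decode("ascii")
    # Only the '=' padding needs percent-encoding; '-' and '_' are left alone
    return quote(token, safe="")

def rfc3548_url_token_decode(token: str) -> bytes:
    return base64.urlsafe_b64decode(unquote(token))

sample = b"\xfb\xff\xfe!"
print(rfc3548_url_token(sample))    # -__-IQ%3D%3D
```

At most two padding characters ever need escaping, so the overhead is capped at four extra characters per value.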

Grant Wagner
+4  A: 

I don't think there is a right or wrong answer, but the most popular encoding is

'+/=' => '-_.'

This is widely used by Google and Yahoo (which calls it Y64). Most of the URL-safe encoders I have used in Java and Ruby support this character set.
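For reference, a round trip with that mapping could look like the following Python sketch. The helper names are hypothetical, and the translation tables simply apply the '+/=' => '-_.' substitution shown above in each direction.

```python
import base64

# Translation tables for the '+/=' <=> '-_.' substitution
_TO_URL = str.maketrans("+/=", "-_.")
_FROM_URL = str.maketrans("-_.", "+/=")

def dot_padded_encode(data: bytes) -> str:
    # Standard base64, then swap in the URL-safe characters
    return base64.b64encode(data).decode("ascii").translate(_TO_URL)

def dot_padded_decode(token: str) -> bytes:
    # Swap the standard characters back in, then decode as usual
    return base64.b64decode(token.translate(_FROM_URL))

sample = b"\xfb\xff\xfe!"
token = dot_padded_encode(sample)
print(token)                        # -__-IQ..
assert dot_padded_decode(token) == sample
```

Because the padding is replaced rather than dropped, the token keeps its length and the decoder never has to guess how much padding to restore.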

ZZ Coder