Hi all,

I'm writing a script that lets me send a load of data in a single GET request. I'm using base64 to encode it, but it's pretty damn long and I'm concerned the URL may get too big.

Does anyone know an alternative, shorter method of doing this? It needs to be decodable when received in a GET request, so one-way hashes such as md5/sha1 are not possible.

Thanks for your time.


Edit: Sorry, I should have explained better. On our site we display screenshots of websites that get posted for review. We have our own thumbnail/screenshot server. The image tag is going to contain an encoded string that stores the URL to take a screenshot of, plus the width and height of the image to show. However, I don't want it in raw text for the world to see. Obviously base64 can be decoded by anyone, but we don't want your average Joe picking up the URL path. Really, I need to pass url, width, and height in a single GET request.

A: 

Use an HTTP POST instead of a GET. Or try gzipping the data before base64-encoding it.

Matt Ball
Not really possible (I think, anyway). It's being used to get an image and its dimensions (which are in a serialised array inside the encoded string). It has to work with a standard HTML <img> tag.
RickM
+2  A: 

URLs are not meant for sending long strings of data, encoded or not. Once you are pushing that much data through the URL, you should switch to POST or some form of local storage. FYI, IE has a URL limit of 2,083 characters.


EDIT: I don't understand one thing. Why aren't you caching the screenshots? It seems awfully resource-intensive to take a new screenshot every time somebody views a page with an IMG link to that URL.

Maybe your audience is small and resources are not an issue. However, if it is in fact a public website, that will not scale very well. I know I'm going beyond what your original question asked, but this will solve your question and more.

As soon as the website is posted, store the URL in some sort of local storage, preferably in SQL. I am going to continue this example as if you chose SQL, but of course your implementation is your choice. I would have a primary key, a url field, a last_updated timestamp, and optionally an image thumbnail path.

By utilizing local storage, you can now pull the image off a cached copy stored locally on the server every time the page with the thumbnail is requested. A significant amount of resources is saved, and since chances are that those websites aren't going to be updated very often, you can have a cron job or a script that runs every x amount of time to refresh the screenshots in the entire database. Now, all you have to do is directly link (again this depends on your implementation) to the image and none of this huge url string stuff will happen.
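
As a rough sketch of the caching idea above, assuming a SQLite database accessed through PDO: the `screenshots` table, its column names, and all function names here are hypothetical, chosen only to mirror the fields suggested in this answer.

```php
<?php
// Hypothetical schema: one row per site submitted for review.
function createSchema(PDO $db): void {
    $db->exec('CREATE TABLE IF NOT EXISTS screenshots (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        url TEXT NOT NULL UNIQUE,
        thumbnail_path TEXT,
        last_updated INTEGER NOT NULL DEFAULT 0
    )');
}

// Record a URL once, when the site is posted for review.
function addSite(PDO $db, string $url): int {
    $stmt = $db->prepare('INSERT INTO screenshots (url) VALUES (?)');
    $stmt->execute([$url]);
    return (int) $db->lastInsertId();
}

// Rows whose screenshot is older than $maxAge seconds (what a cron job would refresh).
function staleSites(PDO $db, int $maxAge, int $now): array {
    $stmt = $db->prepare('SELECT id, url FROM screenshots WHERE last_updated < ?');
    $stmt->execute([$now - $maxAge]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Call this after the screenshot server has written a fresh thumbnail.
function markRefreshed(PDO $db, int $id, string $path, int $now): void {
    $stmt = $db->prepare(
        'UPDATE screenshots SET thumbnail_path = ?, last_updated = ? WHERE id = ?'
    );
    $stmt->execute([$path, $now, $id]);
}
```

The page can then link straight to `thumbnail_path` (or to a script taking only the row `id`), so nothing long or sensitive has to travel in the URL.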

OR, just take the easy way and do it client side with http://www.snap.com/

theAlexPoon
A: 

convert_uuencode generates slightly shorter strings than base64.

mario
On the contrary -- a uuencoded file will be *larger* than the equivalent base64, due to the additional length character at the start of every line. Uuencode is also less suitable for use in URLs etc, since it uses more punctuation that would need escaping. The algorithm is otherwise basically the same.
Porculus
A: 

Just don't base64_encode($whole_file). Send the content in chunks and encode the chunks. Also, if you must know how much bigger a chunk can get after a call to base64_encode(), it will more than double in size (but less than 2.1*strlen($chunk)).

Tom
The ratio is rather 4/3; the exact output length is 4·ceil(n/3) characters if you include the trailing padding.
Gumbo
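
The 4/3 figure from the comment above can be checked with a few lines of PHP. This is only an illustrative sketch; `base64Length` is a hypothetical helper name, not a built-in function.

```php
<?php
// Length of base64_encode() output for $n input bytes, padding included:
// 4 * ceil(n / 3), written here with integer arithmetic.
function base64Length(int $n): int {
    return 4 * intdiv($n + 2, 3);
}

// Compare the formula with what base64_encode() actually produces.
foreach ([1, 2, 3, 100, 1000] as $n) {
    $actual = strlen(base64_encode(str_repeat('x', $n)));
    printf("%d bytes -> %d chars (formula: %d)\n", $n, $actual, base64Length($n));
}
```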
+1  A: 

You can still use POST for what you describe, assuming I understood you correctly (I may not have).

I'm guessing you're doing something like this:

<a href="scripturl?w=100&h=100&url=really-long-secret-base64">
  <img src="imgurl">
</a>

instead do something like this:

<form method="POST" action="scripturl">
  <input type="hidden" name="width" value="100">
  <input type="hidden" name="height" value="100">
  <input type="hidden" name="url" value="secret-url-string-here">
  <input type="image" src="imgurl" name="submit">
</form>
jay.lee
A: 

Is the script that generates the URLs running on a different server from the script that interprets them? If they're on the same server, the obvious approach would be to store the target URL, width, and height in a database, and simply pass a randomly-generated record identifier in the query string.
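
A minimal sketch of that approach, assuming PDO with SQLite and PHP 7 or later for `random_bytes()`. The `thumb_requests` table and the function names are hypothetical, made up for illustration.

```php
<?php
// Hypothetical table mapping an opaque token to the real parameters.
function createThumbTable(PDO $db): void {
    $db->exec('CREATE TABLE IF NOT EXISTS thumb_requests (
        token TEXT PRIMARY KEY,
        url TEXT NOT NULL,
        width INTEGER NOT NULL,
        height INTEGER NOT NULL
    )');
}

// Store the parameters and return a short random token for the <img> src.
function storeThumbRequest(PDO $db, string $url, int $width, int $height): string {
    $token = bin2hex(random_bytes(8)); // 16 hex characters
    $stmt = $db->prepare(
        'INSERT INTO thumb_requests (token, url, width, height) VALUES (?, ?, ?, ?)'
    );
    $stmt->execute([$token, $url, $width, $height]);
    return $token;
}

// Look the parameters up again; returns null for an unknown token.
function loadThumbRequest(PDO $db, string $token): ?array {
    $stmt = $db->prepare('SELECT url, width, height FROM thumb_requests WHERE token = ?');
    $stmt->execute([$token]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}
```

The image tag would then carry only something like `?id=<token>`, which stays the same length no matter how long the target URL is and reveals nothing about it.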

Porculus
+2  A: 

Since you are only using base64 to obfuscate the string, you could just obfuscate it with something else, like rot13 (or your own simple letter substitution function). So, urlencode(str_rot13($str)) to encode and str_rot13(urldecode($str)) to decode.

Or, to just have a shorter base64-encoded string, you could compress the string before base64 encoding it: base64_encode(gzencode($str, 9)) and gzdecode(base64_decode($str)) to decode.
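
A sketch of the compress-then-encode idea, with two substitutions that are not from the answer: `gzdeflate()` instead of `gzencode()` (it omits the gzip header and trailer, which matters for short strings) and a URL-safe base64 alphabet so the result needs no further escaping. The function names and the JSON layout are hypothetical.

```php
<?php
// URL-safe base64: swap '+' and '/' for '-' and '_', and drop the '=' padding.
function base64UrlEncode(string $bin): string {
    return rtrim(strtr(base64_encode($bin), '+/', '-_'), '=');
}

// Reverse of base64UrlEncode(); returns false on malformed input.
function base64UrlDecode(string $text) {
    $padded = str_pad($text, strlen($text) + (4 - strlen($text) % 4) % 4, '=');
    return base64_decode(strtr($padded, '-_', '+/'), true);
}

// Pack url/width/height into one compressed, URL-safe string.
function packParams(string $url, int $width, int $height): string {
    $json = json_encode(['u' => $url, 'w' => $width, 'h' => $height], JSON_UNESCAPED_SLASHES);
    return base64UrlEncode(gzdeflate($json, 9));
}

// Unpack again; returns null if the string is not valid.
function unpackParams(string $packed): ?array {
    $bin = base64UrlDecode($packed);
    if ($bin === false) {
        return null;
    }
    $json = @gzinflate($bin);
    if ($json === false) {
        return null;
    }
    $data = json_decode($json, true);
    return is_array($data) ? $data : null;
}
```

Compression only pays off once the input is reasonably long or repetitive; for a single short URL the packed string may come out no shorter than plain base64, so it is worth measuring with real data.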

Or, if this is primarily a security issue (you don't mind people seeing the URL, you just want to keep people from hacking it) you could pass these parameters with normal querystring variables, but with a hash appended to prevent tampering. i.e.:

function getHash($url, $width, $height) {
  $secret = 'abcdefghijklmnopqrstuvwxyz whatever you want etc.';
  return sha1($url . $width . $height . $secret);
}

// So use this hash to construct your URL querystring:
$hash = getHash($url, $width, $height);
$urlQuerystring = '?url='.urlencode($url).'&width='.(int) $width.
                  '&height='.(int) $height.'&hash='.$hash;

// Then in your code that processes the URL, check the hash first
// (the values now come from the query string)
if ($_GET['hash'] !== getHash($_GET['url'], (int) $_GET['width'], (int) $_GET['height'])) {
  // URL is invalid
  exit;
}
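
A variant of the same tamper check, sketched under a few assumptions that are not in the answer: it swaps the plain `sha1()` for `hash_hmac()` with SHA-256, separates the fields so that different width/height pairs cannot concatenate to the same string, and compares with `hash_equals()` (PHP 5.6 or later). `THUMB_SECRET` and the function names are hypothetical.

```php
<?php
// Placeholder secret; in practice load it from configuration, not source code.
const THUMB_SECRET = 'replace-with-a-long-random-secret';

// Sign the parameters. The '|' separators keep (1, 23) and (12, 3) distinct.
function signParams(string $url, int $width, int $height): string {
    return hash_hmac('sha256', $width . '|' . $height . '|' . $url, THUMB_SECRET);
}

// Build the query string for the <img> src, including the signature.
function buildQuery(string $url, int $width, int $height): string {
    return '?' . http_build_query([
        'url'    => $url,
        'width'  => $width,
        'height' => $height,
        'hash'   => signParams($url, $width, $height),
    ]);
}

// $query is the decoded parameter array, e.g. $_GET.
function isValidRequest(array $query): bool {
    foreach (['url', 'width', 'height', 'hash'] as $key) {
        if (!isset($query[$key]) || !is_scalar($query[$key])) {
            return false;
        }
    }
    $expected = signParams(
        (string) $query['url'],
        (int) $query['width'],
        (int) $query['height']
    );
    // Constant-time comparison of the expected and supplied signatures.
    return hash_equals($expected, (string) $query['hash']);
}
```

In the receiving script this would be used as `if (!isValidRequest($_GET)) { /* reject */ }` before any screenshot work is done.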

(Off topic: People are saying you should use POST instead of GET. If all these URLs are doing is fetching screenshots from your database to display (i.e. a search lookup), then GET is fine and correct. But if calling these URLs is actually performing an action like going to another site, making and storing the screenshot, then that's a POST. As their names suggest, GET is for retrieval; POST is for submitting data. If you were to use GET on an expensive operation like making the screenshot, you could end up DOSing your own site when Google etc. index these URLs.)

joelhardi