I am not really interested in security or anything of that nature, but I need some function(s) that let me "compress"/"decompress" a string. I have tried Base64, but it has a big issue with size: it makes the string longer. I also know about Huffman coding, but that doesn't work either, because its output is also longer as a string (it is smaller in terms of memory, since the result is an integer).

In other words, I want some arbitrary string 'djshdjkash' to be encoded to some other string 'dhaldhnctu'. I want to be able to go from one to the other, and the new string's length should be equal to or less than the original's.

Is this possible in JavaScript, and has it already been done?

  • Needed to clarify: as I said, security is not the objective. I just want to disguise the string and keep its length (or shorten it). Base64 is the best example, but it makes strings longer. ROT13 is neat, but doesn't cover all ASCII characters, only letters.
+1  A: 

ROT13?

http://en.wikipedia.org/wiki/ROT13

Jens Björnhager
Voted you up, but sounds like he wants ROT47: http://en.wikipedia.org/wiki/ROT13#Variants
mrclay
In that case you could do the classic: add 1 to every byte. You could also rotate the entire string by half a byte.
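For reference, the ROT47 variant mentioned above can be sketched in JavaScript roughly as follows. The function name rot47 is just a label chosen here; the sketch assumes only printable ASCII (codes 33 to 126) needs disguising and leaves everything else, including spaces, untouched.

```javascript
// ROT47: rotate each printable ASCII character (codes 33..126) by 47 places.
// The range holds 94 characters, so applying the function twice restores the input.
function rot47(s) {
  return s.replace(/[!-~]/g, function (c) {
    return String.fromCharCode(33 + ((c.charCodeAt(0) - 33 + 47) % 94));
  });
}

const disguised = rot47("djshdjkash");
console.log(disguised.length);  // same length as the input (10)
console.log(rot47(disguised));  // prints "djshdjkash"
```

Because the same function both encodes and decodes, there is no separate decoder to maintain, but it is a disguise only and offers no security.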
Jens Björnhager
+2  A: 

You need compression, not encoding. Encoding generally adds bits. Google "String Compression Algorithms."

Stefan Kendall
Also note, if your input is short (e.g. `djshdjkash`), most general-purpose compression algorithms yield larger outputs. Only when you pass a threshold in length do you start seeing compression wins.
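A quick way to see this effect, assuming a Node.js environment and its built-in zlib module (DEFLATE is only one example of a general-purpose algorithm, and its output is bytes, not a string). The helper name deflatedSize is made up for this sketch.

```javascript
const zlib = require("zlib");

// Size in bytes of the zlib (DEFLATE) output for a string.
function deflatedSize(s) {
  return zlib.deflateSync(Buffer.from(s, "utf8")).length;
}

const shortInput = "djshdjkash";             // 10 bytes
const longInput = "djshdjkash".repeat(100);  // 1000 bytes, highly repetitive

// The short input grows: header, checksum and block overhead outweigh any savings.
console.log(shortInput.length, deflatedSize(shortInput));
// The long, repetitive input shrinks considerably.
console.log(longInput.length, deflatedSize(longInput));
```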
mrclay
I should have mentioned this. The example string won't compress well with most general algorithms you'd find on the internet.
Stefan Kendall
A: 

You can use a simple substitution cipher. Here's an example in JavaScript.

Note that there are tools out there to break substitution ciphers. Make sure security isn't an issue here before going down this path.

Eric J.
A: 

Since ROT13 is out because it only affects alphas, why not just implement something across a larger character set? Set up a from array of characters containing your entire printable character set and a to array containing the same characters in a different order.

Then for every character in your string, if it's in the from array, replace it with the equivalent position in the to array.

This yields no compression at all but will satisfy all your requirements (shorter or same length, disguised string).

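In JavaScript, something like:

// Extend both strings to cover your full printable set. They must be the
// same length and hold the same characters in a different order.
var chFrom = "ABCDEF...";
var chTo   = "1$#zX^...";
// Replace each character found in `from` with the one at the same index in `to`.
function translate(s, from, to) {
    var out = "";
    for (var i = 0; i < s.length; i++) {
        var ch = s.charAt(i);
        var idx = from.indexOf(ch);
        // Characters outside the table pass through unchanged.
        out += (idx === -1) ? ch : to.charAt(idx);
    }
    return out;
}
function encode(s) { return translate(s, chFrom, chTo); }
// Decoding is the same lookup with the two tables swapped.
function decode(s) { return translate(s, chTo, chFrom); }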
paxdiablo
+1  A: 

I'm not sure what exactly you want to compress. If it is the length of the string (as seen by the string's length property), you could pack two ASCII characters into one Unicode character. So a string like hello, world (12 characters) might result in \u6865\u6c6c\u6f2c\u2077\u6f72\u6c64 (6 characters). You have to be very careful, though, that you don't generate invalid characters like \uFFFF and that you can always go back from the compressed string to the uncompressed one.
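A minimal sketch of that packing idea, under the assumption that the input is pure ASCII (codes 1 to 127) with no NUL characters; the names pack and unpack are made up for illustration. With that restriction the largest code unit produced is 0x7F7F, so neither surrogates nor \uFFFF can appear. An odd-length input is padded with a zero low byte, which unpack drops again.

```javascript
// Pack pairs of ASCII characters into single UTF-16 code units.
function pack(s) {
  let out = "";
  for (let i = 0; i < s.length; i += 2) {
    const hasLow = i + 1 < s.length;
    const hi = s.charCodeAt(i);
    const lo = hasLow ? s.charCodeAt(i + 1) : 0; // 0 pads an odd length
    if (hi < 1 || hi > 127 || lo > 127 || (hasLow && lo < 1)) {
      throw new RangeError("pack() assumes ASCII input without NUL characters");
    }
    out += String.fromCharCode((hi << 8) | lo);
  }
  return out;
}

// Split each code unit back into its high and low byte.
function unpack(p) {
  let out = "";
  for (let i = 0; i < p.length; i++) {
    const unit = p.charCodeAt(i);
    out += String.fromCharCode(unit >> 8);
    const lo = unit & 0xff;
    if (lo !== 0) out += String.fromCharCode(lo); // skip the padding byte
  }
  return out;
}

const packed = pack("hello, world");
console.log(packed.length);   // 6
console.log(unpack(packed));  // prints "hello, world"
```

Note that this halves only the value reported by length; each code unit still occupies two bytes in memory, so nothing is saved in storage terms.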

On the other hand, if you want to reduce the length of the string literal, this approach is completely wrong, because each \uXXXX escape takes six characters in the source. So please clarify under what circumstances you want to compress the strings.

Roland Illig