ansaurus

Question

Answer 1

+8 A:

If you've only got about 10,000 integers, then the easiest and most reliable way would probably be a mapping table between the integer and a randomly generated string. Either generate a bunch of random identifiers up front that correspond to each integer, or just fill them in on demand.

This way you can guarantee no collisions, and don't have to worry about encryption because there's nothing to decrypt as the strings are not derived from the integers themselves.

You could implement it in a database table or in memory (e.g. a two-way dictionary) depending on your needs.
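
As a rough illustration of the in-memory variant, here is a minimal sketch of a two-way dictionary that fills in random tokens on demand. The class name IdTokenMap, the 36-character alphabet and the 8-character token length are assumptions made for the example, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical helper: maps integer IDs to random tokens and back.
public sealed class IdTokenMap
{
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
    private const int TokenLength = 8; // assumed length for this sketch

    private readonly Dictionary<int, string> _tokenById = new Dictionary<int, string>();
    private readonly Dictionary<string, int> _idByToken = new Dictionary<string, int>();

    // Returns the token already assigned to id, or assigns a new one on demand.
    public string GetToken(int id)
    {
        if (_tokenById.TryGetValue(id, out string token))
        {
            return token;
        }
        do
        {
            token = NewRandomToken();
        }
        while (_idByToken.ContainsKey(token)); // retry so every token stays unique
        _tokenById[id] = token;
        _idByToken[token] = id;
        return token;
    }

    // Looks up the integer behind a token; returns false for unknown tokens.
    public bool TryGetId(string token, out int id)
    {
        return _idByToken.TryGetValue(token, out id);
    }

    private static string NewRandomToken()
    {
        var bytes = new byte[TokenLength];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        var chars = new char[TokenLength];
        for (int i = 0; i < TokenLength; i++)
        {
            // The modulo adds a slight bias (256 is not a multiple of 36),
            // which should not matter for plain obfuscation.
            chars[i] = Alphabet[bytes[i] % Alphabet.Length];
        }
        return new string(chars);
    }
}
```

Because the tokens are random, the map itself has to be persisted (for example in a database table) if the tokens must survive a restart.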

Greg Beech 2010-04-02 07:17:25

+1 Definitely the way to go.

Jon Skeet 2010-04-02 07:35:39

I ended up doing it the way John Leidegren laid out below, which is the same idea just a little more fleshed out. Thanks!

Chris 2010-04-02 19:39:14

Answer 2

A:

Just take an MD5 or SHA-1 hash of the integer's byte representation. Collisions are not strictly impossible, but for inputs this small they are so unlikely that you can ignore them in practice.
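
A minimal sketch of that idea, using the standard System.Security.Cryptography classes; HashDemo and HexDigest are hypothetical names chosen for the example. Note that the hex digest is 32 characters long for MD5 and 40 for SHA-1.

```csharp
using System;
using System.Security.Cryptography;

public static class HashDemo
{
    // Hashes the 4-byte representation of id and returns the digest as lowercase hex.
    public static string HexDigest(HashAlgorithm algorithm, int id)
    {
        byte[] digest = algorithm.ComputeHash(BitConverter.GetBytes(id));
        return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
    }
}
```

Usage would look like `using (var md5 = MD5.Create()) { string s = HashDemo.HexDigest(md5, 1234); }`.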

logicnp 2010-04-02 09:04:03

He wants the numbers to be at most 8 characters long, too short for both of these.

AndyC 2010-04-02 09:09:53

Also: crypto hashes used for non-cryptographic needs are overkill; they simply take too long to process.

Cosmin Prund 2010-04-02 09:21:46

@Cosmin: your comment might be a bit over-generic. Some cryptographic hashes are quite fast. On some architectures, it has been reported that some cryptographic hashes are actually faster than, e.g., simple CRC32.

Thomas Pornin 2010-04-02 14:54:03

This wouldn't be reversible, either, afaik.

Chris 2010-04-02 19:37:42

Answer 3

+1 A:

I derived an idea from Pearson hashing which will work for arbitrary inputs as well, not just 32-bit integers. I don't know if this is exactly the same as Greg's answer, as I couldn't quite work out what he meant. What I do know is that the memory requirements here are constant: no matter how big the input, this is still a reliable obfuscation/encryption trick.

For the record, this method is not hashing, and it does not have collisions. It's a perfectly sound method of obfuscating a byte string.

What you need for this to work is a secret key, _encryptionTable, which is a random permutation of the inclusive range 0..255. You use this to shuffle bytes around. To make it harder to reverse, each byte is also XORed with the previous output byte before the table lookup.

public byte[] Encrypt(byte[] plaintext)
{
    if (plaintext == null)
    {
        throw new ArgumentNullException("plaintext");
    }
    byte[] ciphertext = new byte[plaintext.Length];
    int c = 0;
    for (int i = 0; i < plaintext.Length; i++)
    {
        c = _encryptionTable[plaintext[i] ^ c];
        ciphertext[i] = (byte)c;
    }
    return ciphertext;
}

You can then use BitConverter to go between values and byte arrays, and convert to base 64 or base 32 to get a textual representation. Base 32 encoding can be URL-friendly if that's important. Decrypting is as simple as reversing the operation, using the inverse of _encryptionTable.
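
A rough sketch of the textual step, assuming a URL-safe Base64 variant instead of base 32; IdText and its two methods are hypothetical names. In the scheme described here, the byte array would pass through Encrypt before being encoded and through Decrypt after being decoded.

```csharp
using System;

public static class IdText
{
    // Base64 with the URL-unfriendly characters swapped out and the padding dropped.
    public static string ToUrlToken(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses ToUrlToken: restores the standard alphabet and the padding.
    public static byte[] FromUrlToken(string token)
    {
        string s = token.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }
}
```

For example, `IdText.ToUrlToken(BitConverter.GetBytes(12345))` gives a 6-character token, and `BitConverter.ToInt32(IdText.FromUrlToken(token), 0)` turns it back into the integer.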

    public byte[] Decrypt(byte[] ciphertext)
    {
        if (ciphertext == null)
        {
            throw new ArgumentNullException("ciphertext");
        }
        byte[] plaintext = new byte[ciphertext.Length];
        int c = 0;
        for (int i = 0; i < ciphertext.Length; i++)
        {
            plaintext[i] = (byte)(_decryptionTable[ciphertext[i]] ^ c);
            c = ciphertext[i];
        }
        return plaintext;
    }

You can also do other fun things if you're working on a 32-bit integer and only care about the numbers greater than or equal to 0 which makes it harder to guess an obfuscated number.

I also use a secret word to seed a pseudo-random number generator and use that to set up the initial permutation. That way I can recreate the tables simply by knowing which secret word I used to create everything.

var mt = new MersenneTwister(secretKey.ToUpperInvariant());
var mr = new byte[256];
for (int i = 0; i < 256; i++)
{
    mr[i] = (byte)i;
}
var encryptionTable = mt.NextPermutation(mr);
var decryptionTable = new byte[256];
for (int i = 0; i < 256; i++)
{
    decryptionTable[encryptionTable[i]] = (byte)i;
}
this._encryptionTable = encryptionTable;
this._decryptionTable = decryptionTable;

This is somewhat secure. The biggest flaw is that c starts at 0, which is the identity for XOR and doesn't change the value (a ^ 0 == a). The first encrypted byte is therefore just the table entry for the first plaintext byte. To work around this, pick an initial value for c that is not constant but derived from the secret key, by asking the PRNG (after seeding it) for a random byte. That makes the encryption considerably harder to crack, even with a large sample, as long as the attacker can't observe both input and output.
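
Putting the pieces together, here is a self-contained sketch with a key-derived starting value for c. It makes two assumptions that differ from the code above: System.Random (seeded with an integer) stands in for the MersenneTwister class, and the permutation is built with a Fisher-Yates shuffle. ByteObfuscator is a hypothetical name. System.Random is not a cryptographic generator, and its seeded sequence is not guaranteed to be identical across .NET versions, so persisted values may need the tables stored alongside them.

```csharp
using System;

// Hypothetical class; System.Random replaces the MersenneTwister used in the answer.
public sealed class ByteObfuscator
{
    private readonly byte[] _encryptionTable = new byte[256];
    private readonly byte[] _decryptionTable = new byte[256];
    private readonly int _initial; // key-derived starting value for c

    public ByteObfuscator(int seed)
    {
        var prng = new Random(seed);
        for (int i = 0; i < 256; i++)
        {
            _encryptionTable[i] = (byte)i;
        }
        // Fisher-Yates shuffle: turns 0..255 into a random permutation.
        for (int i = 255; i > 0; i--)
        {
            int j = prng.Next(i + 1);
            byte tmp = _encryptionTable[i];
            _encryptionTable[i] = _encryptionTable[j];
            _encryptionTable[j] = tmp;
        }
        // The decryption table is the inverse permutation.
        for (int i = 0; i < 256; i++)
        {
            _decryptionTable[_encryptionTable[i]] = (byte)i;
        }
        _initial = prng.Next(256);
    }

    public byte[] Encrypt(byte[] plaintext)
    {
        if (plaintext == null) throw new ArgumentNullException(nameof(plaintext));
        var ciphertext = new byte[plaintext.Length];
        int c = _initial;
        for (int i = 0; i < plaintext.Length; i++)
        {
            c = _encryptionTable[plaintext[i] ^ c];
            ciphertext[i] = (byte)c;
        }
        return ciphertext;
    }

    public byte[] Decrypt(byte[] ciphertext)
    {
        if (ciphertext == null) throw new ArgumentNullException(nameof(ciphertext));
        var plaintext = new byte[ciphertext.Length];
        int c = _initial;
        for (int i = 0; i < ciphertext.Length; i++)
        {
            plaintext[i] = (byte)(_decryptionTable[ciphertext[i]] ^ c);
            c = ciphertext[i];
        }
        return plaintext;
    }
}
```

With this, `new ByteObfuscator(seed).Decrypt(...)` recovers the input of `Encrypt(...)` as long as both sides use the same seed.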

John Leidegren 2010-04-02 09:08:44

This worked beautifully. Thanks!

Chris 2010-04-02 19:37:18

Answer 4

A:

You could play with the bit patterns of the number, e.g. rotates and swaps on the bits. That will give you a way to move between a number of, say, 26 bits and another number of 26 bits that won't be immediately obvious to a human observer, though it's by no means "secure".
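
For instance, a rotation inside a 26-bit window could be sketched as follows. BitShuffle and its method names are hypothetical, and count is assumed to be non-negative.

```csharp
public static class BitShuffle
{
    private const int Bits = 26;
    private const uint Mask = (1u << Bits) - 1;

    // Rotates the low 26 bits of value to the left; higher bits are discarded.
    public static uint RotateLeft(uint value, int count)
    {
        count %= Bits;
        value &= Mask;
        if (count == 0)
        {
            return value;
        }
        return ((value << count) | (value >> (Bits - count))) & Mask;
    }

    // Undoes RotateLeft with the same count.
    public static uint RotateRight(uint value, int count)
    {
        count %= Bits;
        return RotateLeft(value, (Bits - count) % Bits);
    }
}
```

A rotation alone only moves bits around, so sequential inputs still produce visibly related outputs; it would need to be combined with swaps or other steps to hide the pattern.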

Ben Clifford 2010-04-02 09:14:38

I tried messing around with this, too, but couldn't come up with a way of making the results non-sequential.

Chris 2010-04-02 19:40:40

Answer 5

+2 A:

XOR is a nice and fast way of obfuscating integers:

1 xor 1234 = 1235
2 xor 1234 = 1232
3 xor 1234 = 1233
100 xor 1234 = 1206
120 xor 1234 = 1194

It's fast, and xor-ing again with the same number gives you back the original! The only trouble is, if an "attacker" knows any of the numbers, they can trivially figure out the xor mask... by xor-ing the result with the known original!

For example, I (the "attacker") know that the 4th number in that list is an obfuscated "100". So I'll do:

100 xor 1206 = 1234

... and now I've got the XOR mask and can un-obfuscate any of the numbers. Happily, there is a trivial solution to that problem: algorithmically alter the XOR mask. For example, if you need to obfuscate 1000 integers in an array, start with an XOR mask of 1234 and increment the mask by 4 for each number in the array.
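
A minimal sketch of the changing-mask variant; RollingXor and Apply are hypothetical names. Since the mask depends on the position in the array, the position has to be known to undo it, and applying the same call to the output restores the input.

```csharp
public static class RollingXor
{
    // XORs element i with (startMask + i * step). Calling it again with the
    // same arguments on the result gives back the original values.
    public static int[] Apply(int[] values, int startMask, int step)
    {
        var result = new int[values.Length];
        int mask = startMask;
        for (int i = 0; i < values.Length; i++)
        {
            result[i] = values[i] ^ mask;
            mask = unchecked(mask + step); // wrap silently on overflow
        }
        return result;
    }
}
```

With a start mask of 1234 and a step of 4, the inputs 1, 2, 3 become 1235, 1236, 1241.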

Cosmin Prund 2010-04-02 09:18:52

I thought about this idea, but threw it out because 1235 (for 1) and 1232 (for 2) in your example are pretty close to one another. Thanks anyway!

Chris 2010-04-02 19:38:20

Answer 6

A:

In case other people are interested, somebody adapted a 32-bit block cipher a few years back that's especially useful for this task.

http://www.qualcomm.com.au/PublicationsDocs/skip32.c

There are also Perl and Ruby ports of the above available.

If you need the result in 8 characters or less, you can use a hex or base64 representation.
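
For illustration, a 32-bit result fits in exactly 8 hex digits. The sketch below only covers the text conversion, not the cipher itself, and FixedWidthText is a hypothetical name.

```csharp
using System;
using System.Globalization;

public static class FixedWidthText
{
    // Formats a 32-bit value as exactly 8 lowercase hex digits (zero-padded).
    public static string ToHex(uint value)
    {
        return value.ToString("x8", CultureInfo.InvariantCulture);
    }

    // Parses the 8-digit hex form back into the 32-bit value.
    public static uint FromHex(string hex)
    {
        return uint.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    }
}
```

Base64 of the same 4 bytes is 8 characters with padding, or 6 once the trailing "==" is dropped.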

jingoro 2010-06-13 06:07:43

Integer ID obfuscation techniques
