views:

2218

answers:

11

I have already read

http://stackoverflow.com/questions/321787/using-java-to-encrypt-integers http://www.exampledepot.com/egs/javax.crypto/PassKey.html

All I need is a simple Encrypter which transforms a 12 digit number to a 12 digit number with the following constraints

  1. The encryption must depend on a password (which will be constant throughout the life time of an application) and nothing else.
  2. The mapping must be 1-1 (No hashing and multiple inputs giving same output and vice versa)
  3. The mapping must not change between different VMs or when VM is created (like when you restart Java, the utility should give you same mappings which means that it must be purely dependent on the password that is supplied)
  4. Numbers starting with 0 is not a valid 12 digit number (even input numbers wont start with 0)

Having trawled through the literature I have this code with me

public void mytestSimple(long code, String password) throws Exception {
 SecretKey key = new SecretKeySpec(password.getBytes(), "DES");
 Cipher ecipher = Cipher.getInstance("DES");
 ecipher.init(Cipher.ENCRYPT_MODE, key);
 System.out.println(ecipher.getOutputSize(8));

 byte[] encrypted = ecipher.doFinal(numberToBytes(code));
 System.out.println(encrypted + "--" + encrypted.length);

 Cipher dcipher = Cipher.getInstance("DES");
 dcipher.init(Cipher.DECRYPT_MODE, key);
 byte[] decrypted = dcipher.doFinal(encrypted);
 System.out.println(bytesToNumber(decrypted) + "--" + decrypted.length);
}

public void testSimple() throws Exception {
 mytestSimple(981762654986L, "password");
}

I am running into problems as to

  1. How to convert the 16 bytes into a 12 digit number
  2. Maintain 1-1 mapping
  3. Keep the encryption/decryption same across multiple VM invocations

** MORE INFORMATION***** Added after asking

  1. The key/password should never be guessable. For example running the utility with multiple inputs and analysing the outputs should not allow one to guess the key/pwd/hash or whatever.
  2. All inputs will be exactly 12 digits and less than a 12 digit prime number (which means we could use modulo arithmetic).

** Answer added by me below** I have added one answer which is a 40bit RSA pulled out of standard Java RSA keypair gen logic. I still have to work on the edge cases. I am going to accept the answer and upvote "Tadmas" who I think kinda lead me to the answer. Can someone tell me if my algo is going to be weak/attackable. Please?

+6  A: 

You're not going to be able to convert 16 bytes into a 12 digit number without losing information. 256 ^ 16 > 10^12. (Not that you even have 10^12 options, as you've only got the range [100000000000, 999999999999].

I doubt that you'll be able to use any traditional encryption libraries, as your requirements are somewhat odd.

Jon Skeet
Yep. To make it secure, you need something different each time (like a different key or a different initialization vector). The encrypted message has to carry that information. But if you have 12 digits of data, and only 12 digits of message, you don't have any room left for the "extra".
erickson
+2  A: 

I suggest a very simple algorithm.

  1. Feed the password into a hash function.
  2. Initialize a random number generator with the hash or something you derived from the hash.
  3. Generate a 12 digit random number.
  4. Add this random number to the input digit by digit modulo 10 to encrypt.

To decrypt subtract the random number modulo 10. This is actually a form of One Time Pad. Because of the comments on this answer I realized that refering to One Time Pad was a bad choice. A better reference is Polyalphabetic cipher - while One Time Pad uses polyalphabetic substitution its main characteristic is not to use any key bit twice.

   Input           1234 1234 1234
   Random number   6710 3987 2154
   Output          7944 4111 3388

There is one remaining problem with that - the algorithm might create leading zeros. To solve this problem one could use XOR instead of addition and substraction. Just transform the digits with XOR. If the first digit turns into a zero, don't encrypt the first digit. When you decrypt with XOR again, the first digit will turn into zero and you know that the first digit was not enrcypted.

UPDATE

A simple XOR is not the solution because it will produce to large numbers - 2 XOR 9 = 11 for example. Going to rethinks this...

UPDATE

The nice propoerties of XOR are XOR(a, b) = XOR(b, a) and XOR(XOR(a, b), b) = a. This makes encryption and decryption the same and allows to detect the unencrypted leading digit. But it is further required that our function only returns values in the range from 0 to 9 what XOR doesn't do.
But maybe we can build a custom function with all required properties. So we create an array FUNC with 10 columns and 10 rows and use it as a lookup table for our function. What values to but in? I actually don't know - I am not even sure that it is possible. But if we pick three number from the range 0 to 9 we have to make the following six entries. (It is a symmetric matrix.)

FUNC[x,y] = z   FUNC[x,z] = y   FUNC[y,z] = x
FUNC[y,x] = z   FUNC[z,x] = y   FUNC[z,y] = x

So maybe it is possible to create such a function by repeatedly choosing random numbers and filling the six entries if there is no conflict. Maybe it is not. I would like to see the table if one finds a solution.

Daniel Brückner
Drat, you beat me to it. I don't think the RNG is necessary, though.
Tadmas
Except... you never want to reuse a one time pad (that would be the one time part). Given the encrypted numbers (A + K) and (B + K) I can take (A + K) - (B + K) and get (A - B). That doesn't look that scary because the values X0 - Xn still seem random, except for the fact that each digit it defines a line Ai - Bi = Xi which substantially reduces the search space and makes your encrypted numbers much more susceptible to attack particularly if I have more than one encrypted number or the numbers have any recognizable format.
Aaron Maenpaa
The problem is when the algorithm is run multiple times, the key becomes easily "guessable". I apologise I did not add a constraint that "There must not be a pattern between the numbers". My 12 digit output is going to be presented to a user as a identifier. So a user who can do like 30-40 registrations should not be able to guess what the applications key is :( and thereby guess a valid output
Calm Storm
You could strengthen the algorithm by doing several rounds with successive numbers from the random number generator.
Daniel Brückner
If you use the password as the seed to a RNG, it's not a one time pad. You'll always get the same output; that's the whole point of having a seed on the RNG. So always subtracting the same number (password through the RNG) is exactly as secure as subtracting the same number (password only).
Tadmas
@Tadman I am aware of this, but you are right. I should not have mentioned One Time Pad but Caesar cipher or even better polyalphabetic cipher (because Caesar is monoalphabetic) instead. So the reference to One Time Pad was a bad choice and I am going to update this.
Daniel Brückner
Voted down for designing your own cipher. That is an absolute no-no in cryptography.
Accipitridae
A: 

For mathematical reasons, most cyphers will produce "more" bytes (i.e. they will pad the input). So you will have to accept that the code generates 16bytes out of your 12 digit number.

When the string is decoded, you will get the 12 digit number back but during transport, you need 16 bytes.

Aaron Digulla
You're talking about *block ciphers*. *Stream ciphers* don't have that problem.
Loadmaster
+4  A: 

If the strict 1:1 mapping is more important than protecting against cryptanalysis, then you can convert the password to a 12-digit number (via hash or otherwise) and simply add to your original number mod 10^12. If you absolutely must remove leading zeros from the output, you can subtract 10^11, do the math mod (10^12 - 10^11), and then add 10^11 back again. Granted, that's extremely insecure, but it's quite simple. :)

If the range of inputs is bounded by a prime less than (10^12 - 10^11), you may be able to use message ^ password mod prime to form a ring that will satisfy your requirements and be a little harder to crack. (This is similar to how RSA works.) I think this could work if you don't need to decrypt it.

I agree with Jon Skeet: requiring a strict 1:1 mapping without the output range being bigger than the input domain is something that most encryption libraries are not going to handle.

Tadmas
Tadmas thanks for your idea on the modulo thing with RSA, I am trying something similar. Basically a password and a 12 digit prime number to do RSA. Also it is not a constraint that it has to be converted to 16 bytes or any number of bytes during encryption. I just need a 12 digit number to a 12 digit number mapping.I will start cracking on this thing and post you on how it goes (Any code examples you already have will be appreciated)
Calm Storm
"I think this could work if you don't need to decrypt it" - Isn't that not true? Lets assume I can find a 12 digit prime number and all my inputs are going to be less than that, I should be able to get a 1-1 mapping and also be able to decrypt it innit?
Calm Storm
Ability to decrypt != easy to decrypt. For a fixed password or a generated password (a la RSA), it's fairly easy. It's been a while since I've done the math, but this seems like a harder problem to solve: you have to find the inverse of the password yourself.
Tadmas
This is trivial to break. Off the shelf software (e.g. Mathematica) have absolutely not problem factoring 12 digit integers or computing discrete logarithms of 12 digit integers.
Accipitridae
A: 

If I understand your problem correctly you have a problem with the 16 bytes after decrypting. The trick is to use: Cipher.getInstance("DES/ECB/PKCS5Padding");

(just in case code is helpful):

public void mytestSimple(long code, String password) throws Exception 
       { SecretKey key       = new SecretKeySpec (password.getBytes(),"DES");
         Cipher    ecipher   = Cipher.getInstance("DES/ECB/PKCS5Padding");
         byte[]    plaintext = new byte[8];

         for (int i=0; i<8; i++)
             { plaintext[7-i] = (byte) (code & 0x00FF);
                >>>= 8;
             }

         ecipher.init      (Cipher.ENCRYPT_MODE, key);
         System.out.println(ecipher.getOutputSize(8));

         byte[] encrypted = ecipher.doFinal(plaintext);
         System.out.println("--" + encrypted.length);

         Cipher dcipher = Cipher.getInstance("DES/ECB/PKCS5Padding");
         dcipher.init(Cipher.DECRYPT_MODE, key);

         byte[] crypttext = dcipher.doFinal(encrypted);
         long   decoded    = 0;

         for (int i=0; i<8; i++)
             { decoded <<= 8;
               decoded  += crypttext[i] & 0x00FF;
             }

         System.out.println(decode + "--" + crypttext.length);
       }
tonys
+2  A: 

If the numbers are for user IDs, this is what I'd do:

(1) Generate an AES key from the password. Just calling getBytes() is sort of OK if you trust the administrator to use a really really really strong password. Ideally, use the standard "password-based encryption" technique of hashing the bytes, say, a few thousand times, each time adding in the random "salt" bytes that you initially generated to avoid dictionary attacks.

(2) Encrypt the number in question with that AES key.

(3) Chop off 12 digits' worth of bits from the resulting encrypted block, convert it to decimal, and present that number to the user. (To do this, you can wrap a BigInteger around the bytes, call toString() on it, and pull off, say, the bytes between position 4 and 16.) Experimentally, it looks like you shouldn't take the digits from the rightmost end.

[Update: I think this is probably because BigInteger literally allocates its numbers from left to rightmost bit-- but I haven't checked-- so there'll potentially be "spare" bits in the very rightmost byte, and hence fewer possible numbers if you include the very last byte.]

Now, I hear you cry, this obviously isn't a 1-1 mapping. But unless you're going to have more than tens of thousands of users, it's really good enough. With a 12-digit number, you'd expect on average to encrypt around 300,000 numbers before getting a collision. So although you don't strictly have a 1-1 mapping, in practice, it's as near as dammit.

(In any case, if your application really has hundreds of thoudands of users and security is crucial, then you'll probably want to invest in some serious consulting over this kind of thing...)

Just to convince yourself that it really is OK to pretend it's a 1-1 mapping, you can run a simulation that repeatedly tries to allocate, say, 200,000 user IDs with random keys, and prints out how many collisions there were on each run:

 next_pass :
        for (int pass = 0; pass < 100; pass++) {
          byte[] key = new byte[16];
          (new SecureRandom()).nextBytes(key);
          Cipher ciph = Cipher.getInstance("AES");
          SecretKeySpec ks = new SecretKeySpec(key, "AES");
          ByteBuffer bb = ByteBuffer.allocate(16);
          Set<String> already = new HashSet<String>(100000);
          int colls = 0;
          for (int i = 0; i < 200000; i++) {
            bb.putLong(0, i);
            ciph.init(Cipher.ENCRYPT_MODE, ks);
            byte[] encr = ciph.doFinal(bb.array());
            encr[0] &= 0x7f; // make all numbers positive
            BigInteger bigint = new BigInteger(encr);
            String userNo = bigint.toString();
            userNo = userNo.substring(4, 16);
            if (!already.add(userNo)) {
              System.out.println("Coll after " + i);
              continue next_pass;
            }
          }
          System.out.println("No collision.");
        }
Neil Coffey
A: 

Rethinking the problem I came up with the following. Basicly you need a symmetric cipher to get a one-to-one mapping. And noting that 10^12 is almost 2^40 (2^39.863) it seems natural to convert your 12 digit number into a 40 bit integer and feed this number into a block cipher with a block length of 40 bits. A good choice might be Blowfish supporting block lengths from 32 to 448 bits in steps of 8 bits - so 40 bits is supported.


UPDATE

As Accipitridae pointed out, Blowfish has variable key size but fixed block size hence it is no option. A bit more searching through the web seems to indicate, that there are little or no ciphers with block sizes of 40 bits or less rendering this idea void. A leave the remaining part of the answer - maybe one can find a suitable cipher.


The remaining problem is that the Blowfish might return a number up to 1,099,511,627,775 with 13 digits and that the returned number might contain leading zeros, but I believe that this can be solved in a second step. My first thought was applying something like a Burrows-Wheeler transform to the string representation of the number in order to get at least one zero to the front of the number eleminating the 13th digit and then modify all remaining digits (for example 0 <-> 9, 1 <-> 8, 2 <-> 7, ...) to turn additional leading zeros into other digits.
After a few minutes I regonized that this will not work - the input has size 2^40 while the output is only of size 2^39.863. A solution would be to use a 39 bits block cipher and restrict the input to numbers up to 549,755,813,887. I don't know if there is a cipher that can deal with a block length of 39 bits, but this paper on Elastic Block Ciphers describes how to construct a block cipher that can deal with every block size from n up to 2n given a block cipher that can handle block size n. In consequence it is possible to construct a 39 bit block cipher from 32 bit Blowfish.

Daniel Brückner
Blowfish has a fixed block size and variable key size. 40 bit block size is not supported.
Accipitridae
Thanks. Read the wrong line on wikipedia - key size instead of block size.
Daniel Brückner
+2  A: 

One potential solution could be built on Feistel ciphers. This constructions allows to build a pseudorandom permutation based on a pseudorandom functions. E.g. the pseudorandom functions could be constructed from an appropriate block cipher by truncating the result to a 6 digit numbers.

This construction has been analyzed in the following paper M. Luby and C. Rackoff, "How to construct pseudorandom permutations from pseudorandom functions" SIAM Journal on Computing, Vol.17, No.2, pp.373--386, 1988


A concrete proposal is the Feistel Finite Set Encryption Mode, which has been submitted to NIST for potential inclusion into an upcoming standard. This proposal also addresses the problem of encrypting ranges that are not a power of 2.

Accipitridae
A: 

Me thinks the answer given below by Tadmas was very helpful and I want you guys to hack/bully my implementation below. As Tadmas points out all my numbers are 40 bits (12 digit number is 10^12 which is 2^40 approx).

I copied the sun.security.rsa.RSAKeyPairGenerator (link) and created my own generator for a 40 bit RSA algorithm. The standard one needs between 512-1024 bits so I removed the input check around it. Once I create a suitable n, e, d values (e seems to be 65537 as per the alog). The following code served fine,

public void testSimple() throws NoSuchAlgorithmException {
 MyKeyPairGenerator x = new MyKeyPairGenerator();
 x.initialize(40, new SecureRandom("password".getBytes()));

 MyPublicPrivateKey keypair = x.generateKeyPair();
 System.out.println(keypair);

 BigInteger message = new BigInteger("167890871234");
 BigInteger encoded = message.modPow(keypair.e, keypair.n);
 System.out.println(encoded); //gives some encoded value
 BigInteger decoded = encoded.modPow(keypair.d, keypair.n);
 System.out.println(decoded); //gives back original value
}

Disadvantages

  1. The encoded may not always be 12 digits (sometimes it may start with 0 which means only 11 digits). I am thinking always pad 0 zeroes in the front and add some CHECKSUM digit at the start which might alleviate this problem. So a 13 digit always...
  2. A 40 bits RSA is weaker than 512 bit (not just 512/40 times but an exponential factor of times). Can you experts point me to links as to how secure is a 40bit RSA compared to 512 bit RSA (I can see some stuff in wiki but cannot concretely confirm possibility of attacks)? Any links (wiki?) on probabilities/number of attempts required to hack RSA as a function of N where n is the number of bits used will be great !
Calm Storm
40-bit RSA keys can be factored in less than a millisecond using rather simple algorithm such as Pollard Rho. Even trial division works with that size of moduli.Curiously this makes it harder to find the public modulus than to factor it. Even if the modulus and exponent is kept secret it should still be possible to break the scheme, (e.g. by using the multiplicative properties of RSA) though in that case the runtime of an attack would depend on how much plaintext/ciphertext an attacker can learn.
Accipitridae
I am not giving out my public key (like a normal RSA). I am just basing a p,q,e,d,n generation from a password. So I think just examining the outputs (without knowing the public key) it gets impossible to crack. innit?
Calm Storm
Since you are using a 40-bit RSA the factors will be about 20-bits long. If the attacker just knows one message m and the corresponding ciphertext c=m^e mod n. If p is a factor of n then c mod p = m^e mod p. Hence we can try a brute force search for p. Trying all primes p below 2^20 takes less than 100 ms and should at least find the smaller factor. Finding the other factor isn't much more difficult either. Calling this scheme impossible to crack is an overstatement.The thing to note is that RSA has many nice mathematical properties.They are good for PKC but bad for symmetric crypto.
Accipitridae
A: 

If you're prepared to accept a rather weak solution...

Treat the 12-digit number as two 6-digit numbers. Then use the hash of the password as a seed to a random number generator which shuffles a table of 999,990 consecutive values. Then use the two 6-digit numbers to look up entries in the table. The concatenation of the two results is your 1:1 mapped 12-digit 'encrypted' input based on a password.

Let's do an example with 4-digits instead of 12...

Input: 4852
Password: passw0rd1 => hashcode = 37592

Now take this array..

a = [10, 11, 12, 13, .. 98, 99]

And shuffle it with 37592 as the random seed...

b = [45, 15, 56, 49, .. 33, 88]

Now split the input: 48, 52 and look up those indices in the shuffled array, say...

b[48] => 23
b[52] => 96

So our encrypted version of 4852 is 2396

It really isn't a strong solution but the constraints in your question will not lead to a strong solution. You may need to relax those constraints a bit.

izb
+1  A: 

I would use a stream cipher. N bytes go in, N bytes come out.

Loadmaster