I want to store a byte array wrapped in a String object. Here's the scenario

  1. The user enters a password.
  2. The bytes of that password are obtained using the getBytes() String method.
  3. The bytes are encrypted using Java's crypto package.
  4. Those bytes are then converted into a String using the constructor new String(bytes[])
  5. That String is stored, or otherwise passed around (NOT changed)
  6. The bytes of that String are obtained, and they are different than the encoded bytes.

Here's a snippet of code that describes what I'm talking about.

String s = "test123";
byte[] a = s.getBytes();
byte[] b = env.encrypt(a);
String t = new String(b);
byte[] c = t.getBytes();
byte[] d = env.decrypt(c);

Where env.encrypt() and env.decrypt() do the encryption and decryption. The problem I'm having is that the b array is of length 8 and the c array is of length 16. I would think that they would be equal. What's going on here? I tried to modify the code as below

String s = "test123";
Charset charset = Charset.defaultCharset();
byte[] a = s.getBytes(charset);
byte[] b = env.encrypt(a);
String t = new String(b, charset);
byte[] c = t.getBytes(charset);
byte[] d = env.decrypt(c);

but that didn't help.

Any ideas?

+7  A: 

It's not a good idea to store binary data in a String object. You'd be better off using something like Base64 encoding, which is intended to make binary data into a printable string, and is completely reversible.

In fact, I just found a public domain base64 encoder for Java: http://iharder.sourceforge.net/current/java/base64/
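
For illustration, here is a minimal sketch of the Base64 round trip. It assumes Java 8 or later, where java.util.Base64 ships with the JDK; that is a substitution for the library linked above, not the same API. The class name, method names, and sample bytes are made up for the example.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {
    // Encode arbitrary bytes as printable ASCII text.
    public static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recover the exact original bytes from the Base64 text.
    public static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for ciphertext: bytes that are not valid text in many charsets.
        byte[] cipher = {(byte) 0x9C, (byte) 0xFF, 0x00, 0x41, (byte) 0xC3, 0x28, (byte) 0x80, 0x7F};
        String stored = toText(cipher);      // safe to store or pass around as a String
        byte[] restored = fromText(stored);  // same bytes that went in
        System.out.println(stored);
        System.out.println(Arrays.equals(cipher, restored)); // prints "true"
    }
}
```

The stored String contains only ASCII characters, so it survives any charset the platform happens to use.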

Jonathan
+1 take the password, encrypt it, convert to base64 string (suggest using Apache Commons Codec for the last bit).
skaffman
It's also not a good idea to store secrets in String objects (the password input or the decrypted output) unless you have absolutely no choice. This is because there's no way to clear a String - once it's in memory, a String doesn't get overwritten until the memory is garbage collected AND the memory allocator decides to reallocate that section of memory.
atk
A: 

I don't have a definitive answer for you, but if I were working on this, I'd print out the string or byte at each step and compare them to see what's happening. Also, b holds a return value from env.encrypt, but c is a return value from .getBytes, so in a way you're comparing apples to oranges in that case.

Chris
+3  A: 

This is somewhat of an abuse of the String(byte[]) constructor and related methods.

This would work with certain encodings, and fail with others. Presumably your platform's default encoding is one of the ones where it fails.

You should use something like Commons Codec to convert these bytes to hex or base64.

Also why are you encrypting passwords instead of hashing them with salt anyway?
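
If hashing is an option, a rough sketch of salted hashing with the JDK's built-in PBKDF2 could look like the following. The iteration count, salt size, and output length are assumptions chosen for illustration, not recommendations, and the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class SaltedHashSketch {
    // Generate a fresh random salt; store it next to the hash.
    public static byte[] newSalt() {
        byte[] salt = new byte[16]; // assumed salt size
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derive a 256-bit hash from the password and salt.
    public static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 100_000, 256); // assumed iteration count
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword(); // wipe the spec's internal copy of the password
        }
    }

    // Re-hash the candidate with the stored salt and compare the results.
    public static boolean verify(char[] candidate, byte[] salt, byte[] expected) {
        return MessageDigest.isEqual(hash(candidate, salt), expected);
    }
}
```

The result is still a byte[], so it needs hex or base64 before it goes into a String, same as ciphertext would.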

Licky Lindsay
+3  A: 

In both cases, you are using the OS default non-Unicode charset (which depends on locale). If you're passing the string from one system to another, they may have different locales, and thus different default charsets. You need to use one well-defined charset to do what you're trying to do; e.g. ISO-8859-1.

Better yet, don't do the conversion, and pass the byte[] array directly.
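
As a sketch of the ISO-8859-1 route: that charset maps each of the 256 byte values to exactly one character, so the round trip is lossless as long as both conversions name it explicitly. The class and method names here are only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Wrap raw bytes in a String, one char per byte.
    public static String wrap(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Unwrap with the same charset to get the original bytes back.
    public static byte[] unwrap(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Exercise every possible byte value.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        System.out.println(Arrays.equals(all, unwrap(wrap(all)))); // prints "true"
    }
}
```

The resulting String is full of control characters and is not printable, so this only helps if nothing between the two conversions re-encodes or displays it.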

Pavel Minaev
+2  A: 

This isn't going to work properly. Storing bytes as a String is only going to work right for the ASCII set (and a few others). If you NEED to store the encrypted result as a String, then what about converting the bytes to hex and then putting that in a String? That would work.

I recommend you just keep the password as bytes. There's no real reason to store it as a String (unless you want to see what people's passwords are).

Nick
+2  A: 

Several people have pointed out that this is not a proper use of the String(byte[]) constructor. It is important to remember that in Java a String is made up of characters, which happen to be 16 bits, and not 8 bits, as a byte is. You are also forgetting about character encoding. Remember, a character is often not a byte.

Let's break it down bit by bit:

String s = "test123";
byte[] a = s.getBytes();

At this point your byte array most likely contains 7 bytes (one per character of "test123") if your system's default character encoding is Windows-1252, ISO-8859-1, or UTF-8.

byte[] b = env.encrypt(a);

Now b contains some seemingly random data depending on your encryption, and isn't even guaranteed to be a certain length. Many encryption engines pad the input data so that the output matches a certain block size.

String t = new String(b);

This is taking your random bytes and asking Java to interpret them as character data. These characters may appear as gibberish and some sequences of bits are not valid characters for every encoding. Java dutifully does its best and creates a sequence of 16-bit chars.

byte[] c = t.getBytes();

This may or may not give you the same byte array as b, depending on the encoding. You state in the problem description that you are seeing c as 16 bytes long; this is probably because the garbage in t doesn't convert well in the default character encoding.

byte[] d = env.decrypt(c);

This won't work because c is not the data you expect it to be but rather is corrupt.
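
To see the corruption in isolation, here is a small sketch that skips the encryption and feeds in bytes that are not valid UTF-8. It assumes UTF-8 as the charset; your platform default may behave differently. The sample bytes and names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyRoundTrip {
    // bytes -> String -> bytes, using the same charset both ways.
    public static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE never occur in well-formed UTF-8, so the decoder
        // substitutes the replacement character U+FFFD for each of them.
        byte[] b = {(byte) 0xFF, (byte) 0xFE, 0x41};
        byte[] c = roundTrip(b, StandardCharsets.UTF_8);
        System.out.println("b.length = " + b.length);
        System.out.println("c.length = " + c.length);
        System.out.println("same bytes? " + Arrays.equals(b, c)); // prints "same bytes? false"
    }
}
```

Each replacement character takes more than one byte when encoded back to UTF-8, which is how c ends up longer than b.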

Solutions:

  1. Just store the byte array directly in the database or wherever. However, you are still forgetting about the character encoding problem; more on that in a sec.
  2. Take the byte array data and encode it using Base64 or as hexadecimal digits and store that string:

    byte[] cypherBytes = env.encrypt(getBytes(plainText));
    StringBuffer cypherText = new StringBuffer(cypherBytes.length * 2);
    for (byte b : cypherBytes) {
        String hex = String.format("%02X", b); //$NON-NLS-1$
        cypherText.append(hex);
    }
    return cypherText.toString();
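
For completeness, a sketch of the hex approach in both directions, since the stored string has to be turned back into bytes before decrypting. HexCodec and its method names are made up for this example; the encoding half does the same thing as the loop above.

```java
public class HexCodec {
    // Two uppercase hex digits per byte.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    // Parse each pair of hex digits back into one byte.
    public static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, (byte) 0xFF, 0x1A};
        System.out.println(toHex(sample)); // prints "00FF1A"
    }
}
```

Hex doubles the size of the data, where Base64 adds roughly a third, but it is easier to read in logs and database dumps.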

Character encoding:

A user's password may not be ASCII and thus your system is susceptible to problems because you don't specify the encoding.

Compare:

String s = "tést123";
byte[] a = s.getBytes();
byte[] b = env.encrypt(a);

with

String s = "tést123";
byte[] a = s.getBytes("UTF-8");
byte[] b = env.encrypt(a);

The byte array a won't have the same value with the UTF-8 encoding as with the system default (unless your system default is UTF-8). It doesn't matter what encoding you use as long as A) you're consistent and B) your encoding can represent all the allowable characters for your data. You probably can't store Chinese text in the system default encoding. If your application is ever deployed on more than one computer, and one of those has a different system-default encoding, passwords encrypted on one system will become gibberish on the other system.
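
A quick sketch that makes the difference visible by comparing the encoded length of the same password under two explicit charsets. The é is written as \u00e9 so the example does not depend on the source file's own encoding; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingLengths {
    // Number of bytes the string occupies in the given charset.
    public static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String s = "t\u00e9st123"; // "tést123", 7 characters
        System.out.println(byteLength(s, StandardCharsets.ISO_8859_1)); // prints 7
        System.out.println(byteLength(s, StandardCharsets.UTF_8));      // prints 8
    }
}
```

Since the input to the encryption differs, the ciphertext differs too, which is why the charset has to be pinned down on both the encrypting and decrypting side.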

Moral of the story: Characters are not bytes and bytes are not characters. You have to remember which you are dealing with and how to convert back and forth between them.

Mr. Shiny and New