I want to encrypt/decrypt lots of small (2-10kB) pieces of data. The performance is ok for now: On a Core2Duo, I get about 90 MBytes/s AES256 (when using 2 threads). But I may need to improve that in the future - or at least reduce the impact on the CPU.

  • Is it possible to use dedicated AES encryption hardware with Java (using JCE, or maybe a different API)?
  • Would Java take advantage of special CPU features (SSE5?!), if I get a better CPU?
  • Or are there faster JCE providers? (I tried SunJCE and BouncyCastle - no big difference.)
  • Other possibilities?
+3  A: 

The JVM will not, by itself, take advantage of special CPU features when executing code which happens to be an AES encryption: recognizing some code as being an implementation of AES is beyond the abilities of the JIT compiler. To use special hardware (e.g. the "Padlock" on VIA processors, or the AES-NI instructions on the newer Intel processors), you must go, at some point, through "native code".

Possibly, a JCE provider could do that for you. I am not aware of any readily available JCE provider which includes optimized native code for AES (there was a project called Apache JuiCE, but it seems to be stalled and I do not know its status). However it is conceivable that SunJCE will do that in a future version (but with Oracle buying Sun and the overfeaturism of OpenJDK 7, it is unclear when the next Java version will be released). Alternatively, bite the bullet and use native code yourself. Native code is invoked through JNI, and for the native AES code, a popular implementation is the one from Brian Gladman. When you get a bigger and newer processor with the AES-NI instruction, replace that native code with some code which knows about these instructions, as Intel describes.

By using AES-128 instead of AES-256 you should get a speed boost of about 40% (AES-128 runs 10 rounds where AES-256 runs 14). Breaking AES-128 is currently beyond the technological reach of Mankind, and should stay so for the next few decades. Do you really need a 256-bit key for AES?
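To check how much AES-128 actually gains over AES-256 on a given machine, a rough micro-benchmark along the following lines may help. This is only a sketch: the class name `AesKeySizeBenchmark`, the 4 kB buffer size, the CBC mode and the round counts are assumptions chosen for illustration, and on older JREs AES-256 may additionally require the unlimited-strength policy files.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

public class AesKeySizeBenchmark {

    // Encrypts `rounds` buffers of `size` bytes with a random key of
    // `keyBits` bits and returns the measured throughput in MB/s.
    public static double throughput(int keyBits, int size, int rounds) throws Exception {
        SecureRandom rnd = new SecureRandom();
        byte[] key = new byte[keyBits / 8];
        byte[] iv = new byte[16];
        byte[] data = new byte[size];
        rnd.nextBytes(key);
        rnd.nextBytes(iv);
        rnd.nextBytes(data);

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        SecretKeySpec keySpec = new SecretKeySpec(key, "AES");
        // The IV is reused here only because this is a benchmark;
        // real code should use a fresh random IV per message.
        IvParameterSpec ivSpec = new IvParameterSpec(iv);

        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            // Re-initialize for every message, as with many small pieces of data.
            cipher.init(Cipher.ENCRYPT_MODE, keySpec, ivSpec);
            cipher.doFinal(data);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return (double) size * rounds / (1024.0 * 1024.0) / seconds;
    }

    public static void main(String[] args) throws Exception {
        int size = 4096;      // assumed typical message size
        int rounds = 50_000;  // assumed; increase for more stable numbers

        // Warm-up so the JIT compiler has a chance to optimize the hot path.
        throughput(128, size, rounds);
        throughput(256, size, rounds);

        System.out.printf("AES-128: %.1f MB/s%n", throughput(128, size, rounds));
        System.out.printf("AES-256: %.1f MB/s%n", throughput(256, size, rounds));
    }
}
```

The numbers are single-threaded and include the per-message `init` call, so they will not match the theoretical 14/10 ratio exactly.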

Thomas Pornin
Very good answer, contains all I needed to know! Thanks
Chris Lercher
Be very cautious about implementing any cipher yourself. It is extremely easy to get it wrong or to leak key material to bad guys. (Timing attacks, branch prediction attacks, you name it). Generally you should *always* leave cryptographic implementations to the experts.
Eadwacer
+1  A: 

I would also suggest using AES-128 rather than 256. If the code is loosely coupled, and is still around in however many years it takes for AES-128 to become archaic, my guess is that it will be much easier to update the encryption at that point (when hardware will be more powerful) than it would be to try to optimize performance via hardware now.

Of course, that is assuming it is loosely coupled :D

badpanda
When I tried AES-128, it didn't provide the performance gain I had hoped for (only about +20% throughput). But it's still worth considering, given that it's safe enough.
Chris Lercher
Ironically, AES-256 is not really more secure than AES-128 in practice, because keysearch against either is infeasible for the foreseeable future. That only changes if quantum computing becomes a reality, in which case AES-256 might still be secure, but AES-128 might be broken...
vy32
+2  A: 

A simple Google search will identify some JCE providers which claim hardware acceleration, such as the Solaris Crypto Framework. I have heard the break-even point is 4K (below 4K it's faster to use the pure-Java providers inside the JVM).
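Before swapping providers, it can be useful to see which installed providers offer AES at all and which one the JVM picks by default. A minimal sketch using the standard `java.security.Security` API follows; the class name `AesProviders` and the chosen transformation string are assumptions for illustration.

```java
import javax.crypto.Cipher;
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

public class AesProviders {

    // Names of all installed providers that register an AES cipher.
    public static List<String> providersOfferingAes() {
        Provider[] providers = Security.getProviders("Cipher.AES");
        List<String> names = new ArrayList<>();
        if (providers != null) {          // getProviders returns null when nothing matches
            for (Provider p : providers) {
                names.add(p.getName());
            }
        }
        return names;
    }

    // Name of the provider the JVM selects when none is requested explicitly.
    public static String defaultAesProvider() throws Exception {
        return Cipher.getInstance("AES/CBC/PKCS5Padding").getProvider().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Providers offering AES: " + providersOfferingAes());
        System.out.println("Default AES provider:   " + defaultAesProvider());
    }
}
```

A specific provider can then be requested with the two-argument form `Cipher.getInstance(transformation, providerName)`, provided that provider is installed.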

I might look at using the NSS implementation; it might have some compiler optimizations for your platform (and you can certainly build it from source with them enabled), though I have not used it myself. The big benefit of a hardware provider is probably the fact that the keys can be stored in hardware in a way that supports using them without exposing them to the OS.

Update: I should probably mention that the Keyczar source had some helpful insight (somewhere in source or surrounding docs) about reducing the overhead for initializing the Cipher. It also does exactly what you want (see Encrypter), and seems to implement asynchronous encryption (using a thread pool).
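The general idea of avoiding repeated `Cipher` setup and pushing work onto a thread pool can be sketched as below. This is not Keyczar's code or API: the class `PooledEncrypter`, its methods and the IV-prefixed message layout are hypothetical, and the sketch uses plain CBC without any integrity check, so it only illustrates the performance pattern.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledEncrypter implements AutoCloseable {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";
    private static final int IV_LEN = 16;

    // Cipher.getInstance involves a provider lookup, so keep one instance per
    // thread and only call init() for each message. Cipher is not thread-safe.
    private static final ThreadLocal<Cipher> CIPHER = ThreadLocal.withInitial(() -> {
        try {
            return Cipher.getInstance(TRANSFORMATION);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    });

    private final ExecutorService pool;
    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();

    public PooledEncrypter(byte[] rawKey, int threads) {
        this.key = new SecretKeySpec(rawKey, "AES");
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Encrypts asynchronously; the result is the random IV followed by the ciphertext.
    public Future<byte[]> encrypt(byte[] plaintext) {
        return pool.submit(() -> {
            byte[] iv = new byte[IV_LEN];
            random.nextBytes(iv);
            Cipher c = CIPHER.get();
            c.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] ct = c.doFinal(plaintext);
            byte[] out = new byte[IV_LEN + ct.length];
            System.arraycopy(iv, 0, out, 0, IV_LEN);
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return out;
        });
    }

    // Decrypts synchronously on the calling thread, reading the IV from the prefix.
    public byte[] decrypt(byte[] message) throws GeneralSecurityException {
        Cipher c = CIPHER.get();
        c.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(message, 0, IV_LEN));
        return c.doFinal(message, IV_LEN, message.length - IV_LEN);
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

A caller would typically submit many small messages, collect the returned `Future` objects, and size the pool to the number of cores.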

Justin
The Solaris Crypto Framework was the only one I had found, and I simply assumed I could only use it with Solaris. Or could it also be used with Linux (x64)? And maybe a naive question: Can I buy an UltraSparc processor on a card, or would I have to use an UltraSparc server?
Chris Lercher
On Windows you would use the Windows crypto services (MSCAPI or CNG), which are the analogue on Windows (and actually came first). I don't think you would see any speedup going to an UltraSparc on a card. You would be better off buying a bigger Intel machine and running a thread pool.
Justin