I have to verify the signature on a file that may be as large as 2 GB, and I want to do so as memory-efficiently as possible. For various reasons, the file will already be loaded completely into memory, and the application accesses it through an InputStream. I would like to verify the signature using the stream interface, but the update method of the JCA Signature class only accepts byte[] and related types.

How can I do this efficiently? I don't want to load the beast into a second byte array, or we'll see some seriously high memory use, but the interface doesn't seem to support anything else.

Update

If it matters, the signature uses SHA-1 as its digest algorithm.

+2  A: 

Why not just read the input stream one block at a time (4096 bytes, or whatever size is convenient) and call update() for each block?

Jason S
I like the sound of it; does `update()` flush the previously updated blocks?
Chris R
I would assume that it's meant to update the signature iteratively for the given input, and that at the end you call some "finish()" method. Do you have a link to a page that describes the Signature class? I couldn't find anything other than a vague description.
Jason S
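On the question raised in the comments: as far as the documented behaviour of java.security.Signature goes, update() accumulates input rather than discarding earlier blocks, and the "finish" step is verify() (or sign() when signing). A minimal sketch that checks this follows. The class name UpdateAccumulates, the throwaway RSA key pair, and the "SHA1withRSA" algorithm name are assumptions made for illustration; the question only says that SHA-1 is involved.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

// Hypothetical demo class, not part of the original thread.
class UpdateAccumulates {

    // Signs the data with a single update() call, then verifies it by
    // feeding the same bytes in two separate update() calls.
    static boolean chunkedMatchesOneShot() throws Exception {
        // Assumption: an RSA key pair generated on the spot, for the demo only.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();

        byte[] data = "hello, signature".getBytes(StandardCharsets.UTF_8);

        // One-shot signing.
        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] sig = signer.sign();

        // Chunked verification: first 5 bytes, then the remainder.
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(data, 0, 5);
        verifier.update(data, 5, data.length - 5);

        // verify() is the final step that consumes everything fed so far.
        return verifier.verify(sig);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(chunkedMatchesOneShot());
    }
}
```

If the two update() calls did not accumulate, the final verify() would be checking only the second chunk and the signature would not match.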
+1  A: 

Create a byte array to act as a buffer and read one buffer at a time from the InputStream, calling update() on the Signature each time. Provided the buffer is of a reasonable size, the CPU time consumed transferring the data from one process to another (I'm guessing that's what you're doing?) is likely to be negligible compared to the calculation time. When reading from disk, a buffer size of around 8K appears to be the point beyond which a larger buffer saves almost no further CPU time, and I suspect that this will more or less apply in your case too. (In case it's interesting, see the page I put together on InputStream buffer sizes.)
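The buffered approach described above could be sketched roughly as follows. The class name StreamVerifier, the method signature, and the 8192-byte buffer are illustrative choices, and the "SHA1withRSA" algorithm and generated RSA key in main are assumptions, since the question does not name the key type. Only the one small buffer is allocated on top of whatever already backs the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper class, not part of the original thread.
class StreamVerifier {

    // Verifies sig over everything readable from in, one buffer at a time.
    static boolean verify(InputStream in, PublicKey key, byte[] sig, String algorithm)
            throws IOException, GeneralSecurityException {
        Signature verifier = Signature.getInstance(algorithm);
        verifier.initVerify(key);

        byte[] buffer = new byte[8192]; // roughly the 8K suggested above
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Pass only the bytes actually read; the last read is usually short.
            verifier.update(buffer, 0, n);
        }
        return verifier.verify(sig);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a throwaway RSA key pair and in-memory data for the demo.
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair pair = gen.generateKeyPair();
        byte[] data = new byte[100_000];

        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] sig = signer.sign();

        System.out.println(verify(new ByteArrayInputStream(data),
                pair.getPublic(), sig, "SHA1withRSA"));
    }
}
```

Passing the read count to update(buffer, 0, n) matters: calling update(buffer) with the whole array would feed stale bytes from the previous read whenever a read returns fewer bytes than the buffer holds.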

Neil Coffey