I am writing a StringOutputStream class in Java because I need the pre-Base64-encoded data from an OutputStream to go to a String. I need this as a String because I am going to be putting it into a W3C XML Document.

Everything would be fine, but I'm dealing with (relatively) large objects. The resulting object turns out to be about 25 MB (before String representation). I am running this as an Applet, so I have 66 MB of heap space which gets exhausted quite quickly.

I have tried a few methods so far:

  1. Append the received byte to a String object (using strObj.concat((byte) b) and strObj += new String((byte) b)) with and without buffering
  2. Add the received byte to a StringBuffer
  3. Add the byte to a byte array, then when the string is wanted, convert that byte array to a String

Number one works until about 11 MB, when the old String and the new String use up too much space when concat-ing.

Number two was a total failure; it only gets to about 7 MB.

Number three was (perhaps?) the best: it stores the whole stream, but converting it to a String, unsurprisingly, fails.

How would I make this work? Is it possible?

I think I have the space available to hold the resulting String, but it's the copying that is the problem (since you need source and destination for a traditional copy). I know Strings are immutable, but is there any way to append some characters onto the end?

Here are my three examples:

package com.myorg.SigningServer.Util.Security;

import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

public class StringOutputStream extends OutputStream {

byte[] array = new byte[1024*1024*22];
StringBuffer sb = new StringBuffer();
String output = "";
int prevByte = -1;
long numBytes = 0;

int bufferPos = 0;
int bufferSize = 512*1024;
byte[] buffer = new byte[bufferSize];

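public void write2(int b) throws IOException {
    // Cast to char: append((byte) b) resolves to append(int) and would
    // add the decimal digits of the value rather than a single character.
    sb.append((char) (b & 0xFF));
}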


public void write3(int b) throws IOException {
    array[(int) numBytes] = (byte) b;
    numBytes++;
}

public void write1(int b) throws IOException {
    numBytes++;
    // Store first, then advance, so that index 0 of the buffer is not skipped.
    buffer[bufferPos++] = (byte) b;
    if(bufferPos == bufferSize)
    {
        // Buffer is full: append it to the accumulated String and start over.
        bufferPos = 0;
        System.gc();
        System.out.println("Generating string "+numBytes+"; String length "+output.length());
        output = output.concat(new String(buffer));
        System.gc();
    }
}

public void flush1() {
    output = output.concat(new String(Arrays.copyOf(buffer, bufferPos)));
    bufferPos = 0;
    System.gc();
}

public String toString2()
{
    return sb.toString();
}

public String toString3()
{
    return new String(array);
}

public String toString1()
{
    return output;
}
}

A few notes about the code: obviously, you rename the methods you want to use to write() and toString(). Also, the byte array is (currently) statically allocated, but that would be changed if I go that route (and is not used during the other methods).
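
For comparison with approach number two, here is a minimal sketch of a builder-based stream that reserves its capacity up front. It assumes the incoming bytes are single-byte text such as Base64 output and that the expected length is known in advance; the class name `PresizedStringOutputStream` is illustrative only. It avoids the repeated grow-and-copy cycles of a default-sized buffer, but `toString()` still makes one final copy, so it reduces the copying rather than eliminating it.

```java
import java.io.OutputStream;

/** Hypothetical variant of approach two with a pre-sized builder. */
final class PresizedStringOutputStream extends OutputStream {
    private final StringBuilder sb;

    PresizedStringOutputStream(int expectedChars) {
        // Reserving the capacity once means the builder does not have to
        // reallocate and copy its contents while the stream is written.
        this.sb = new StringBuilder(expectedChars);
    }

    @Override
    public void write(int b) {
        // Assumes single-byte input, so one byte maps to one char.
        sb.append((char) (b & 0xFF));
    }

    @Override
    public String toString() {
        // One copy of the characters is still made here.
        return sb.toString();
    }
}
```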

Edit 1: More information on my overall problem:

This is part of a larger application that takes data, signs it, and uploads it to a server. I have to read in a file, take the SHA-1 hash of it, encrypt it, and then construct an XML document (with a few other things in it, such as the time). Then that XML document must be signed (via XML DSig, aka javax.xml.crypto.dsig.XMLSignatureFactory) and uploaded back to the server.

The files to be signed are anywhere from 1KB to about 50 MB.
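
The hashing step on its own does not need the whole file in memory. A minimal sketch, assuming the file is available as an `InputStream`; the class and method names (`StreamingSha1`, `digest`, `toHex`) and the 8 KB buffer size are illustrative, not part of the application described here:

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: SHA-1 over a stream using a fixed-size buffer. */
final class StreamingSha1 {
    private StreamingSha1() {}

    /** Reads the stream in 8 KB chunks, so memory use does not grow with file size. */
    static byte[] digest(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            sha1.update(buf, 0, n);
        }
        return sha1.digest();
    }

    /** Formats the digest as lowercase hexadecimal. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }
}
```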

There are a few problems:

  1. The current Java implementation of XML DSig does not parse any XML streams, just W3C Nodes. (I also cannot find any other implementations that do.)
  2. My boss wants this to require minimal client-side installation, which is why an applet was chosen (it is a signed applet, so it can access anything on the client).

A: 

As you're putting the resulting String in an XML document, I suggest using a streaming XML API. Then the entire process can be streamed and you do not need to keep large amounts of data in memory.

What is happening to the XML document? Being an applet, I can imagine only a few alternatives: write to a file in the sandbox, or stream back to the origin server. If you use streaming XML, the document can be sent to its final location as you write data through your stream.

For example, you can stream character data to a SAX ContentHandler in your StringOutputStream, rather than storing the data in a buffer.
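
A minimal sketch of that idea, assuming the bytes reaching the stream are single-byte text (for example Base64 output) and that some SAX `ContentHandler` is producing the surrounding document. The class name `SaxCharacterOutputStream` and the chunk size parameter are illustrative, not taken from the question:

```java
import java.io.IOException;
import java.io.OutputStream;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

/**
 * Hypothetical OutputStream that forwards character data to a SAX
 * ContentHandler in small chunks instead of accumulating a String.
 */
final class SaxCharacterOutputStream extends OutputStream {
    private final ContentHandler handler;
    private final char[] chunk;
    private int used = 0;

    SaxCharacterOutputStream(ContentHandler handler, int chunkSize) {
        this.handler = handler;
        this.chunk = new char[chunkSize]; // chunkSize must be positive
    }

    @Override
    public void write(int b) throws IOException {
        // Assumes single-byte input, so one byte maps to one char.
        chunk[used++] = (char) (b & 0xFF);
        if (used == chunk.length) {
            flush();
        }
    }

    @Override
    public void flush() throws IOException {
        if (used == 0) {
            return;
        }
        try {
            // Hand the buffered characters on; nothing is retained here.
            handler.characters(chunk, 0, used);
        } catch (SAXException e) {
            throw new IOException("SAX handler rejected character data", e);
        }
        used = 0;
    }

    @Override
    public void close() throws IOException {
        flush();
    }
}
```

The caller would be expected to invoke `startElement` on the handler before writing and `endElement` after closing the stream, so that the character data lands inside the intended element.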

EDIT 1:

Given the maximum file size of 50 MB, I think you are pushing applets a little too far, unless you can guarantee they are configured with roughly 3-4x your maximum file size in memory (e.g. using the Java plugin control panel on Windows). Signing in the applet also isn't very secure: it's easy to reverse engineer the applet and extract the private key, which renders the signing untrustworthy. If the applet always uses the same key, can you not move the signing to the server side? The file is being uploaded anyway, and this will avoid all the memory problems. The scheme is then:

  • the applet uploads the original file to the server
  • the server creates the XML from the file
  • the server signs the XML
  • the server forwards the signed XML to wherever the applet was sending it. If it was to one of your own servers/webapps, then the file is already available for use.
mdma
I'm still looking into using a streaming XML API, but the problem is that I need to sign the XML using an XMLSignatureFactory, and I haven't found if I can do that with a streaming XML API.
HalfBrian
Is using Java Web Start an option? You can specify the amount of memory needed at VM startup.
mdma
Unfortunately it is not, because my boss wants this to not require any downloadable applications. If I can't solve this, obviously that would be the restriction to be discarded, and then I would probably run it as a desktop application.
HalfBrian
If your String doesn't fit in memory, I can guarantee the DOM won't either. DOM is _big_.
james
Is this to be widely deployed or run on just a few clients? You can increase the memory available to an applet in the Java plugin settings in the Control Panel.
mdma
In response to Edit 1, the reason I don't upload the file is that the file is signed with the **user's** private key, not the **applet's**. And I agree, it is indeed pushing the limits of an applet, which is why I could not figure it out on my own. You've been extremely helpful, mdma, but I went another way with things, as I described in my answer. I'd give you credit for the answer since you were so helpful, but it wouldn't be the correct answer.
HalfBrian
Thanks for the clarification. I had a hunch that you might be using the user's certificate. It's a pity String can't be subclassed. The only other option I can think of is to split the data into chunks, e.g. as separate tags in the XML file, so that the strings don't become so large. But that may not be possible if you have a rigid XML schema to stick to.
mdma
A: 

Thanks to mdma, I realized that I would truly need a way to stream this, rather than storing it in memory.

Here's what I did: the applet now encrypts the data as before, but signs it using PKCS7 (using BouncyCastle's CMSSignedDataStreamGenerator). The data is streamed across the network without storing it on the client's computer.

The detached PKCS7 signature produced by that step is then put inside the XML. That XML is then signed with XML DSig and uploaded to the server separately.

HalfBrian