views:

81

answers:

4

This is a newbie question, I know. Can you guys help?

I'm talking about big files, of course, above 100MB. I'm imagining some kind of loop, but I don't know what to use. Chunked stream?

One thing is certain: I don't want something like this (pseudocode):

File file = new File(existing_file_path);
byte[] theWholeFile = new byte[(int) file.length()]; // this allocates the whole file in memory
new DataInputStream(new FileInputStream(file)).readFully(theWholeFile);

OutputStream out = new FileOutputStream(new_file_path);
out.write(theWholeFile);

To be more specific, I have to rewrite an applet that downloads a base64-encoded file and decodes it back to the "normal" file. Because it works with byte arrays, it holds twice the file size in memory: one copy base64 encoded and one copy decoded. My question is not about base64; it's about saving memory.

Can you point me in the right direction? Thanks!

+2  A: 

Perhaps a FileInputStream on the file, reading off fixed length chunks, doing your transformation and writing them to a FileOutputStream?
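For example, something along these lines (the file names and the transform step are placeholders for whatever your applet actually does):

// requires: import java.io.*;
FileInputStream in = new FileInputStream("input.dat");
FileOutputStream out = new FileOutputStream("output.dat");

try {
    byte[] chunk = new byte[8192]; // fixed-size chunk: the only buffer ever held in memory
    int n;
    while ((n = in.read(chunk)) != -1) { // read() returns -1 at end of file
        // ... transform chunk[0..n) here ...
        out.write(chunk, 0, n);
    }
} finally {
    in.close();
    out.close();
}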

Brabster
A: 

Perhaps a BufferedReader? Javadoc: http://download-llnw.oracle.com/javase/1.4.2/docs/api/java/io/BufferedReader.html
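Since base64 is plain text, a BufferedReader could read the encoded file line by line and decode each line as it is read. A rough sketch, assuming MIME-style wrapping where every line is a whole number of 4-character groups, and assuming Java 8+ for java.util.Base64 (older JREs would need another decoder):

// requires: import java.io.*; import java.util.Base64; (Java 8+)
BufferedReader reader = new BufferedReader(new FileReader("encoded.b64"));
OutputStream out = new BufferedOutputStream(new FileOutputStream("decoded.bin"));

try {
    String line;
    while ((line = reader.readLine()) != null) {
        // each wrapped line decodes independently, so only one line is in memory at a time
        out.write(Base64.getDecoder().decode(line));
    }
} finally {
    reader.close();
    out.close();
}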

CD Sanchez
+4  A: 

From the question, it appears that you are reading the base64-encoded contents of a file into one array and decoding it into a second array before finally saving it.

This carries a considerable memory overhead, especially since the Base64-encoded copy is about a third larger than the decoded data. It can be made more efficient by:

  • Reading the contents of the file using a FileInputStream, preferably decorated with a BufferedInputStream.
  • Decoding on the fly: Base64 data can be read and decoded in groups of 4 characters, so only a small group needs to be in memory at a time.
  • Writing the output to a file using a FileOutputStream, again preferably decorated with a BufferedOutputStream. This write can be performed after each decode operation.

The buffering of read and write operations prevents frequent IO access. Use a buffer size appropriate to your application's load; buffer sizes are usually chosen as a power of two so that they do not have an "impedance mismatch" with the physical disk buffer. A sketch of the streaming approach is below.
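A minimal sketch of the above, assuming Java 8+ where java.util.Base64 can wrap an InputStream and decode groups of characters as they are read (file names and buffer size are placeholders):

// requires: import java.io.*; import java.util.Base64; (Java 8+)
InputStream encoded = new BufferedInputStream(new FileInputStream("encoded.b64"), 8192);
InputStream decoded = Base64.getMimeDecoder().wrap(encoded); // MIME decoder tolerates line breaks
OutputStream out = new BufferedOutputStream(new FileOutputStream("decoded.bin"), 8192);

try {
    byte[] buffer = new byte[8192]; // power-of-two buffer; only this much decoded data is ever held
    int n;
    while ((n = decoded.read(buffer)) != -1) {
        out.write(buffer, 0, n);
    }
} finally {
    decoded.close(); // also closes the underlying file stream
    out.close();
}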

Vineet Reynolds
A: 

Use this base64 encoder/decoder, which will wrap your file input stream and handle the decoding on the fly:

InputStream input = new Base64.InputStream(new FileInputStream("in.txt"));
OutputStream output = new FileOutputStream("out.txt");

try {
    byte[] buffer = new byte[1024];
    int bytesRead;
    // read() returns -1 at end of stream; available() is not a reliable end-of-stream check
    while ((bytesRead = input.read(buffer)) != -1) {
        output.write(buffer, 0, bytesRead);
    }
} finally {
    input.close();
    output.close();
}
James Van Huis
What library is this?
daigorocub
There is no library; it's just a public-domain class you can include in your project and modify to fit your purposes. Download it at http://iharder.sourceforge.net/current/java/base64/
James Van Huis