views:

235

answers:

3

Hi,

I'm using a servlet to do a multi fileupload (using apache commons fileupload to help). A portion of my code is posted below. My problem is that if I upload several files at once, the memory consumption of the app server jumps rather drastically. This is probably OK if it were only until the file upload is finished, but the app server seems to hang on to the memory and never return it to the OS. I'm worried that when I put this into production I'll end up getting an out of memory exception on the server. Any ideas on why this is happening? I'm thinking the server may have started a session and will give the memory back after it expires, but I'm not 100% positive.

if(ServletFileUpload.isMultipartContent(request)) {
   ServletFileUpload upload = new ServletFileUpload();
   FileItemIterator iter = upload.getItemIterator(request);
   while(iter.hasNext()) {
    FileItemStream license = iter.next();
    if(license.getFieldName().equals("upload_button") || license.getName().equals(""))
     continue;
    //DataInputStream stream = new DataInputStream(license.openStream());
    InputStream stream = license.openStream();
    ArrayList<Integer> byteArray = new ArrayList<Integer>();
    int tempByte;
    do {
     tempByte = stream.read();
     byteArray.add(tempByte);
    }while(tempByte != -1);
    stream.close();
    byteArray.remove(byteArray.size()-1);
    byte[] bytes = new byte[byteArray.size()];
    int i = 0;
    for(Integer tByte : byteArray) {
     bytes[i++] = tByte.byteValue();
    }

Thanks in advanced!!

A: 

The correct way to handle streams in Java (at least until Java 7) is:

InputStream is;
try {
   is = ...
} catch (IOEXception ex) {
   // report exception - print, or throw a wrapper
} finally {
   try {
      is.close();
   } catch (IOException ex) {}
}

(possibly logging the exception in the second catch as well)

If you don't close your streams, the memory won't be freed by the garbage collector.

Bozho
+1  A: 

When constructing a ServletFileUpload, you should pass in a FileItemFactory object (specifically, a DiskFileItemFactory) that you configure yourself, rather than relying on the defaults. The defaults may not be suitable for your requirements, especially in high-volume production environments.

skaffman
He's already using the ["streaming mode"](http://commons.apache.org/fileupload/streaming.html). He however doesn't take benefit of it by already allocating too much memory himself.
BalusC
A: 

Here

ArrayList<Integer> byteArray = new ArrayList<Integer>();
int tempByte;
do {
 tempByte = stream.read();
 byteArray.add(tempByte);

you are writing every byte straight into memory in an array of integers! Every integer consumes 4 bytes of memory while you just need one byte of it for every read byte. Effectively you should be using ArrayList<Byte> or better byte[] instead since every byte costs only one byte of memory, but that would per saldo still allocate as much memory as the file large is.

And here

byte[] bytes = new byte[byteArray.size()];

you're allocating afterwards as much memory as the file large is. Per saldo you're with both ArrayList<Integer> and byte[] allocating 5 times as much memory as the file large is.

It's a waste.

You should write it to an OutputStream immediately, e.g. FileOutputStream.

InputStream input = null;
OutputStream output = null;
try {
    input = license.openStream();
    output = new FileOutputStream("/file.ext");
    byte[] buffer = new byte[1024];
    for (int length; (length = input.read(buffer)) > 0;) {
        output.write(buffer, 0, length);
    }
} finally {
    if (output != null) try { output.close(); } catch (IOException logOrIgnore) {}
    if (input != null) try { input.close(); } catch (IOException logOrIgnore) {}
}

This costs effectively only 1KB of memory for the buffer instead of the whole file length of bytes (or 4 times of it when using integers).

Or if you really want to have it in a byte[] then just skip the whole ArrayList<Integer> step. It makes no sense. Use an ByteArrayOutputStream as OutputStream.

InputStream input = null;
ByteArrayOutputStream output = null;
try {
    input = license.openStream();
    output = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    for (int length; (length = input.read(buffer)) > 0;) {
        output.write(buffer, 0, length);
    }
} finally {
    if (output != null) try { output.close(); } catch (IOException logOrIgnore) {}
    if (input != null) try { input.close(); } catch (IOException logOrIgnore) {}
}

byte[] bytes = output.toByteArray();

This however still costs as much memory as the file large is, it's only now not 5 times of file size anymore as you initially did with ArrayList<Integer> and a byte[] afterwards.


Update: as per your comment you'd like to store this in the database. You can also do this without storing the whole file in Java's memory. Just write the obtained InputStream immediately to the DB using PreparedStatement#setBinaryStream().

final String SQL = "INSERT INTO file (filename, contentType, content) VALUES (?, ?, ?)";
String filename = FilenameUtils.getName(license.getName());
InputStream input = license.openStream();

Connection connection = null;
PreparedStatement statement = null;
try {
    connection = database.getConnection();
    statement = connection.prepareStatement(SQL);
    statement.setString(1, filename);
    statement.setString(2, getServletContext().getMimeType(filename));
    statement.setBinaryStream(3, input);
    statement.executeUpdate();
} catch (SQLException e) {
    throw new ServletException("Saving file in DB failed", e);
} finally {
    if (statement != null) try { statement.close(); } catch (SQLException logOrIgnore) {}
    if (connection != null) try { connection .close(); } catch (SQLException logOrIgnore) {}
}
BalusC
Thanks for this example!The reason why I put everything in an integer array is because the read() method returns an int. Looking at the JavaDoc it looks like it returns an int between 0 and 255, so I suppose I could safely cast to a byte. I think you're last example is what I need, since I need the file as a whole to store it in the database as a blog.Thank You!
Scott
In that case, using `PreparedStatement#setBinaryStream()` is most memory efficient.
BalusC
This makes sense, however I'm using Hibernate for persistence. Is there a way to store it through hibernate without holding it all in memory?
Scott
Use [`java.sql.Blob`](http://java.sun.com/javase/6/docs/api/java/sql/Blob.html) as data type and [`Hibernate#createBlob()`](http://docs.jboss.org/hibernate/stable/core/api/org/hibernate/Hibernate.html#createBlob%28java.io.InputStream,%20long,%20org.hibernate.Session%29) to set it and [`Blob#getInputStream()`](http://java.sun.com/javase/6/docs/api/java/sql/Blob.html#getBinaryStream%28%29) to get it. Google gives lot of code examples on "hibernate.createblob".
BalusC