ansaurus

Question

Answer 1

A:

100 files * 500 kB is somewhere around 50 MB. If the maximum heap size is 64 MB, I'm pretty sure this code won't work under such conditions.
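The arithmetic can be checked against the running JVM with a rough sketch like the one below. The class and method names are made up for illustration, 500 kB is taken as 500 * 1024 bytes, and the "half of the heap" threshold is only an assumption to account for the merged output living in memory next to the inputs.

```java
final class HeapEstimate {
    /** Rough lower bound on the bytes needed to hold every input file in memory at once. */
    static long estimateBytes(int fileCount, long bytesPerFile) {
        return fileCount * bytesPerFile;
    }

    public static void main(String[] args) {
        long estimate = estimateBytes(100, 500L * 1024);  // 100 files of 500 kB each
        long maxHeap = Runtime.getRuntime().maxMemory();  // upper heap limit of this JVM (-Xmx)
        System.out.printf("estimated input size: %d bytes, max heap: %d bytes%n", estimate, maxHeap);
        // Assumed rule of thumb: inputs plus merged output need roughly twice the input size.
        if (estimate > maxHeap / 2) {
            System.out.println("Merging in memory is likely to exhaust the heap.");
        }
    }
}
```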

hudolejev 2010-05-21 09:28:46

Answer 2

+3 A:

This code merges all the PDFs into a byte array in memory (on the heap), so yes, memory usage will grow linearly with the number of files merged.

I don't know about the freeReader method, but maybe you could try writing the merged PDF to a temporary file instead of a byte array? mergedPdfStream would then be a FileOutputStream instead of a ByteArrayOutputStream, and you would return e.g. a File reference to the client code.
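A minimal sketch of that idea, using only the JDK. MergeStep is a hypothetical stand-in for the existing merge code (it is not an iText type), and the "merged-" prefix is arbitrary; the point is only that the merge writes to a disk-backed stream rather than to a ByteArrayOutputStream.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class TempFileMerge {
    /** Hypothetical stand-in for the existing merge code: writes the merged PDF to any stream. */
    interface MergeStep {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Runs the merge against a temporary file instead of an in-memory byte array. */
    static Path mergeToTempFile(MergeStep merge) throws IOException {
        Path target = Files.createTempFile("merged-", ".pdf");
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            merge.writeTo(out); // bytes go to disk as they are produced
        } catch (IOException | RuntimeException e) {
            Files.deleteIfExists(target); // do not leave a partial file behind
            throw e;
        }
        return target; // the calling code decides what to do with the file
    }
}
```

The calling code gets a Path back (a File works just as well) and can store the result or stream it to whoever requested it.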

Or you could increase the amount of memory Java can use (the -Xmx JVM parameter), but if the number of files to merge keeps growing, you will run into the same problem again.

Pierre Henry 2010-05-21 09:34:48

Thanks, Pierre Henry. If I write to a FileOutputStream and return the file to the client, don't I risk exposing the file directly to the client? How do I get around this? Do I read the file again and write it to the stream? Any other suggestions?

Vijay 2010-05-21 10:43:28

@Vijay - but then, why not write to the client right away? Only this way will you be able to serve PDF files that are larger than the memory allocated to your JVM.

Ingo 2010-05-21 13:11:10

@Ingo: if the file is only destined to be read by the client, yes, but maybe it also needs to be stored. @Vijay: in my comment I meant "client code": the code that calls the merge, not the actual client of the application. So my idea was: do the merge and save it to a file. Then you have the file and can do whatever is needed with it: leave it on the server to be stored, or read it with a new input stream and write it to the response output stream to send the file to the client.
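The last step could look roughly like this. FileSender and send are made-up names; in a servlet, clientOut would typically be the stream returned by response.getOutputStream(). Files.copy moves the data in chunks, so the whole file is never held in memory.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class FileSender {
    /** Streams the merged file to the given output stream and returns the number of bytes sent. */
    static long send(Path mergedFile, OutputStream clientOut) throws IOException {
        long copied = Files.copy(mergedFile, clientOut); // chunked copy, no full in-memory buffer
        clientOut.flush();
        return copied;
    }
}
```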

Pierre Henry 2010-05-25 07:15:52

Answer 3

A:

First, why do you clutter your code with all that Iterator<> boilerplate? Have you ever heard of the for-each statement? I.e.:

for (PdfReader pdfReader : readers) {
    // code for each single PDF reader in readers
}

Second: consider closing each pdfReader as soon as you are done with it. This will hopefully flush some buffers and free the memory occupied by the original PDF.

Ingo 2010-05-21 09:43:05

Answer 4

+1 A:

This is not the proper way to do file operations. You are merging the files in memory using an ArrayList and an array. You should rather use file I/O with buffering.

Do you want to show the final merged file at the end? Then you can open the file after all the merging is done.

Do not use only in-memory buffering as you have shown. Use file I/O with a buffer (a byte[], I mean).
Close each file after you have read and appended it.

Java is limited to the memory you allocated at startup, so merging a large number of files at once like this will crash the application. You should run this merge in a separate thread using a thread pool, so that your application does not get stuck on it.
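As a rough sketch of the thread-pool part (MergeExecutor is a made-up name and the pool size of 2 is an arbitrary assumption): each submitted job is one complete, sequential merge, and the fixed pool caps how many merges run at the same time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class MergeExecutor {
    // Fixed pool: at most two merges run concurrently (the size is an assumption, tune as needed).
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    /** Queues one complete merge job; the job itself still merges its files sequentially. */
    <T> Future<T> submit(Callable<T> mergeJob) {
        return pool.submit(mergeJob);
    }

    /** Stops accepting new jobs; jobs already queued still run to completion. */
    void shutdown() {
        pool.shutdown();
    }
}
```

Note that a separate thread only keeps the caller responsive. It does not reduce the memory a single merge needs, although limiting the pool size does bound how many merges compete for the heap at once.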

Thanks.

Paarth 2010-05-21 10:03:24

Dear Paarth, yes, you are correct. I would like to write the final merged file to the client. Do you want me to write to a file with a buffer rather than in memory? If I open the file at the end, after merging, to write it to the stream, can you please explain a little more? And as this module runs in a web environment, how do I implement this in a thread?

Vijay 2010-05-21 11:37:45

Hey, I meant: make a new file and append all your files to it one by one. You cannot do it concurrently (meaning one thread for each PDF file you want to merge) because this is a sequential operation. What I meant is to make a new thread for the merging functionality: the whole thing in one new thread. Please check the following link, which shows merging using iText: http://sanjaal.com/java/2010/02/04/merging-two-or-more-pdfs-using-lowagie-itext-api/

Paarth 2010-05-21 11:44:24

FYI: the "List<InputStream> pdfStreams" contains FileInputStream objects, each element being one of the files to be merged.

Vijay 2010-05-21 11:44:30

OK Paarth, I'll rewrite this to perform the merge in a disk file with a buffer rather than in memory. But can you please explain how I open the merged file, read it, and write it to the servlet stream? Buffering can be useful, right?

Vijay 2010-05-21 11:54:48

I have one more doubt: the idea is to create a file each time and delete it once it has been written to the client. What about multiple requests? Could different requests clash over the file?
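One way to avoid such clashes, sketched with the JDK only: Files.createTempFile generates a fresh, uniquely named file on every call, so each request gets its own file. PerRequestTempFile and the "merge-request-" prefix are made-up names for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class PerRequestTempFile {
    /** Creates a new, uniquely named empty file in the default temp directory. */
    static Path create() throws IOException {
        return Files.createTempFile("merge-request-", ".pdf");
    }

    /** Removes the file once the response has been written; does nothing if it is already gone. */
    static void discard(Path file) throws IOException {
        Files.deleteIfExists(file);
    }
}
```

Calling discard in a finally block around the code that writes the response keeps temp files from piling up when a request fails halfway.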

Vijay 2010-05-21 12:04:56

Hey, I hope the List you are maintaining for the streams does not live for long. Otherwise it can degrade performance, right? OK, do you want to send the file to the browser using the servlet OutputStream? You can do that by writing to the stream directly (try it; hopefully it works for big files. Set the proper content type). Or you can make an applet for file downloading, meaning the applet loads on the client side, downloads the files there, and then merges them :). That applet should create the final file wherever the client wants it. Got my point? Thanks.

Paarth 2010-05-21 12:09:24

Wow, that's a different thought! I'll try the second option you mentioned: download an applet and merge on the client side. Thanks a lot for supporting me.

Vijay 2010-05-21 19:48:28

Thanks mate, that's what we are all here for... :) to help each other and learn new things. Thanks.

Paarth 2010-05-22 06:51:41

Paarth, I tried the site you referred to for the PDF merge and got an idea: while the PDFs are being merged, I write the result immediately to the ServletOutputStream. That is, in writer = new PdfCopy(document, outputStream); I pass the ServletOutputStream instance as the outputStream object. Once the merge is done, I'll flush the stream and close it. Each time a PDF is merged, it is written directly to the stream rather than stored. How about this?

Vijay 2010-05-22 09:26:09

OutOfMemoryError during the pdf merge
