ansaurus

Question

Answer 1

A:

PDFs may contain binary data, and chances are it's getting mangled when you call toString(). It seems to me that you want this:

        FileInputStream inputStream = new FileInputStream(sourcePath);

        int numberBytes = inputStream.available();
        byte[] bytearray = new byte[numberBytes];

        inputStream.read(bytearray);

plinth 2009-07-15 12:35:49

That's a horrible way of reading data - please don't assume that available() will contain all of the data in a stream.

Jon Skeet 2009-07-15 12:39:43

@Jon - seconded. available() will (usually) return the number of bytes that can be read immediately without blocking. It has little to do with how much data is actually in the file.

Eric Petroelje 2009-07-15 12:42:34
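
To make the point in these comments concrete, here is a minimal sketch of reading a whole file without relying on available(). It assumes Java 7 or later (newer than this thread), where java.nio.file.Files.readAllBytes exists; the class and method names ReadWholeFile and readPdf are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper; the names are for illustration only.
final class ReadWholeFile {

    // Reads the entire file into memory. The library keeps reading until
    // end of file, so the result does not depend on available().
    static byte[] readPdf(String sourcePath) throws IOException {
        return Files.readAllBytes(Paths.get(sourcePath));
    }
}
```

This suits files that comfortably fit in memory, which is usually the case for a single PDF.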

Answer 2

+6 A:

The problem is that you are calling toString() on the InputStream object itself. This will return a String representation of the InputStream object, not the contents of the PDF document.

You want to read the PDF only as bytes, since PDF is a binary format. You will then be able to write out that same byte array and it will be a valid PDF, as it has not been modified.

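For example, to read a file as bytes:

File file = new File(sourcePath);
InputStream inputStream = new FileInputStream(file);
// File.length() returns a long, so it must be cast to int to size the array.
byte[] bytes = new byte[(int) file.length()];
// Note: a single read() call is not guaranteed to fill the whole array.
inputStream.read(bytes);
inputStream.close();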

Mark 2009-07-15 12:36:08
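
To see why toString() cannot return the document, here is a small sketch. It uses an in-memory ByteArrayInputStream as a stand-in for the file stream (an assumption made so the demo needs no file on disk); the class name ToStringDemo and the method describe are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Hypothetical demo class; not part of the original question's code.
final class ToStringDemo {

    // Returns what toString() yields for a stream. The stream classes used
    // here inherit Object.toString(), which describes the object itself
    // (class name plus hash code) and never touches the stream's bytes.
    static String describe(InputStream in) {
        return in.toString();
    }

    public static void main(String[] args) {
        // In-memory stand-in for a PDF stream (an assumption for this demo).
        InputStream in = new ByteArrayInputStream(new byte[] {'%', 'P', 'D', 'F'});
        // Prints something like java.io.ByteArrayInputStream@1b6d3586;
        // the part after '@' varies from run to run.
        System.out.println(describe(in));
    }
}
```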

Answer 3

+1 A:

Calling toString() on an InputStream doesn't do what you think it does. Even if it did, a PDF contains binary data, so you wouldn't want to convert it to a string first.

What you need to do is read from the stream, write the results into a ByteArrayOutputStream, then convert the ByteArrayOutputStream into an actual byte array by calling toByteArray():

InputStream inputStream = new FileInputStream(sourcePath);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();

int data;
while( (data = inputStream.read()) >= 0 ) {
    outputStream.write(data);
}

inputStream.close();
return outputStream.toByteArray();

Eric Petroelje 2009-07-15 12:40:58

Reading a single byte at a time isn't terribly efficient. Better to copy a block at a time.

Jon Skeet 2009-07-15 12:44:34

@Jon - true, but I was trying to keep it simple. Also, doesn't FileInputStream do buffering internally anyway, which would mitigate that?

Eric Petroelje 2009-07-15 12:45:48
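
On the buffering question: to my knowledge FileInputStream does not buffer on its own, so each single-byte read() goes to the underlying file. A minimal sketch of one way to keep the simple byte-at-a-time loop while reducing that cost is to wrap the stream in a BufferedInputStream. The class and method names BufferedRead and readBuffered are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; the names are for illustration only.
final class BufferedRead {

    // Same byte-at-a-time loop as in the answer, but the BufferedInputStream
    // fetches data from the file in larger blocks behind the scenes.
    static byte[] readBuffered(String sourcePath) throws IOException {
        InputStream in = new BufferedInputStream(new FileInputStream(sourcePath));
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int data;
            while ((data = in.read()) >= 0) {
                out.write(data);
            }
            return out.toByteArray();
        } finally {
            // Closing the wrapper also closes the underlying FileInputStream.
            in.close();
        }
    }
}
```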

Answer 4

+5 A:

You basically need a helper method to read a stream into memory. This works pretty well:

public static byte[] readFully(InputStream stream) throws IOException
{
    byte[] buffer = new byte[8192];
    ByteArrayOutputStream baos = new ByteArrayOutputStream();

    int bytesRead;
    while ((bytesRead = stream.read(buffer)) != -1)
    {
        baos.write(buffer, 0, bytesRead);
    }
    return baos.toByteArray();
}

Then you'd call it with:

public static byte[] loadFile(String sourcePath) throws IOException
{
    InputStream inputStream = null;
    try 
    {
        inputStream = new FileInputStream(sourcePath);
        return readFully(inputStream);
    } 
    finally
    {
        if (inputStream != null)
        {
            inputStream.close();
        }
    }
}

Don't mix up text and binary data - it only leads to tears.

Jon Skeet 2009-07-15 12:42:07
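
For readers on Java 7 or later (an assumption; the thread predates it), the same approach can be sketched with try-with-resources, which closes the stream automatically and replaces the explicit finally block. The enclosing class name LoadFileSketch is hypothetical; readFully and loadFile mirror the methods in the answer.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper class so the sketch compiles on its own.
final class LoadFileSketch {

    // Copies the stream into memory one block at a time.
    static byte[] readFully(InputStream stream) throws IOException {
        byte[] buffer = new byte[8192];
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        int bytesRead;
        while ((bytesRead = stream.read(buffer)) != -1) {
            // Only the first bytesRead bytes of the buffer are valid.
            baos.write(buffer, 0, bytesRead);
        }
        return baos.toByteArray();
    }

    // try-with-resources closes the stream even if readFully throws.
    static byte[] loadFile(String sourcePath) throws IOException {
        try (InputStream inputStream = new FileInputStream(sourcePath)) {
            return readFully(inputStream);
        }
    }
}
```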

Answer 5

+1 A:

Aren't you creating the PDF file but not actually writing the byte array back? If so, that is why you cannot open the PDF.

out = new FileOutputStream("D:/ABC_XYZ/1.pdf");
out.Write(b, 0, b.Length);
out.Position = 0;
out.Close();

This is in addition to correctly reading in the PDF to byte array.

David Liddle 2009-07-15 12:45:37

out.position=0 ?? I didn't get it

2009-07-15 12:50:28

This may not have been useful since you are saving to a file, but I ran into issues where I was putting the byte array into a MemoryStream object and downloading it to the client. I had to set the Position back to 0 for this to work.

David Liddle 2009-07-15 13:12:36
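
The member names in the snippet above (Write, Length, Position, Close) appear to be .NET-style rather than Java. A minimal Java sketch of the same write step follows, assuming Java 7 or later for try-with-resources; the names WritePdfBytes and writeBytes are hypothetical. Java's FileOutputStream has no Position property, so that line has no counterpart here.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper; the names are for illustration only.
final class WritePdfBytes {

    // Writes the byte array to targetPath unchanged, replacing any existing file.
    static void writeBytes(byte[] b, String targetPath) throws IOException {
        try (OutputStream out = new FileOutputStream(targetPath)) {
            out.write(b, 0, b.length);
        }
    }
}
```

If the bytes were read correctly in the first place, the file written this way is byte-for-byte identical to the source.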

Answer 6

A:

Thanks a lot for the responses.

It's working perfectly. I used the code given by David and Mark.

Thanks

2009-07-15 13:05:39

Answer 7

A:

Hi, I tried converting a byte array into a PDF, but the data does not show up in the PDF. I get an error saying the file is not a supported file type, or is damaged or was not correctly decoded. My code is below. The program runs fine and creates the PDF, but does not write the content. Does anyone have a clue about this?

import java.io.FileOutputStream;
import java.io.OutputStream;

public class GetPDFFromByteArray {

    public static void main (String a[]){
        System.out.println("----Get PDF from ByteArray--------");
        String s = "Sample";
        byte[] c = s.getBytes();
        convertByteArrayToDoc(c);
    }

    public static void convertByteArrayToDoc(byte[] b) {
        System.out.println("------calling convertByteArrayToDoc------");
        OutputStream out;
        try {
            out = new FileOutputStream("C:/TestArea/1.pdf");
            out.write(b, 0, b.length);
            //out.position = 0;
            out.close();
            System.out.println("------Write Success------");
        } catch (Exception e) {
            System.out.println(e);
        }
    }

}

Thanks, Vishnuvardhan.H

Vishnuvardhan 2010-02-23 19:20:12
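
A likely explanation for the error above: the program writes the bytes of the plain text "Sample" to a file named 1.pdf, and a .pdf extension alone does not make the content a PDF document. A PDF file starts with the marker %PDF-, so a quick sanity check on a byte array can be sketched as below. The names PdfHeaderCheck and looksLikePdf are hypothetical, and the check only inspects the header; it does not validate the rest of the file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the names are for illustration only.
final class PdfHeaderCheck {

    // Returns true if the data begins with the "%PDF-" marker.
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this check, the bytes of "Sample" are rejected, while bytes read unchanged from a real PDF file pass. Producing a real PDF from text needs a PDF-generating library, not just a byte copy.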

PDF to byte array and vice versa