Hi, I need to convert a PDF to a byte array and back again.

Can anyone help me?

This is how I am converting it to a byte array:

public static byte[] convertDocToByteArray(String sourcePath) {

      byte[] byteArray=null;
            try {
                  InputStream inputStream = new FileInputStream(sourcePath);


                  String inputStreamToString = inputStream.toString();
                  byteArray = inputStreamToString.getBytes();

                  inputStream.close();
            } catch (FileNotFoundException e) {
                 System.out.println("File Not found"+e);
            } catch (IOException e) {
             System.out.println("IO Ex"+e);
            }
            return byteArray;
      }

If I use the following code to convert it back to a document, the PDF file gets created, but opening it fails with 'Bad Format. Not a pdf'.

public static void convertByteArrayToDoc(byte[] b) {


  OutputStream out;
  try {  
   out = new FileOutputStream("D:/ABC_XYZ/1.pdf");
  out.close();
  System.out.println("write success");
  }catch (Exception e) {
   System.out.println(e);
  }
}
A: 

PDFs may contain binary data, and chances are it's getting mangled when you call toString(). It seems to me that you want this:

        FileInputStream inputStream = new FileInputStream(sourcePath);

        int numberBytes = inputStream.available();
        byte bytearray[] = new byte[numberBytes];

        inputStream.read(bytearray);
plinth
That's a horrible way of reading data - please don't assume that available() will contain all of the data in a stream.
Jon Skeet
@Jon - seconded. available() will (usually) return the number of bytes that can be read immediately without blocking. It has little to do with how much data is actually in the file.
Eric Petroelje
+6  A: 

The problem is that you are calling toString() on the InputStream object itself. This returns a String representation of the InputStream object, not the contents of the PDF document.

You want to read the PDF as bytes only, because PDF is a binary format. You will then be able to write out that same byte array and it will be a valid PDF, as it has not been modified.

For example, to read a file as bytes:

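File file = new File(sourcePath);
InputStream inputStream = new FileInputStream(file);
// file.length() returns a long, so it must be cast to size the array
byte[] bytes = new byte[(int) file.length()];
// note: a single read() call is not guaranteed to fill the whole array
inputStream.read(bytes);
inputStream.close();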
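
On Java 7 or later, the same read can be sketched without sizing the array by hand. This is only a minimal illustration, assuming the file fits in memory; the class name PdfFileReader and the method name readPdf are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PdfFileReader {
    // Reads the entire file into a byte array without any text decoding.
    // Assumes the file is small enough to hold in memory.
    public static byte[] readPdf(String sourcePath) throws IOException {
        return Files.readAllBytes(Paths.get(sourcePath));
    }
}
```

Files.readAllBytes opens the file, reads until end of file, and closes it again, so there is no partially filled array to worry about.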
Mark
+1  A: 

Calling toString() on an InputStream doesn't do what you think it does. Even if it did, a PDF contains binary data, so you wouldn't want to convert it to a string first.

What you need to do is read from the stream, write the results into a ByteArrayOutputStream, then convert the ByteArrayOutputStream into an actual byte array by calling toByteArray():

InputStream inputStream = new FileInputStream(sourcePath);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();

int data;
while( (data = inputStream.read()) >= 0 ) {
    outputStream.write(data);
}

inputStream.close();
return outputStream.toByteArray();
Eric Petroelje
Reading a single byte at a time isn't terribly efficient. Better to copy a block at a time.
Jon Skeet
@Jon - true, but I was trying to keep it simple. Also, doesn't FileInputStream do buffering internally anyways that would mitigate that?
Eric Petroelje
+5  A: 

You basically need a helper method to read a stream into memory. This works pretty well:

public static byte[] readFully(InputStream stream) throws IOException
{
    byte[] buffer = new byte[8192];
    ByteArrayOutputStream baos = new ByteArrayOutputStream();

    int bytesRead;
    while ((bytesRead = stream.read(buffer)) != -1)
    {
        baos.write(buffer, 0, bytesRead);
    }
    return baos.toByteArray();
}

Then you'd call it with:

public static byte[] loadFile(String sourcePath) throws IOException
{
    InputStream inputStream = null;
    try 
    {
        inputStream = new FileInputStream(sourcePath);
        return readFully(inputStream);
    } 
    finally
    {
        if (inputStream != null)
        {
            inputStream.close();
        }
    }
}

Don't mix up text and binary data - it only leads to tears.
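
To see why, here is a small sketch (not part of the code above) that pushes a few bytes through a String and back. The class and method names are hypothetical, and UTF-8 is picked explicitly only for the demonstration; the original question used the platform default charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryRoundTrip {
    // Returns true only if decoding the bytes as UTF-8 text and encoding
    // that text again yields exactly the same bytes.
    public static boolean survivesStringRoundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four bytes above 0x7F that do not form valid UTF-8.
        // The decoder substitutes replacement characters for them, so the
        // re-encoded array no longer matches the input.
        byte[] binary = {0x25, 0x50, 0x44, 0x46,
                         (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        System.out.println(survivesStringRoundTrip(binary));
    }
}
```

Plain ASCII content survives the trip, which is why this kind of bug often goes unnoticed until a real binary file comes along.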

Jon Skeet
+1  A: 

Aren't you creating the PDF file but never actually writing the byte array back to it? That would explain why you cannot open the PDF.

out = new FileOutputStream("D:/ABC_XYZ/1.pdf");
out.write(b, 0, b.length);
// out.Position = 0;  (not needed here; FileOutputStream has no such property)
out.close();

This is in addition to correctly reading in the PDF to byte array.
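
A self-contained Java version of that write step might look like the sketch below. The class name ByteArrayToFile and the method name writeBytes are only illustrative, and try-with-resources assumes Java 7 or later.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ByteArrayToFile {
    // Writes the byte array to the target path unchanged.
    // try-with-resources closes the stream even if write() throws.
    public static void writeBytes(byte[] data, String targetPath) throws IOException {
        try (OutputStream out = new FileOutputStream(targetPath)) {
            out.write(data);
        }
    }
}
```

Letting the IOException propagate, rather than printing it and carrying on, makes a failed write visible to the caller.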

David Liddle
out.position=0 ?? I didn't get it.
This may not be useful here, as you are saving to a file, but I ran into issues where I was putting the byte array into a MemoryStream object and downloading it to the client. I had to set the Position back to 0 for this to work.
David Liddle
A: 

Thanks a lot for the responses.

It's working perfectly. I used the code given by David and Mark.

Thanks

A: 

Hi, I tried converting a byte array into a PDF, but the data does not show up in the PDF. Opening it gives an error saying the file is not a supported file type, or is damaged, or was not correctly decoded. My code is below. The program runs fine and creates the PDF, but the content is not written. Does anyone have a clue about this?

import java.io.FileOutputStream;
import java.io.OutputStream;

public class GetPDFFromByteArray {

public static void main (String a[]){
    System.out.println("----Get PDF from ByteArray--------");
    String s = "Sample";
    byte[] c = s.getBytes();
    convertByteArrayToDoc(c);
}

public static void convertByteArrayToDoc(byte[] b) {
    System.out.println("------calling convertByteArrayToDoc------");
    OutputStream out;
    try {
            out = new FileOutputStream("C:/TestArea/1.pdf");
            out.write(b, 0, b.length);
            //out.position = 0;
            out.close();
            System.out.println("------Write Success------");
    }catch (Exception e) {
            System.out.println(e);
    }
}

}

Thanks, Vishnuvardhan.H

Vishnuvardhan