I would like to know how I can read a file byte by byte and then perform some operation every n bytes.

for example:

Say I have a file of size = 50 bytes and I want to divide it into blocks of n bytes each. Each block is then sent to a function that performs some operations on those bytes. The blocks should be created during the read process and sent to the function as soon as a block reaches n bytes, so that I don't use much memory storing all the blocks.

I want the output of the function to be written/appended on a new file.

This is what I have so far for the read, yet I don't know if it is right:

fc = new JFileChooser();
File f = fc.getSelectedFile();
FileInputStream in = new FileInputStream(f);
byte[] b = new byte[16];
in.read(b);

I haven't done anything yet for the write process.

+1  A: 

So far your code looks OK. For reading binary files (as opposed to text files) you should indeed use FileInputStream (for reading text files, you should use a Reader, such as FileReader).

Note that you should check the return value of in.read(b): it returns the number of bytes actually read (or -1 at the end of the file), and that can be fewer than 16, for example when fewer than 16 bytes are left at the end of the file.

Of course, you should add a loop to the program that keeps reading blocks of bytes until you reach the end of the file.

To write data to a binary file, use FileOutputStream. That class has a constructor that you can pass a flag to indicate that you want to append to an existing file:

FileOutputStream out = new FileOutputStream("output.bin", true);

Also, don't forget to call close() on the FileInputStream and FileOutputStream when you are done.
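Putting those points together, a rough sketch could look like the following. The class name AppendBlocks, the method copyInBlocks and the bit-flipping process function are made up for illustration; they stand in for whatever your real operation is. Note that this version hands over whatever each read() call returns, which may be fewer than n bytes.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class AppendBlocks {
    // Hypothetical stand-in for the real processing step: flips every bit.
    static byte[] process(byte[] data, int length) {
        byte[] result = new byte[length];
        for (int i = 0; i < length; i++) {
            result[i] = (byte) ~data[i];
        }
        return result;
    }

    // Reads inputPath in chunks of at most n bytes (n must be > 0) and
    // appends the processed chunks to outputPath.
    static void copyInBlocks(String inputPath, String outputPath, int n) throws IOException {
        byte[] b = new byte[n];
        // try-with-resources (Java 7+) closes both streams, even if an exception is thrown.
        try (InputStream in = new FileInputStream(inputPath);
             OutputStream out = new FileOutputStream(outputPath, true)) { // true = append
            int count;
            // read() returns the number of bytes stored in b, or -1 at end of file.
            while ((count = in.read(b)) != -1) {
                out.write(process(b, count));
            }
        }
    }
}
```

Because the output stream is opened in append mode, calling copyInBlocks twice with the same output file adds the result to the end rather than overwriting it.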

See the Java API documentation, especially the classes in the java.io package.

Jesper
+1  A: 

It's a start.

You should check what read() returns. It can read fewer bytes than the size of the array, and also indicate that the end of the file is reached.

Obviously, you need to read() in a loop...

It might be a good idea to reuse the array, but that requires that the part that reads the array copies what it needs, rather than just keeping a reference to the array.
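A small sketch of what that copying means in practice. The class ReuseBuffer and the method readChunks are hypothetical names, and collecting the chunks in a list is only there to make the aliasing problem visible; in your case you would process each chunk and drop it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReuseBuffer {
    static List<byte[]> readChunks(InputStream in, int n) throws IOException {
        List<byte[]> kept = new ArrayList<>();
        byte[] buffer = new byte[n]; // allocated once, overwritten by every read()
        int count;
        while ((count = in.read(buffer)) != -1) {
            // Copy only the bytes that were actually read. Storing `buffer` itself
            // would leave every list entry pointing at the same, overwritten array.
            kept.add(Arrays.copyOf(buffer, count));
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        for (byte[] chunk : readChunks(new ByteArrayInputStream(data), 2)) {
            System.out.println(Arrays.toString(chunk));
        }
    }
}
```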

KarlP
Thanks, KarlP. What I will do is reinitialize the byte array; the previous data will be processed and the result will be written to a file.
sikas
+3  A: 

You're on the right lines. Consider wrapping your FileInputStream with a BufferedInputStream, which improves I/O efficiency by reading the file in chunks.

The next step is to check the number of bytes read (returned by your call to read) and to hand-off the array to the processing function. Obviously you'll need to pass the number of bytes read to this method too in case the array was only partially populated.
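As a rough illustration of both points, here is a sketch under a few assumptions of mine: BufferedBlocks, forEachRead and the BlockHandler interface are invented names, not anything from your code or from the JDK.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferedBlocks {
    // Hypothetical callback type: only the first `length` bytes of `block` are valid.
    interface BlockHandler {
        void handle(byte[] block, int length);
    }

    // Reads the file through a BufferedInputStream and hands each chunk,
    // together with the number of bytes actually read, to the handler.
    // n must be greater than 0.
    static void forEachRead(String path, int n, BlockHandler handler) throws IOException {
        byte[] block = new byte[n];
        try (InputStream is = new BufferedInputStream(new FileInputStream(path))) {
            int count;
            while ((count = is.read(block)) != -1) {
                handler.handle(block, count);
            }
        }
    }
}
```

The handler receives the same array on every call, so it should finish with (or copy) the data before returning.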

Adamski
Can you give me an example of the BufferedInputStream?
sikas
Your code would be identical except for the initialisation: InputStream is = new BufferedInputStream(new FileInputStream(file));
Adamski
A: 

I think this is what you might need:

void readFile(String path, int n) {
    try {
        File f = new File(path);
        FileInputStream fis = new FileInputStream(f);
        try {
            int ret;
            byte[] array = new byte[n];
            // read() returns -1 at end of file, so test it before calling doSomething
            while ((ret = fis.read(array)) > -1) {
                doSomething(array, ret);
            }
        } finally {
            fis.close(); // close the stream even if reading fails
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
luizbag
This is not correct. @sikas wants the blocks to be "sent to the function when the block reaches *n* bytes". `fis.read(array)` can read less than *n* bytes and not be at EOF.
Daniel Trebbien
+1  A: 

I believe that this will work:

final int blockSize = // some calculation
byte[] block = new byte[blockSize];
InputStream is = new FileInputStream(f);
try {
    int ret = -1;
    do {
        int bytesRead = 0;
        while (bytesRead < blockSize) {
            ret = is.read(block, bytesRead, blockSize - bytesRead);
            if (ret < 0)
                break; // no more data
            bytesRead += ret;
        }

        if (bytesRead > 0) // skip the empty final pass when the file size is a multiple of blockSize
            myFunction(block, bytesRead);
    } while (0 <= ret);
}
finally {
    is.close();
}

This code will call myFunction with blockSize bytes for all but possibly the last invocation.
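If you are on Java 9 or later, InputStream.readNBytes(byte[], int, int) does the job of the inner loop: it keeps reading until the requested length is filled or the end of the stream is reached. A minimal sketch, assuming that API is available; FullBlocks and blockSizes are names I made up, and recording the block sizes stands in for the call to myFunction.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class FullBlocks {
    // Returns the size of each block that would be passed to the processing function.
    static List<Integer> blockSizes(InputStream is, int blockSize) throws IOException {
        List<Integer> sizes = new ArrayList<>();
        byte[] block = new byte[blockSize];
        int bytesRead;
        // readNBytes returns fewer than blockSize bytes only at end of stream,
        // and 0 once nothing is left.
        while ((bytesRead = is.readNBytes(block, 0, blockSize)) > 0) {
            sizes.add(bytesRead); // stand-in for myFunction(block, bytesRead)
        }
        return sizes;
    }
}
```

For a 50-byte input and a block size of 16 this yields blocks of 16, 16, 16 and 2 bytes.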

Daniel Trebbien
+1 Beat me to it. You can also label the outer loop and break out of it from within the inner loop, allowing you to limit the scope of `ret`.
Mark Peters
@MarkPeters, I think that you have to break from the innermost loop with this code. Otherwise, `myFunction` will not be called if EOF is reached before reading in a complete block.
Daniel Trebbien
@Daniel Trebbien: Ah, just different interpretations of the requirements then. I didn't think it would be desired to run the function on incomplete data; but you're right in that I missed that they weren't functionally equivalent.
Mark Peters
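For reference, the labeled-break variant discussed in these comments might look roughly like the sketch below. CompleteBlocksOnly and countCompleteBlocks are hypothetical names, and counting blocks stands in for calling the processing function. Under this interpretation a trailing incomplete block is dropped, which is the functional difference from the answer above.

```java
import java.io.IOException;
import java.io.InputStream;

class CompleteBlocksOnly {
    // Counts the complete blocks in the stream; a partial block at EOF is ignored.
    static int countCompleteBlocks(InputStream is, int blockSize) throws IOException {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        byte[] block = new byte[blockSize];
        int blocks = 0;
        outer:
        while (true) {
            int bytesRead = 0;
            while (bytesRead < blockSize) {
                // ret is now scoped to the inner loop
                int ret = is.read(block, bytesRead, blockSize - bytesRead);
                if (ret < 0) {
                    break outer; // EOF: a partially filled block is discarded
                }
                bytesRead += ret;
            }
            blocks++; // stand-in for myFunction(block, blockSize)
        }
        return blocks;
    }
}
```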