We're thinking of using Protocol Buffers for binary logging because:

  • It's how we're encoding our objects anyway
  • It is relatively compact, fast to read / write etc.

That said, it isn't obvious how we should go about it, because the APIs tend to focus on creating whole objects. In messaging terms you would wrap a list of DataLogEntry as a repeated field in a DataLogFile, but what we really want is to write a single DataLogEntry by appending it to the end of a file, and later read the entries back out one at a time.

The first issue we've hit is that doing this (in a test):

        FileInputStream fileIn = new FileInputStream(logFile);
        CodedInputStream in = CodedInputStream.newInstance(fileIn);
        while(!in.isAtEnd()) {
            DataLogEntry entry = DataLogEntry.parseFrom(in);
            // ... do stuff
        }

Only results in one DataLogEntry being read from the stream. Without the isAtEnd check, the loop never stops.

Thoughts?

Edit: I've switched to using entry.writeDelimitedTo and DataLogEntry.parseDelimitedFrom and that seems to work...
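
For anyone wondering why the delimited variants behave differently: as far as I can tell, writeDelimitedTo writes the message size as a base-128 varint and then the message bytes, and parseDelimitedFrom reads that size first so it knows where the message ends. The sketch below reproduces that framing with plain byte arrays instead of generated message classes, so it runs without the protobuf library. DelimitedFraming and its method names are made up for illustration and are not part of the protobuf API.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: varint-length-prefixed framing for opaque records. */
final class DelimitedFraming {

    /** Appends one record: base-128 varint length, then the payload bytes. */
    static void writeDelimited(OutputStream out, byte[] payload) throws IOException {
        int length = payload.length;
        while ((length & ~0x7F) != 0) {
            out.write((length & 0x7F) | 0x80); // low 7 bits, continuation bit set
            length >>>= 7;
        }
        out.write(length); // final byte, continuation bit clear
        out.write(payload);
    }

    /** Reads the next record, or returns null on a clean end of stream. */
    static byte[] readDelimited(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null; // end of stream exactly on a record boundary
        }
        int length = b & 0x7F;
        int shift = 7;
        while ((b & 0x80) != 0) {
            if (shift > 28) {
                throw new IOException("length prefix longer than 5 bytes");
            }
            b = in.read();
            if (b == -1) {
                throw new EOFException("stream ended inside a length prefix");
            }
            length |= (b & 0x7F) << shift;
            shift += 7;
        }
        if (length < 0) {
            throw new IOException("negative record length");
        }
        byte[] payload = new byte[length];
        int offset = 0;
        while (offset < length) {
            // read() may return fewer bytes than requested, so loop until full
            int n = in.read(payload, offset, length - offset);
            if (n == -1) {
                throw new EOFException("stream ended inside a record");
            }
            offset += n;
        }
        return payload;
    }

    /** Reads every record until the stream ends. */
    static List<byte[]> readAll(InputStream in) throws IOException {
        List<byte[]> records = new ArrayList<>();
        byte[] record;
        while ((record = readDelimited(in)) != null) {
            records.add(record);
        }
        return records;
    }
}
```

With real messages the payload would be entry.toByteArray() on the way in and DataLogEntry.parseFrom(payload) on the way out, though in practice the built-in delimited methods already do all of this.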

+2  A: 

As I understand it, Protocol Buffers messages are not self-delimiting, so a plain parseFrom cannot tell where one message ends and the next begins when several are written to the same stream. You will probably need to track the message boundaries yourself. You can do this by storing the size of each message before it in the log.

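import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

import com.google.protobuf.CodedInputStream;
import com.google.protobuf.CodedOutputStream;

public class DataLog {

    // Writes a 4-byte big-endian length prefix followed by the serialized entry.
    public void write(final DataOutputStream out, final DataLogEntry entry) throws IOException {
        out.writeInt(entry.getSerializedSize());
        CodedOutputStream codedOut = CodedOutputStream.newInstance(out);
        entry.writeTo(codedOut);
        // CodedOutputStream buffers internally; without a flush the bytes may never reach 'out'.
        codedOut.flush();
    }

    // Reads entries until the stream ends.
    public void read(final DataInputStream in) throws IOException {
        byte[] buffer = new byte[4096]; // reused for entries that fit
        while (true) {
            try {
                int size = in.readInt();
                CodedInputStream codedIn;
                if (size <= buffer.length) {
                    // readFully blocks until 'size' bytes have arrived; read() may return fewer.
                    in.readFully(buffer, 0, size);
                    codedIn = CodedInputStream.newInstance(buffer, 0, size);
                } else {
                    byte[] tmp = new byte[size];
                    in.readFully(tmp);
                    codedIn = CodedInputStream.newInstance(tmp);
                }
                DataLogEntry entry = DataLogEntry.parseFrom(codedIn);
                // ... do stuff with entry
            }
            catch (final EOFException e) {
                // End of file. Note that a record cut off part-way also ends up here.
                break;
            }
        }
    }
}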

NB: I've used an EOFException to detect the end of the file; you may wish to use a delimiter or track the number of bytes read manually.
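
One way to avoid leaning on EOFException alone is to tell a clean end of file (the stream ends exactly on a record boundary) apart from a record that was cut off, for example by a crash in the middle of an append. The following is only a sketch of that idea for the same 4-byte length prefix, using plain byte arrays rather than DataLogEntry so it has no protobuf dependency. The class name, the Result type and the MAX_RECORD_SIZE limit are all assumptions made for the example.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for records framed as: 4-byte big-endian size, then payload. */
final class LengthPrefixedLogReader {

    /** Arbitrary sanity limit (an assumption) to reject garbage length prefixes. */
    static final int MAX_RECORD_SIZE = 16 * 1024 * 1024;

    /** Complete records read, plus whether the file ended part-way through a record. */
    static final class Result {
        final List<byte[]> records = new ArrayList<>();
        boolean truncatedTail;
    }

    static Result readAll(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        Result result = new Result();
        while (true) {
            int first = in.read();
            if (first == -1) {
                return result; // clean EOF: nothing left after the last record
            }
            try {
                // Rebuild the int that DataOutputStream.writeInt wrote (high byte first).
                int size = (first << 24)
                        | (in.readUnsignedByte() << 16)
                        | (in.readUnsignedByte() << 8)
                        | in.readUnsignedByte();
                if (size < 0 || size > MAX_RECORD_SIZE) {
                    throw new IOException("implausible record size: " + size);
                }
                byte[] payload = new byte[size];
                in.readFully(payload); // throws EOFException if the record is cut short
                result.records.add(payload);
            } catch (EOFException e) {
                // The stream ended inside a length prefix or a payload.
                result.truncatedTail = true;
                return result;
            }
        }
    }
}
```

Each payload could then be handed to DataLogEntry.parseFrom(payload), and a set truncatedTail flag tells you that the last append never completed.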

Michael Barker
Awesome, thanks. Just what I needed to know. I wonder how fragile that is... I guess bad things would happen if I lost the beginning of the file...
jamie mccrindle
Hmmm... getting this error when trying to read out: Protocol message contained an invalid tag (zero).
jamie mccrindle
The CodedOutputStream needs to be flushed.
Michael Barker