What's the best way to record the size of certain objects as they are being serialized? For example, once objects of type A, B, C are serialized, record the size of their serialized bytes. We can get the size of the entire object graph via getBytes, but we'd like to break it down to see which objects contribute most to the overall serialized size.

ObjectOutputStream offers writeObjectOverride, but we don't want to rewrite the serialization process. In simplified terms, we need to know when we encounter a certain object prior to serialization, record the current total byte count, and then, after the object is serialized, take the difference of the byte counts. It seems like wrapping writeSerialData would work, but the method is private.

Ideas?

Thanks.

--- UPDATE ---

The answers/suggestions below are insightful. Below is what I have so far. Let me know your thoughts. Thanks.

// extend ObjectOutputStream to keep a handle on the underlying output stream
public class MyObjectOutputStream extends ObjectOutputStream {
    private final OutputStream out;

    public MyObjectOutputStream(OutputStream out) throws IOException {
        super(out);
        this.out = out;
    }

    public OutputStream getOut() {
        return this.out;
    }
}


// counter
public static class CounterOutputStream extends FilterOutputStream {
    private int bytesWritten = 0;
    ...    
    public int getBytesWritten() {
        return this.bytesWritten;
    }

    public void resetCounter() {
        bytesWritten = 0;
    }

    private void update(int len) {
        bytesWritten += len;
    }
}

// go serialize    
ByteArrayOutputStream out = new ByteArrayOutputStream();
ObjectOutputStream oos = new MyObjectOutputStream(new CounterOutputStream(out, 1024));


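// record serialized size of this class; do this for every class of interest
// (the class must be Serializable for its private writeObject to be invoked)
public class MyInterestingObject implements Serializable {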
...
  private void writeObject(ObjectOutputStream out) throws IOException {
      CounterOutputStream counter = null;
      if (out instanceof MyObjectOutputStream) {
          counter = (CounterOutputStream)((MyObjectOutputStream)out).getOut();
          counter.resetCounter();
      }

      // continue w/ standard serialization of this object
      out.defaultWriteObject();

      if (counter != null) {
          logger.info(this.getClass() + " bytes written: " + counter.getBytesWritten());    
         // TODO: store in context or somewhere to be aggregated post-serialization
      }
  }
}
+1  A: 

The simplest solution would be to wrap the OutputStream you're using with an implementation that will count bytes written.

import java.io.IOException;
import java.io.OutputStream;

public class CountingOutputStream extends OutputStream {
    private int count;
    private OutputStream out;

    public CountingOutputStream(OutputStream out) {
        this.out = out;
    }

    public void write(byte[] b) throws IOException {
        out.write(b);
        count += b.length;
    }

    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len; 
    }

    public void flush() throws IOException {
        out.flush();    
    }

    public void close() throws IOException {
        out.close();
    }

    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    public int getBytesWritten() {
        return count;
    }
}

Then you would just use it like this:

CountingOutputStream s = new CountingOutputStream(out);
ObjectOutputStream o = new ObjectOutputStream(s);
o.writeObject(myObject);  // myObject: any Serializable instance
o.close();
// s.getBytesWritten() now returns the total number of bytes written
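
To attribute bytes to individual objects, one option is to flush and read the counter between writeObject calls. The following is only a minimal sketch of that idea for top-level objects; the names TopLevelSizeDemo, Counter and measure are made up for illustration, and it does not break down nested objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;

public class TopLevelSizeDemo {

    /** Hypothetical minimal byte counter, same idea as CountingOutputStream. */
    static class Counter extends FilterOutputStream {
        long count;

        Counter(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len);
            count += len;
        }
    }

    /** Writes each object in turn and returns the byte delta observed for each one. */
    public static long[] measure(Object... objects) throws IOException {
        Counter counter = new Counter(new ByteArrayOutputStream());
        long[] sizes = new long[objects.length];
        try (ObjectOutputStream oos = new ObjectOutputStream(counter)) {
            oos.flush();                      // stream header is out; do not bill it to the first object
            long before = counter.count;
            for (int i = 0; i < objects.length; i++) {
                oos.writeObject(objects[i]);
                oos.flush();                  // drain ObjectOutputStream's internal buffer into the counter
                sizes[i] = counter.count - before;
                before = counter.count;
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        String shared = "hello";
        long[] sizes = measure(shared, new int[] {1, 2, 3}, shared);
        for (int i = 0; i < sizes.length; i++) {
            System.out.println("object " + i + ": " + sizes[i] + " bytes");
        }
    }
}
```

Note that an object (or class descriptor) already written to the stream is emitted again only as a short back-reference, so shared data is billed to whichever object wrote it first.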
Lajcik
Counting bytes is fairly straightforward, but I'm missing how this determines which object we're writing.
cwall
That wrapper could register, somewhere, the reference and class of each object along with its partial size. With a tree-like structure it could record "this object was written from byte X to byte Y", which gives the accumulated (gross) size of the object. If you subtract the sizes of the inner objects (the objects serialized between X and Y, that is, between the start and end of the outer writeObject call), you get the net size of the object. :) Hope it helps!
helios
Of course, you have to use a subclass of ObjectOutputStream, because it uses its own writeObject to serialize sub-objects.
helios
Thanks helios. It's not quite clear to me. If I understand you correctly: ObjectOutputStream.writeObject is final, so overriding it and tracking bytes written before and after is not an option. But I think my updated solution is fairly similar to what you're describing. For each class of interest, the class's own writeObject would be called; it would reset a separate counter to 0 before the object is serialized and then record the total once serialization finishes. Not ideal, because each class must contain this code and managing the separate counter can be tricky.
cwall
A: 

You could implement Externalizable rather than Serializable on any objects you need to capture such data from. You could then implement field-by-field byte counting in the writeExternal method, maybe by handing off to a utility class. Something like

public void writeExternal(ObjectOutput out) throws IOException
{
    super.writeExternal(out);
    out.writeUTF(this.myString == null ? "" : this.myString);
    ByteCounter.getInstance().log("MyClass", "myString", this.myString);
}
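
ByteCounter above is not an existing class; it stands for whatever utility you write yourself. A minimal sketch of what it might look like, assuming it only needs to total the bytes that writeUTF emits per class and field (a 2-byte length prefix plus the modified UTF-8 payload), with null treated as "" to match the writeExternal code above:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: accumulates writeUTF payload sizes per "Class.field" key. */
public final class ByteCounter {
    private static final ByteCounter INSTANCE = new ByteCounter();
    private final Map<String, Long> totals = new TreeMap<>();

    private ByteCounter() {
    }

    public static ByteCounter getInstance() {
        return INSTANCE;
    }

    /** Adds the number of bytes writeUTF would emit for the given value. */
    public synchronized void log(String className, String fieldName, String value) throws IOException {
        long size = utfSize(value == null ? "" : value);
        totals.merge(className + "." + fieldName, size, Long::sum);
    }

    /** Measures a string by actually encoding it with DataOutputStream.writeUTF. */
    public static long utfSize(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(buf)) {
            data.writeUTF(s);
        }
        return buf.size();
    }

    /** Returns a copy of the totals collected so far. */
    public synchronized Map<String, Long> totals() {
        return new TreeMap<>(totals);
    }
}
```

This counts only the field payload; any block-data framing that ObjectOutputStream adds around externalized data is not included.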

Another hackish way would be to stick with Serializable, but to use the readResolve or writeReplace hooks to capture whatever data you need, e.g.

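import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Test implements Serializable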
{
    private String s;

    public Test(String s)
    {
        this.s = s;
    }

    private Object readResolve()
    {
        System.err.format("%s,%s,%s,%d\n", "readResolve", "Test", "s", s.length());
        return this;
    }

    private Object writeReplace()
    {
        System.err.format("%s,%s,%s,%d\n", "writeReplace", "Test", "s", s.length());
        return this;
    }

    public static void main(String[] args) throws Exception
    {
        File tmp = File.createTempFile("foo", "tmp");
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(tmp));
        Test test = new Test("hello world");
        out.writeObject(test);
        out.close();
        ObjectInputStream in = new ObjectInputStream(new FileInputStream(tmp));
        test = (Test)in.readObject();
        in.close();
    }
}
sudocode
Interesting idea. Thanks. I'll look into it. Admittedly, I'm surprised that there isn't a built-in mechanism, e.g. callbacks with restricted privileges, to monitor default serialization. Externalizable is an option, but it requires implementing all the heavy lifting yourself, which is far more than this use case needs.
cwall