Is there any reason to prefer a CharBuffer to a char[] in the following:

CharBuffer buf = CharBuffer.allocate(DEFAULT_BUFFER_SIZE);
while( in.read(buf) >= 0 ) {
  out.append( buf.flip() );
  buf.clear();
}

vs.

char[] buf = new char[DEFAULT_BUFFER_SIZE];
int n;
while( (n = in.read(buf)) >= 0 ) {
  out.write( buf, 0, n );
}

(where in is a Reader and out is a Writer)?

+3  A: 

If this is the only thing you're doing with the buffer, then the array is probably the better choice in this instance.

CharBuffer has lots of extra chrome on it, but none of it is relevant in this case, and it will only slow things down a fraction.

You can always refactor later if you need to make things more complicated.

Bill Michell
That's a bad philosophy. Get it right the first time.
coppro
This is right for the current requirement. Requirements change.
Ed Swangren
Using a standard implementation (dare I say pattern :-) ) results in less error-prone code. That doesn't mean that using an array is wrong or buggy; it is just more likely to be so.
Michael Rutherfurd
For the record, I disagree with "get it right first time". Get it right *for this requirement* the first time, while having confidence that when the requirement changes I can change things then, is much better provided you can make the environment support that philosophy.
Bill Michell
+7  A: 

No, there's really no reason to prefer a CharBuffer in this case.

In general, though, CharBuffer (and ByteBuffer) can really simplify APIs and encourage correct processing. If you were designing a public API, it's definitely worth considering a buffer-oriented API.

erickson
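To illustrate the point about buffer-oriented APIs, here is a minimal sketch. The class and method names (BufferApiSketch, countDigits) are made up for this example and do not come from any library. It contrasts an array-style signature, where the caller passes offset and length separately, with a buffer-style one, where position and limit travel with the data.

```java
import java.nio.CharBuffer;

class BufferApiSketch {
    // Array-style API: three parameters that the caller must keep consistent.
    static int countDigits(char[] chars, int offset, int length) {
        int count = 0;
        for (int i = offset; i < offset + length; i++) {
            if (Character.isDigit(chars[i])) {
                count++;
            }
        }
        return count;
    }

    // Buffer-style API: the buffer itself knows where the valid data starts and ends.
    static int countDigits(CharBuffer chars) {
        int count = 0;
        while (chars.hasRemaining()) {
            if (Character.isDigit(chars.get())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        char[] data = "ab12cd345".toCharArray();
        // Both calls look at the same four chars, "12cd".
        System.out.println(countDigits(data, 2, 4));
        System.out.println(countDigits(CharBuffer.wrap(data, 2, 4)));
    }
}
```

Note that the buffer version consumes its argument (the position advances to the limit), which is something a public API would need to document.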
+1  A: 

The CharBuffer version is slightly less complicated (one less variable), encapsulates buffer size handling and makes use of a standard API. Generally I would prefer this.

However, there is still one good reason to prefer the array version, in some cases at least. CharBuffer was only introduced in Java 1.4, so if you are deploying to an earlier version you can't use CharBuffer (unless you roll your own or use a backport).

P.S. If you use a backport, remember to remove it once you catch up to the Java version that contains the "real" version of the backported code.

Michael Rutherfurd
+1  A: 

I think that CharBuffer and ByteBuffer (as well as any other xBuffer) were meant for reusability, so you can buf.clear() them instead of going through reallocation every time.

If you don't reuse them, you're not using their full potential and they will add extra overhead. However, if you're planning on scaling this function, it might be a good idea to keep them there.

Eric
You can reuse arrays.
Jonathan
The full potential of buffers is direct buffers and the ability to change data representations easily. You can use the ByteBuffer.asXxxBuffer() views (asIntBuffer(), asCharBuffer(), and so on) to reinterpret bytes as numbers or chars on the fly. You can also change the byte order.
James Schek
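As a small sketch of the view and byte-order point, the following reads the same four bytes as an int under both byte orders. The class and method names (BufferViewSketch, readInt) are invented for the illustration; the calls used (ByteBuffer.wrap, order, asIntBuffer) are standard java.nio.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class BufferViewSketch {
    // Interpret the first four bytes of raw as an int in the given byte order.
    static int readInt(byte[] raw, ByteOrder order) {
        ByteBuffer bytes = ByteBuffer.wrap(raw).order(order);
        // The IntBuffer view shares the bytes and inherits the order set above.
        return bytes.asIntBuffer().get(0);
    }

    public static void main(String[] args) {
        byte[] raw = {0, 0, 1, 0};
        System.out.println(readInt(raw, ByteOrder.BIG_ENDIAN));    // prints 256
        System.out.println(readInt(raw, ByteOrder.LITTLE_ENDIAN)); // prints 65536
    }
}
```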
+3  A: 
Ron Tuffin
On my 2007 MacBookPro with Java 1.6 the second version is only(!) 35% slower - 2700ms vs 2000ms.
Alnitak
Yes. My times above were measured with 1.5; the times I get for 1.6 are faster. The difference is about 35%, as you (@Alnitak) report.
Ron Tuffin
Why write(buf.array(),0,n) instead of write(buf.flip())?
Chris Conway
Because StringWriter.write(Buffer) doesn't exist.
Alnitak
My bad. Why not append(buf.flip())?
Chris Conway
Because Writer.append() doesn't exist either - the .append() method is only in the StringWriter subclass.
Alnitak
Writer implements Appendable since JDK1.5
Chris Conway
so it does - my bad.
Alnitak
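For reference, a minimal sketch of the append-based copy discussed in these comments. The class and helper names (AppendCopySketch, copy, copyToString) are made up for the example. flip() is called as a separate statement because, before Java 9, it is declared to return Buffer rather than CharBuffer, so passing buf.flip() straight to append() needs a cast on those versions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.CharBuffer;

class AppendCopySketch {
    // Copy everything from in to out through a reusable CharBuffer.
    static void copy(Reader in, Writer out, int bufferSize) throws IOException {
        CharBuffer buf = CharBuffer.allocate(bufferSize);
        while (in.read(buf) >= 0) {
            buf.flip();      // switch from filling to draining
            out.append(buf); // Writer.append(CharSequence), available since Java 5
            buf.clear();     // make the whole buffer writable again
        }
    }

    // Convenience wrapper for trying the copy on in-memory data.
    static String copyToString(String text, int bufferSize) throws IOException {
        StringWriter out = new StringWriter();
        copy(new StringReader(text), out, bufferSize);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copyToString("hello, buffers", 4));
    }
}
```

As far as I know, Writer.append(CharSequence) converts its argument to a String before writing, which may be one reason the append variant measures slower than write(char[], int, int); treat that as an assumption to check against your JDK's source.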
@Chris Conway: I used write(buff.array(),0,n) because I wanted to eliminate as many differences as possible between the two.
Ron Tuffin
OK, so I replaced out2.write(buff.array(),0,n); with out2.append((CharBuffer)buff.flip()); and that made the time comparison worse: an increase of 135%. Bah! Go with a char[]; it is clearly faster in this case. :)
Ron Tuffin
Your micro-benchmark measures too many things. StringWriter is allocated without an argument, so it has to resize itself. StringWriter is backed by a StringBuffer whose initial capacity defaults to 16 chars. Try allocating it with the argument string.length().
James Schek
Thanks @James Schek. I have done that and updated the 'answer'. Much better performance overall.
Ron Tuffin
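A minimal sketch of the pre-sizing suggestion, assuming the input is already available as a String so its length is known up front. The class and method names (PresizedWriterSketch, copy) are hypothetical; StringWriter(int initialSize) is the standard constructor.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;

class PresizedWriterSketch {
    // Copy text through a char[] into a StringWriter sized for the whole result,
    // so the backing buffer does not have to grow during the timed loop.
    static String copy(String text) throws IOException {
        StringWriter out = new StringWriter(text.length());
        Reader in = new StringReader(text);
        char[] buf = new char[1000];
        int n;
        while ((n = in.read(buf)) >= 0) {
            out.write(buf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy("pre-sized copy").length());
    }
}
```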
+2  A: 

The difference, in practice, is actually <10%, not 30% as others are reporting.

I read and wrote a 5 MB file 24 times and took the numbers from a profiler. On average they were:

char[] = 4139 ms
CharBuffer = 4466 ms
ByteBuffer (direct) = 938 ms

In individual runs, CharBuffer came out ahead a couple of times.

I also tried replacing the File-based IO with In-Memory IO and the performance was similar. If you are trying to transfer from one native stream to another, then you are better off using a "direct" ByteBuffer.

With less than a 10% performance difference in practice, I would favor the CharBuffer. Its syntax is clearer, there are fewer extraneous variables, and you can do more direct manipulation on it (i.e. pass it to anything that asks for a CharSequence).

The benchmark is below. It is slightly wrong, as the BufferedReader is allocated inside the test method rather than outside. However, it lets you isolate the IO time and eliminate factors like a string or byte stream resizing its internal memory buffer.

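// Fragment: everything below lives inside an abstract class named Main and
// needs imports from java.io.*, java.nio.*, java.nio.channels.* and java.util.*.
public static void main(String[] args) throws Exception {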
    File f = getBytes(5000000);
    System.out.println(f.getAbsolutePath());
    try {
        System.gc();
        List<Main> impls = new java.util.ArrayList<Main>();
        impls.add(new CharArrayImpl());
        //impls.add(new CharArrayNoBuffImpl());
        impls.add(new CharBufferImpl());
        //impls.add(new CharBufferNoBuffImpl());
        impls.add(new ByteBufferDirectImpl());
        //impls.add(new CharBufferDirectImpl());
        for (int i = 0; i < 25; i++) {
            for (Main impl : impls) {
                test(f, impl);
            }
            System.out.println("-----");
            if(i==0)
                continue; // no-op hook: the first pass is warm-up; break here to reset the profiler
        }
        System.gc();
        System.out.println("Finished");
        return;
    } finally {
        f.delete();
    }
}
static int BUFFER_SIZE = 1000;

static File getBytes(int size) throws IOException {
    // Create a temp file containing `size` copies of the character '5'.
    File f = File.createTempFile("input", ".txt");
    FileWriter writer = new FileWriter(f);
    try {
        for (int i = 0; i < size; i++) {
            writer.write('5');
        }
    } finally {
        writer.close();
    }
    return f;
}

static void test(File f, Main impl) throws IOException {
    InputStream in = new FileInputStream(f);
    File fout = File.createTempFile("output", ".txt");
    try {
        OutputStream out = new FileOutputStream(fout, false);
        try {
            long start = System.currentTimeMillis();
            impl.runTest(in, out);
            long end = System.currentTimeMillis();
            System.out.println(impl.getClass().getName() + " = " + (end - start) + "ms");
        } finally {
            out.close();
        }
    } finally {
        fout.delete();
        in.close();
    }
}

public abstract void runTest(InputStream ins, OutputStream outs) throws IOException;

public static class CharArrayImpl extends Main {

    char[] buff = new char[BUFFER_SIZE];

    public void runTest(InputStream ins, OutputStream outs) throws IOException {
        Reader in = new BufferedReader(new InputStreamReader(ins));
        Writer out = new BufferedWriter(new OutputStreamWriter(outs));
        int n;
        while ((n = in.read(buff)) >= 0) {
            out.write(buff, 0, n);
        }
        // Without this, chars still held by the BufferedWriter never reach the file.
        out.flush();
    }
}

public static class CharBufferImpl extends Main {

    CharBuffer buff = CharBuffer.allocate(BUFFER_SIZE);

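    public void runTest(InputStream ins, OutputStream outs) throws IOException {
        Reader in = new BufferedReader(new InputStreamReader(ins));
        Writer out = new BufferedWriter(new OutputStreamWriter(outs));
        while (in.read(buff) >= 0) {
            buff.flip();      // switch from filling to draining
            out.append(buff);
            buff.clear();     // make the whole buffer writable again
        }
        // Without this, chars still held by the BufferedWriter never reach the file.
        out.flush();
    }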
}

public static class ByteBufferDirectImpl extends Main {

    ByteBuffer buff = ByteBuffer.allocateDirect(BUFFER_SIZE * 2);

    public void runTest(InputStream ins, OutputStream outs) throws IOException {
        ReadableByteChannel in = Channels.newChannel(ins);
        WritableByteChannel out = Channels.newChannel(outs);
        int n;
        while ((n = in.read(buff)) >= 0) {
            buff.flip();
            out.write(buff);
            buff.clear();
        }
    }
}
James Schek
A: 

You should avoid CharBuffer in recent Java versions: there is a bug in #subSequence(). You cannot get a subsequence from the second half of the buffer, since the implementation confuses capacity and remaining. I observed the bug in Java 6-0-11 and 6-0-12.

Adrian
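A small sketch of the kind of call the report above describes, which can be used to check a given JVM. The class and method names (SubSequenceSketch, secondHalfSlice) are invented for the example. Per the CharBuffer documentation, subSequence(start, end) is relative to the current position, so on a JVM without the reported bug the result should be "67".

```java
import java.nio.CharBuffer;

class SubSequenceSketch {
    // Take a subsequence that lies entirely in the second half of the buffer.
    static String secondHalfSlice() {
        CharBuffer buf = CharBuffer.wrap("0123456789");
        buf.position(5); // the remaining content is now "56789"
        // Indices are relative to the position: chars 1..2 of "56789".
        return buf.subSequence(1, 3).toString();
    }

    public static void main(String[] args) {
        System.out.println(secondHalfSlice());
    }
}
```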