What is the most efficient/elegant way of dumping a StringBuilder to a text file?

you can go:

outputStream.write(stringBuilder.toString().getBytes());

But is this efficient for a very long file?

Is there a better way?

+2  A: 

You should use a BufferedWriter to optimize the writes. If you weren't writing character data, you would use a BufferedOutputStream.

...
File file = new File("path/to/file.txt");
BufferedWriter writer = new BufferedWriter(new FileWriter(file));
writer.write(stringBuilder.toString());
writer.close(); // flushes the buffer to disk

EDIT: As mentioned by BalusC, you should write character data using a Writer instead of an OutputStream.

Since you're ultimately writing to a file, a better approach would be to write to the BufferedWriter more often instead of creating a huge StringBuilder in-memory and writing everything at the end (depending on your use-case, you might even be able to eliminate the StringBuilder entirely). Writing incrementally during processing will make better use of your limited I/O bandwidth, unless another thread is trying to read a lot of data from the disk at the same time you're writing.
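As a sketch of the incremental approach (the file name and loop body are made up for illustration), each piece is written as it is produced rather than accumulated in a StringBuilder first:

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class IncrementalWrite {
    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("demo", ".txt");
        file.deleteOnExit();
        BufferedWriter writer = new BufferedWriter(new FileWriter(file));
        try {
            // Write each piece as it is produced instead of building
            // one huge in-memory string and dumping it at the end.
            for (int i = 0; i < 1000; i++) {
                writer.write("line " + i);
                writer.newLine();
            }
        } finally {
            writer.close(); // flushes the buffer and releases the file handle
        }
        System.out.println("wrote 1000 lines");
    }
}
```

The BufferedWriter coalesces the many small write() calls into a few large I/O operations, so the per-line writes stay cheap.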

rob
+3  A: 

You could use the Apache Commons IO library, which gives you FileUtils:

FileUtils.writeStringToFile(file, stringBuilder.toString());
Mike Sickler
I am choosing this as the accepted answer just because it hides the complexity from me, even though it might not be the most efficient. The other answers are great too; I might use them if efficiency starts to suffer.
Patrick
this has two major problems which I've explained in my answer.
Kevin Bourrillion
+3  A: 

For character data, better to use a Reader/Writer. In your case, use a BufferedWriter. If possible, use the BufferedWriter from the beginning instead of a StringBuilder, to save memory.

Note that your way of calling the no-arg getBytes() method uses the platform default character encoding to encode the characters. This may fail if the platform default encoding is, for example, ISO-8859-1 while your String data contains characters outside the ISO-8859-1 charset. Better to use getBytes(charset), wherein you can specify the charset yourself, such as UTF-8.
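To illustrate why the charset matters (the sample string is made up), the same characters encode to different byte counts under different charsets, so relying on the platform default makes the output machine-dependent:

```java
import java.io.UnsupportedEncodingException;

public class CharsetDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String s = "caf\u00e9"; // 4 characters; '\u00e9' is outside US-ASCII

        byte[] utf8 = s.getBytes("UTF-8");        // explicit, portable choice
        byte[] latin1 = s.getBytes("ISO-8859-1"); // explicit single-byte encoding

        System.out.println(utf8.length);   // 5: '\u00e9' takes two bytes in UTF-8
        System.out.println(latin1.length); // 4: one byte per character
    }
}
```

Passing the charset name explicitly, as above, gives the same bytes on every platform.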

BalusC
+4  A: 

Well, if the string is huge, toString().getBytes() will duplicate the data in memory (2 or 3 times the size of the string).

To avoid this, you can extract chunks of the string and write them out separately.

Here is how it might look:

final StringBuilder aSB = ...;
final int    aLength = aSB.length();
final int    aChunk  = 1024;
final char[] aChars  = new char[aChunk];

for(int aPosStart = 0; aPosStart < aLength; aPosStart += aChunk) {
    final int aPosEnd = Math.min(aPosStart + aChunk, aLength);
    aSB.getChars(aPosStart, aPosEnd, aChars, 0);                 // Copy into the reusable buffer
    // The last chunk may be shorter, so limit the reader to the valid range
    final CharArrayReader aCARead =
        new CharArrayReader(aChars, 0, aPosEnd - aPosStart);     // Create no new buffer

    // This may be slow but it will not create any more buffer (for bytes)
    int aByte;
    while((aByte = aCARead.read()) != -1)
        outputStream.write(aByte);
}

Hope this helps.

NawaMan
+1 for the most memory efficient solution.
BalusC
+1 It's not really slower, just tested it with a 50MB String. But it really saves memory. (approx. 2MB vs. 130MB for the other methods)
mhaller
@NawaMan The "big" performance differences come from the underlying OutputStream. In many cases the write(array) method call decomposes into a while-loop internally. Nice example.
pst
Is this more memory efficient than the .append solution? I figure writer may be doing similar stuff under the hood.
Thomas Ahle
@Thomas Ahle: As far as I know (and have tried), append is among the most efficient, if not the most efficient. Another one that is very efficient (for a Stream) is `write(byte)`. Java is open source now, so you can read the code; as I remember, the implementations of append and write are closely related.
NawaMan
@NawaMan Yeah, just checked it: append(CharSequence cs) = write(cs.toString())
Thomas Ahle
+5  A: 

As pointed out by others, use a Writer, and use a BufferedWriter; but then don't call writer.write(stringBuilder.toString()); instead just call writer.append(stringBuilder);.
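A minimal sketch of the append call (the builder's contents and the temp file are made up; note the comment thread below debates whether append actually avoids an internal toString()):

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class AppendDemo {
    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder("hello world"); // 11 characters
        File file = File.createTempFile("append", ".txt");
        file.deleteOnExit();
        BufferedWriter writer = new BufferedWriter(new FileWriter(file));
        try {
            writer.append(sb); // StringBuilder is a CharSequence, so this compiles
        } finally {
            writer.close(); // flush and release the file handle
        }
        System.out.println(file.length()); // 11: the ASCII content is one byte per char
    }
}
```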

EDIT: But, I see that you accepted a different answer because it was a one-liner. But that solution has two problems:

One, it doesn't accept a java.nio.charset.Charset. BAD. You should always specify a Charset explicitly.

Two, it's still making you suffer a stringBuilder.toString(). If the simplicity is really what you're after, try the following from the Guava project:

Files.write(stringBuilder, file, Charsets.UTF_8);

Kevin Bourrillion
writer.write() does not take a StringBuilder as an argument. I can specify the encoding with FileUtils.writeStringToFile(file, String, String encoding)
Patrick
Terribly sorry -- that was supposed to say writer.append()! Fixing it.
Kevin Bourrillion
Also, thanks for clarifying about FileUtils, but still, specifying a Charset as a String is lame.
Kevin Bourrillion
+1. Is there any reason to use a BufferedWriter when we only do one write/append call?
Thomas Ahle
A: 

If the string itself is long, you should definitely avoid toString(), which makes another copy of the string. The most efficient way to write to a stream would be something like this:

OutputStreamWriter writer = new OutputStreamWriter(
        new BufferedOutputStream(outputStream), "utf-8");

for (int i = 0; i < sb.length(); i++) {
    writer.write(sb.charAt(i));
}
writer.flush(); // push the buffered bytes through to the underlying stream
ZZ Coder
writer.append(sb)
Kevin Bourrillion
Please don't use writer.append(sb). It's the same as writer.write(sb.toString()), so it defeats the purpose.
ZZ Coder