ansaurus

Question

Non-Deprecated StringBufferInputStream equivalent

Answer 1

+1 A:

String s = "test";
InputStream input = new ByteArrayInputStream(s.getBytes("UTF8"));

Kevin 2010-01-28 22:53:42

This won't work. `readConfiguration()` passes the stream to `Properties#load(InputStream)`, and that method expects the stream to be `ISO-8859-1`, not `UTF-8`.

Alan Moore 2010-01-29 04:06:05

Then he can pass the ISO-8859-1 encoding to getBytes().

Kevin 2010-01-29 14:00:17
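
For reference, a minimal sketch of that suggestion, assuming Java 7+ for `StandardCharsets`. The class and method names (`Latin1PropertiesStream`, `toStream`, `loadValue`) are made up for illustration and are not from the thread. It encodes the string as ISO-8859-1 and loads it with `Properties#load(InputStream)` to check the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper class; the name is not from the thread.
class Latin1PropertiesStream {

    // Encode the string in the charset that Properties.load(InputStream) expects.
    static InputStream toStream(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Load the text as properties and return the value stored under the key.
    static String loadValue(String text, String key) {
        Properties props = new Properties();
        try (InputStream in = toStream(text)) {
            props.load(in);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // U+00E9 exists in ISO-8859-1, so it survives the round trip.
        System.out.println(loadValue("greeting=caf\u00e9", "greeting"));
        // U+20AC (euro sign) does not exist in ISO-8859-1; getBytes substitutes '?'.
        System.out.println(loadValue("price=\u20ac5", "price"));
    }
}
```

Note the limitation: any character outside ISO-8859-1 is silently replaced with `?` by `getBytes`, so this only works if the text is known to stay within that range.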

Answer 2

+2 A:

Use a ByteArrayInputStream, and be careful to specify an appropriate character encoding, e.g.:

new ByteArrayInputStream(str.getBytes("UTF8"));

The character encoding determines how each character is converted to bytes, so you need to choose it deliberately. Alternatively, you can call the no-argument getBytes(), which uses the JVM's default encoding; that default can be set with -Dfile.encoding=...

Brian Agnew 2010-01-28 22:57:37

This won't work. `readConfiguration()` passes the stream to `Properties#load(InputStream)`, and that method expects the stream to be `ISO-8859-1`, not `UTF-8`.

Alan Moore 2010-01-29 04:04:49
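
To make the objection concrete, here is a small sketch (the class name `Utf8MismatchDemo` and its method are hypothetical) that feeds the same text to `Properties#load(InputStream)` encoded once as UTF-8 and once as ISO-8859-1. It assumes Java 7+ for `StandardCharsets`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; not part of any answer in the thread.
class Utf8MismatchDemo {

    // Encode the text with the given charset, load it as properties,
    // and return the value stored under the key.
    static String loadValue(String text, Charset charset, String key) {
        Properties props = new Properties();
        try (InputStream in = new ByteArrayInputStream(text.getBytes(charset))) {
            props.load(in); // always decodes the bytes as ISO-8859-1
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String text = "name=caf\u00e9";
        // UTF-8 encodes U+00E9 as the two bytes 0xC3 0xA9, which load() reads
        // back as two separate characters, U+00C3 and U+00A9.
        System.out.println(loadValue(text, StandardCharsets.UTF_8, "name"));
        // ISO-8859-1 encodes it as the single byte 0xE9, which load() reads back intact.
        System.out.println(loadValue(text, StandardCharsets.ISO_8859_1, "name"));
    }
}
```

For pure ASCII input the two encodings produce identical bytes, so the mismatch only shows up once the text contains a non-ASCII character.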

Answer 3

+1 A:

The documentation for LogManager.readConfiguration() says that it accepts data in java.util.Properties format. So an encoding-safe implementation looks like this:

String s = ...;

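StringBuilder propertiesEncoded = new StringBuilder();
for (int i = 0; i < s.length(); i++)
{
    char c = s.charAt(i);
    // Characters up to 0x7e (including TAB, CR, LF) pass through unchanged
    if (c <= 0x7e) propertiesEncoded.append(c);
    // Everything else becomes a \uXXXX escape, which Properties.load() decodes
    else propertiesEncoded.append(String.format("\\u%04x", (int) c));
}
// The escaped text is pure ASCII, so encoding it as ISO-8859-1 is lossless
ByteArrayInputStream in = new ByteArrayInputStream(propertiesEncoded.toString().getBytes("ISO-8859-1"));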

EDIT: Encoding algorithm corrected

EDIT2: Actually, the java.util.Properties format has some other restrictions (such as escaping of \ and other special characters); see the docs.

EDIT3: 0x00-0x1f escaping removed, as Alan Moore suggests

axtavt 2010-01-28 23:10:20

Good catch on the Unicode escaping specific to the properties file format. I believe this is correct as long as `s` is UTF-8.

Kaleb Pederson 2010-01-28 23:32:24

@Kaleb: `s` will just be a String, in the same encoding Strings always use--you don't need to worry about that. You only have to know the *target* encoding, which is `ISO-8859-1` as @axtavt said.

Alan Moore 2010-01-29 03:57:27

@axtavt: Your code will Unicode-escape all TAB, linefeed, form-feed, and carriage-return characters, which isn't right. Those only need to be escaped if they're part of a Properties key or element, and that should already have been taken care of by the time this method gets called.

Alan Moore 2010-01-29 04:12:21
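
Putting the answer and the comments together, a self-contained sketch of the escaping approach might look like the following. The class and method names (`PropertiesEscaper`, `escapeNonAscii`, `toStream`, `loadValue`) are invented for illustration. It mirrors the loop from the answer, leaves TAB, CR, and LF untouched as suggested in the comments, and assumes the input is already valid properties text (keys and values with `\` and other special characters escaped as the format requires).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper class wrapping the loop from the answer.
class PropertiesEscaper {

    // Replace every char above 0x7e with a \\uXXXX escape; keep the rest as-is.
    // Characters outside the BMP are escaped as two surrogate escapes.
    static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x7e) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // The escaped text is pure ASCII, so the ISO-8859-1 encoding loses nothing.
    static InputStream toStream(String s) {
        return new ByteArrayInputStream(
                escapeNonAscii(s).getBytes(StandardCharsets.ISO_8859_1));
    }

    // Round-trip check: load the escaped stream and read one value back.
    static String loadValue(String text, String key) {
        Properties props = new Properties();
        try (InputStream in = toStream(text)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String config = "price=\u20ac5\nname=caf\u00e9";
        System.out.println(escapeNonAscii(config));
        System.out.println(loadValue(config, "price"));
        System.out.println(loadValue(config, "name"));
    }
}
```

For the original question, the stream returned by `toStream` is what would be handed to `LogManager.readConfiguration(InputStream)`.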
