How should I go about implementing the following? I have to handle a byte array that contains several lines of text. The data will probably average around 10 kilobytes in size.

After an unspecified number of lines there will be a line starting with a special token ("FIRSTSTRING"). Somewhere later in the same file there will be another line, also starting with a special token ("SECONDSTRING"). If both the first and second lines are present in the byte array, the second line should be copied in place of the first line. The resulting byte array should then be returned.

Below is my first attempt. I have not yet refactored it to reduce complexity. I am concerned about reliability and also very much about performance. There seem to be too many ways to go about this, and I lack the experience to judge between them. I would really appreciate some good input on this.

 private byte[] handleHeader(final byte[] input) throws IOException {

   // input
   ByteArrayInputStream bais = new ByteArrayInputStream(input);
   InputStreamReader isr = new InputStreamReader(bais);
   BufferedReader brs = new BufferedReader(isr);
   // output: "data" collects the lines before FIRSTSTRING, "after" the lines following it
   ByteArrayOutputStream data = new ByteArrayOutputStream();
   ByteArrayOutputStream after = new ByteArrayOutputStream();

   String line = null;
   String original = null;
   String changeWith = null;

   while ((line = brs.readLine()) != null) {
     line += "\n";
     if (line.startsWith("FIRSTSTRING")) {
       original = line;
       continue;
     }
     if (line.startsWith("SECONDSTRING")) {
       changeWith = line;
       continue;
     }
     if (original == null) {
       // FIRSTSTRING has not been seen yet
       data.write(line.getBytes());
     } else {
       after.write(line.getBytes());
     }
   }

   if (changeWith != null && original != null) {
     // changeWith already ends with "\n"
     data.write(changeWith.getBytes());
   } else if (original != null) {
     data.write(original.getBytes());
   }

   after.writeTo(data);

   return data.toByteArray();
 }
A: 

I feel you can simplify the code by using either the Guava I/O library (http://code.google.com/p/guava-libraries/) or the Commons IO library (http://commons.apache.org/io/).

Pangea
Looking at IOUtils, I can see a couple of useful things there... Definitely worth using. Interesting.
I would strongly recommend using Guava rather than commons-io. Guava is better organized, supports generics where applicable and strongly encourages not using platform-default encoding everywhere by requiring you to specify a `Charset` for any `String` <-> `byte[]` conversions.
ColinD
A: 

For starters, it doesn't sound like you've defined your problem precisely: you say that there will be a "FIRSTSTRING" line, and there will be a "SECONDSTRING" line, but then you go on to say "if both lines are present"... If you know the second line will always be there, things get a lot simpler.

In any case, an algorithm like the following should be reasonably easy to implement and understand later, and shouldn't be too inefficient:

  • Create a StringBuilder to hold the overall output.
  • Iterate through the lines, adding all "normal" lines straight to the output.
  • When(/if) you encounter the "FIRSTSTRING" line, store this in a separate variable and create a second StringBuilder to hold the "second half" of the text.
  • Continue iterating, adding all further normal lines to this second StringBuilder.
  • When(/if) you encounter the "SECONDSTRING" line, append this to the main output, then append the entirety of the second StringBuilder to the main output, then append the remaining lines to the main output.
  • If you reach the end of the file without finding the second string line, then append the saved FIRSTSTRING line to the overall output and follow it up with the contents of the second StringBuilder.
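
The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name `HeaderSwap` and the `charset` parameter are made up for the example, line endings are normalized to "\n" (as in the question's code), and the output always ends with a newline. A "SECONDSTRING" line that shows up before any "FIRSTSTRING" line is treated as a normal line here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HeaderSwap {

    // Hypothetical helper: moves the SECONDSTRING line into the place of the FIRSTSTRING line.
    static byte[] handleHeader(byte[] input, Charset charset) {
        StringBuilder out = new StringBuilder(input.length); // overall output
        StringBuilder tail = null;  // lines seen after FIRSTSTRING, before SECONDSTRING
        String first = null;        // the saved FIRSTSTRING line
        boolean replaced = false;   // true once SECONDSTRING has taken its place

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(input), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (first == null && line.startsWith("FIRSTSTRING")) {
                    first = line;
                    tail = new StringBuilder();
                } else if (first != null && !replaced && line.startsWith("SECONDSTRING")) {
                    // Emit the second line where the first one was, then the buffered lines.
                    out.append(line).append('\n').append(tail);
                    replaced = true;
                } else if (first != null && !replaced) {
                    tail.append(line).append('\n');
                } else {
                    out.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for simplicity.
            throw new UncheckedIOException(e);
        }

        if (first != null && !replaced) {
            // No SECONDSTRING line was found: keep the original line and what followed it.
            out.append(first).append('\n').append(tail);
        }
        return out.toString().getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "a\nFIRSTSTRING one\nb\nSECONDSTRING two\nc\n";
        byte[] result = handleHeader(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.print(new String(result, StandardCharsets.UTF_8));
    }
}
```

For 10 KB inputs the extra buffer should hardly matter; the point of the second StringBuilder is just that the lines after "FIRSTSTRING" cannot be written out until you know which line goes in front of them.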

Oh, and you're turning bytes into Strings without specifying an explicit character encoding. Never do that. If you know what the character encoding is, specify it explicitly (in the InputStreamReader's constructor). If you don't know what the character encoding of the stream of bytes is, then you cannot read it reliably at all.
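
To make the encoding point concrete, here is a small sketch that decodes the same bytes with two different charsets. The class and method names (`ExplicitCharset`, `firstLine`) are invented for the example, and UTF-8 is only an assumption about what your data actually uses.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ExplicitCharset {

    // Hypothetical helper: reads the first line, decoding with the charset passed in
    // rather than the platform default.
    static String firstLine(byte[] input, Charset charset) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(input), charset))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "\u00e9" (e with acute accent) takes two bytes in UTF-8.
        byte[] utf8 = "FIRSTSTRING caf\u00e9\n".getBytes(StandardCharsets.UTF_8);

        String right = firstLine(utf8, StandardCharsets.UTF_8);
        String wrong = firstLine(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("FIRSTSTRING caf\u00e9")); // true
        System.out.println(wrong.equals(right));                   // false: the accented char is mangled
    }
}
```

The same goes for the way back: pass the charset to `getBytes(...)` as well, otherwise the output depends on whatever the default happens to be on the machine running the code.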

Andrzej Doyle