tags:

views:

54

answers:

4

I have a program that need to handle byte array source. Originally the program work fine when the byte array size is 3000 byte. Now data size increase and the array size need to be changed from 3000 to 30000 (10 times).

I make a sample benchmark program to test looping time. I suppose required CPU time should be increase linearly according to the array size, but the benchmark program show that process 30000 bytes require much more than 35 times compared to process 3000 bytes.

Here is my bench mark program. Could the program improve so that it use around 10 times CPU time only?

public static void main(String args[])

    int TestArraySize=30000;
    String strFinalMessage="";

    // create a dummy byte array
    byte[] bytearrayMessageContent = new byte[TestArraySize];
    for (int i=0; i<TestArraySize; i++) {
        // fill character A-J into the dummy array
        bytearrayMessageContent[i] = (byte) (i%10+65); 
    }
    System.out.println(bytearrayMessageContent.length);

    // time start time
    long lngCurrentTime = System.currentTimeMillis();

    // process the byte array
    int intTHMessageLenAdj = TestArraySize;
    try {
        InputStream input = new ByteArrayInputStream(bytearrayMessageContent);
        while (intTHMessageLenAdj > 0) {
            // get random length of bytes to process
            int RandomLength = getNextRandom();
            if (RandomLength > intTHMessageLenAdj) {
                RandomLength = intTHMessageLenAdj;
            }

            // get the bytes to be process in a byte array and process it
            byte[] bytearrayMsgTrunk = new byte[RandomLength];
            input.read(bytearrayMsgTrunk);
            // do some logic here
            strFinalMessage += new String(bytearrayMsgTrunk) + "||";

            // repeat looping until all bytes are read
            intTHMessageLenAdj -= RandomLength;
        }
        input.close();  
    } catch (Exception ex) {
        ex.printStackTrace();
    }

    // time end time
    lngCurrentTime = System.currentTimeMillis() - lngCurrentTime;
    //System.out.println(strFinalMessage);
    System.out.println(lngCurrentTime);
}

public static int getNextRandom() { 
    // info is arround 4 bytes size
    Random random = new Random();
    return random.nextInt(8);
}
+2  A: 

Well, there are several problems here:

  • You're using the default platform encoding for strings. Don't do that. Specify a particular string encoding to convert between bytes and text.
  • Don't concatenate strings in a loop like this - use StringBuilder.
  • You're ignoring the return value of InputStream.Read. That may be okay when reading from ByteArrayInputStream, but you generally shouldn't rely on it.
  • You're creating a new instance of Random each time. I believe this is okay as of Java 6, but will give you repeated values in earlier versions. It's generally a bad idea, basically. Use one instance of Random repeatedly.
Jon Skeet
and he's creating a new Random object on every call to getNextRandom()
Thomas Lötzer
@Thomas: Yes, I spotted that in my most recent edit :)
Jon Skeet
+2  A: 

I suppose required CPU time should be increase linearly according to the array size, but the benchmark program show that process 30000 bytes require much more than 35 times compared to process 3000 bytes.

Actually, I'd expect it to grow quadratically with the array size. If you were to profile the program, you'd probably find that a significant proportion of the time is going in calls to String.concat. And as the array gets bigger, the proportion will increase.

Basically, each time you do a String concatenation, you copy all of the characters you've accumulated so far into a new String ... and throw away the previous one. It is not hard to see that that part of the code is O(N**2), where N is the array size.

Replace the String declaration and concatenations with this:

// allocate the builder with extra space to hold the '||' characters
StringBuilder sb= new StringBuilder(TestArraySize * 3 / 2);
...
// this replaces the concatenation.
sb.append(new String(bytearrayMsgTrunk);
sb.append("||");
...
// this does a final copy of the characters to create a new String
String strFinalMessage = sb.toString();
Stephen C
A: 

Thanks all, your answer give me direction to solve my problem.

Using StringBuilder could decrease the looping time dramatically. Originally the code is created quite long time ago and full of "old style" coding...

Good to see the problem was solved, but I'd like to correct you on "old style". Concatenating strings was already bad in Java 1.0, and `StringBuffer` (the close ancestor of `StringBuilder`) was also available then. The creator of the original code is one of many people who didn't understand the performance implications of concatenation, or didn't care. The point is obviously moot, but "old style" is not a valid excuse in this case.
Carl Smotricz
A: 

Why do you think you need to increase your buffer size because the data size increases? That just ain't so.

EJP