I'd like to find out about the size of an array being allocated by looking at the bytecode, if that information is known at compile time, of course.

Background: I want to write a FindBugs detector (which looks at the compiled bytecode) and report certain occurrences of array allocations. To filter out false positives, I am not interested in "small" arrays, only in ones whose size is not available at compile time or is larger than a configurable threshold.

As the FindBugs source code is not too heavily documented, I am looking for some pointers on how to get started. Maybe there already is a detector doing something similar that I could look at.

+4  A: 

Well, if they are allocated based on a constant, you could check for a constant that was pushed just before the allocation. For example:

class ArraySize {
    private static final int smallsize = 10;
    private static final int largesize = 1000;
    public static void main(String[] args) {
        int[] small = new int[smallsize];
        int[] big = new int[largesize];
    }
}

gives the bytecode:

Compiled from "ArraySize.java"
class ArraySize extends java.lang.Object{
ArraySize();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   bipush  10
   2:   newarray int
   4:   astore_1
   5:   sipush  1000
   8:   newarray int
   10:  astore_2
   11:  return

}
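As a rough sketch of that check (this is not FindBugs or BCEL code; the class ArraySizeScan and its method are made-up names, and it only understands the handful of opcodes that appear in the listing above), you could walk the raw code bytes and remember the constant pushed by the instruction immediately before each newarray:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of FindBugs or BCEL. */
final class ArraySizeScan {
    // Opcode values as defined in the JVM specification.
    static final int ICONST_M1 = 0x02, ICONST_0 = 0x03, ICONST_5 = 0x08;
    static final int BIPUSH = 0x10, SIPUSH = 0x11;
    static final int ASTORE_0 = 0x4b, ASTORE_3 = 0x4e;
    static final int RETURN = 0xb1, NEWARRAY = 0xbc;

    /**
     * Returns one entry per newarray instruction: the constant pushed by the
     * instruction right before it, or null if that instruction was not a
     * plain constant push.
     */
    static List<Integer> constantArraySizes(byte[] code) {
        List<Integer> sizes = new ArrayList<>();
        Integer lastConstant = null;
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc] & 0xff;
            if (op >= ICONST_M1 && op <= ICONST_5) {
                lastConstant = op - ICONST_0;          // iconst_m1 .. iconst_5
                pc += 1;
            } else if (op == BIPUSH) {
                lastConstant = (int) code[pc + 1];     // signed 8-bit operand
                pc += 2;
            } else if (op == SIPUSH) {
                int hi = code[pc + 1] & 0xff;
                int lo = code[pc + 2] & 0xff;
                lastConstant = (int) (short) ((hi << 8) | lo); // signed 16-bit operand
                pc += 3;
            } else if (op == NEWARRAY) {
                sizes.add(lastConstant);
                lastConstant = null;
                pc += 2;                               // operand: element type code
            } else if ((op >= ASTORE_0 && op <= ASTORE_3) || op == RETURN) {
                lastConstant = null;
                pc += 1;
            } else {
                // Assumption: everything else is outside the scope of this sketch.
                throw new IllegalArgumentException(
                        "opcode not handled by this sketch: 0x" + Integer.toHexString(op));
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        // The body of main() from the listing above, written out as raw bytes.
        byte[] code = {
            0x10, 10,               // bipush 10
            (byte) 0xbc, 10,        // newarray int
            0x4c,                   // astore_1
            0x11, 0x03, (byte) 0xe8, // sipush 1000
            (byte) 0xbc, 10,        // newarray int
            0x4d,                   // astore_2
            (byte) 0xb1             // return
        };
        System.out.println(constantArraySizes(code)); // prints [10, 1000]
    }
}
```

A real detector would get the instruction stream from whatever bytecode library it is built on rather than decoding bytes by hand, and would also need to handle ldc, which requires a constant pool lookup.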
Michael Myers
Thanks, that's a start already. In your example the compiler inlines the constants. It gets trickier when you have things like int x = 10; new byte[5 + x]; Even though 15 is the obvious result, depending on the compiler (and its settings) this might end up with the calculation being done on the stack just before the allocation. I'd like to know if there is a standard way to get at this in FindBugs.
Daniel Schneller
I don't know the insides of FindBugs, sorry; all I know is a little about bytecode.
Michael Myers
+2  A: 

This could get kind of tricky. My knowledge is incomplete, but you'll have at least three kinds of instructions to look out for (NEWARRAY, ANEWARRAY and MULTIANEWARRAY). Looking at the previous instruction (or in the case of MULTIANEWARRAY, the n previous instructions) gets the size, which even if it is a constant might be loaded with ICONST_n, BIPUSH, SIPUSH or LDC (anything else?) depending on its magnitude. As you've noted, if the size is the result of a calculation, you may be tracing instructions back indefinitely.
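To make the "tracing back" idea concrete, here is a small sketch that simulates the operand stack instead of looking only at the previous instruction. Everything in it is an assumption for illustration: ArraySizeTracer is a made-up class, it handles only straight-line code (no branches), only local slots 0 to 3, and only a few opcodes, and the input bytes are hand-assembled to mirror what a compiler might emit for int x = 10; new byte[5 + x].

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.OptionalInt;

/** Hypothetical helper for illustration; not part of FindBugs or BCEL. */
final class ArraySizeTracer {
    /** One entry per newarray: the size if it could be computed, else empty. */
    static List<OptionalInt> traceArraySizes(byte[] code) {
        Deque<OptionalInt> stack = new ArrayDeque<>();
        OptionalInt[] locals = new OptionalInt[4];
        Arrays.fill(locals, OptionalInt.empty());   // nothing known at method entry
        List<OptionalInt> sizes = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc] & 0xff;
            if (op >= 0x02 && op <= 0x08) {          // iconst_m1 .. iconst_5
                stack.push(OptionalInt.of(op - 0x03));
                pc += 1;
            } else if (op == 0x10) {                 // bipush
                stack.push(OptionalInt.of(code[pc + 1]));
                pc += 2;
            } else if (op == 0x11) {                 // sipush
                int v = (short) (((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff));
                stack.push(OptionalInt.of(v));
                pc += 3;
            } else if (op >= 0x1a && op <= 0x1d) {   // iload_0 .. iload_3
                stack.push(locals[op - 0x1a]);
                pc += 1;
            } else if (op >= 0x3b && op <= 0x3e) {   // istore_0 .. istore_3
                locals[op - 0x3b] = stack.pop();
                pc += 1;
            } else if (op == 0x60) {                 // iadd
                OptionalInt b = stack.pop();
                OptionalInt a = stack.pop();
                stack.push(a.isPresent() && b.isPresent()
                        ? OptionalInt.of(a.getAsInt() + b.getAsInt())
                        : OptionalInt.empty());
                pc += 1;
            } else if (op == 0xbc) {                 // newarray
                sizes.add(stack.pop());              // the count is on top of the stack
                stack.push(OptionalInt.empty());     // the array reference, value unknown
                pc += 2;
            } else if (op >= 0x4b && op <= 0x4e) {   // astore_0 .. astore_3
                stack.pop();
                locals[op - 0x4b] = OptionalInt.empty();
                pc += 1;
            } else if (op == 0xb1) {                 // return
                pc += 1;
            } else {
                throw new IllegalArgumentException(
                        "opcode not handled by this sketch: 0x" + Integer.toHexString(op));
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        byte[] code = {
            0x10, 10,         // bipush 10
            0x3c,             // istore_1   (x = 10)
            0x08,             // iconst_5
            0x1b,             // iload_1
            0x60,             // iadd       (5 + x)
            (byte) 0xbc, 8,   // newarray byte
            0x4d,             // astore_2
            (byte) 0xb1       // return
        };
        System.out.println(traceArraySizes(code)); // prints [OptionalInt[15]]
    }
}
```

Anything the simulation cannot determine (a method argument, a field read, a value merged from two branches) would come out as empty, which matches the "size not available at compile time" case from the question.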

If I remember correctly, FindBugs uses BCEL internally, but I've never dug around in there to see exactly how clever they're being. If either of those teams has an appropriate mailing list, it may prove a better place to ask. They'll probably at least know if someone has been down this road before.

McDowell