Original Question: Given a method I would like to determine if an object returned is created within the execution of that method. What sort of static analysis can or should I use?

Reworked Question: Given a method, I would like to determine whether an object created in that method may be returned by it. That is, if I collect every instantiation of the return type within the method into a set, is there an analysis that will tell me, for each member of the set, whether it may or may not be returned? Additionally, would it be possible to extend the set beyond a single method to include all methods called by the original method, to account for delegation?

This is not specific to any invocation.

It looks like method escape analysis may be the answer.

Thanks everyone for your suggestions.

A: 

I'm not sure if this would work for your circumstances, but one simple approach would be to add an 'instantiatedTime' field that is populated in the object's constructor, and compare it with the time the method call was made. This assumes you have access to the source for the object in question.
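A minimal sketch of that idea, with one substitution stated plainly: it uses a creation sequence counter rather than a wall-clock time, to avoid ties when the clock resolution is coarse. The names Tracked, createdSeq, checkpoint and CreationCheckDemo are hypothetical and only for illustration. In a multi-threaded program this would only tell you that the object was created after the checkpoint, not that this particular method created it.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class standing in for the type whose instances we want to track.
class Tracked {
    // Global creation counter; a stand-in for the 'instantiatedTime' field.
    private static final AtomicLong SEQUENCE = new AtomicLong();

    // Assigned once, when the constructor runs.
    final long createdSeq = SEQUENCE.incrementAndGet();

    // Read the counter just before calling the method under test.
    static long checkpoint() {
        return SEQUENCE.get();
    }

    // True if this object was constructed after the given checkpoint.
    boolean createdAfter(long checkpoint) {
        return createdSeq > checkpoint;
    }
}

class CreationCheckDemo {
    // Created when the demo object is built, i.e. before any checkpoint below.
    private final Tracked cached = new Tracked();

    Tracked makeNew() { return new Tracked(); }

    Tracked reuse() { return cached; }

    public static void main(String[] args) {
        CreationCheckDemo demo = new CreationCheckDemo();
        long before = Tracked.checkpoint();
        System.out.println(demo.makeNew().createdAfter(before)); // true
        System.out.println(demo.reuse().createdAfter(before));   // false
    }
}
```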

btreat
Thanks, but source may not always be available.
A: 

Are you sure static analysis is the right tool for the job? Static analysis can give you a result in some cases but not in all.

When the JVM runs under a debugger, it appears to assign objects increasing object IDs, which you can fetch via System.identityHashCode(Object o). You can use this to build a test case that creates an object (the checkpoint) and then calls the method. If the returned object has an id greater than the checkpoint id, then the object was created in the method.

Disclaimer: this is observed behaviour under a debugger, on Windows XP.

mdma
Static analysis is preferred even with loss of accuracy. However, I hadn't thought of your recommendation. Thanks.
@mdma - I'm pretty sure that identityHashCode values are initially derived from object addresses. That would mean that they are not monotonically increasing over time, even within a single thread. The ids will normally increase, but when the GC runs you are likely to see a big jump forwards or backwards.
Stephen C
@Stephen - normally that's the case, but I see under a debugger, the ids are far more predictable.
mdma
A: 

I have a feeling that this is impossible to do without a specially modified JVM. Here are some approaches ... and why they won't work in general.

The Static Analysis approach will work in simple cases. However, something like this is likely to stump any current generation static analysis tool:

// Bad design alert ... don't try this at home!
public class LazySingletonStringFactory {
    private String s;
    public String create(String initial) {
        if (s == null) {
            s = new String(initial);
        }
        return s;
    }
}

For a static analyser to figure out if a given call to LazySingletonStringFactory.create(...) returns a newly created String it must figure out that it has not been called previously. The Halting Problem tells us that this is theoretically impossible in some cases, and in practice this is beyond the "state of the art".
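To make the difficulty concrete, here is a small sketch (the LazyFactoryDemo class is hypothetical, and the factory is repeated without the public class modifier so the snippet is self-contained). The same call site yields a freshly created object on the first call and the cached one afterwards, so the answer depends on call history rather than on the method body alone.

```java
// Same factory as above, repeated so this snippet compiles on its own.
class LazySingletonStringFactory {
    private String s;

    public String create(String initial) {
        if (s == null) {
            s = new String(initial); // only reached on the first call
        }
        return s;
    }
}

class LazyFactoryDemo {
    public static void main(String[] args) {
        LazySingletonStringFactory factory = new LazySingletonStringFactory();
        String first = factory.create("a");  // allocates a new String
        String second = factory.create("b"); // returns the cached String
        System.out.println(first == second); // true
        System.out.println(second);          // a
    }
}
```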

The IdentityHashCode approach may work in a single-threaded application that completes without the garbage collector running. However, if the GC runs you will get incorrect answers. And if you have multiple threads, then (depending on the JVM) you may find that objects are allocated in different "spaces", resulting in an object "id" creation sequence that is no longer monotonic across all threads.

The Code Instrumentation approach works if you can modify the code of the classes you are concerned about, whether by direct source-code changes, annotation-based code injection or some kind of bytecode processing. However, in general you cannot do these things for all classes.

(I'm not aware of any other approaches that are materially different to the above three ... but feel free to suggest them as a comment.)

Stephen C
Why is this example hard? A static analyzer checking that the results of a "new" could reach a "return" statement would conservatively say "yes", and the OP understands he can't get a perfect answer from static analysis.
Ira Baxter
@Ira Baxter - that's the point. In a significant proportion of cases, a static analyser can only say "maybe". And that's not a useful answer ...
Stephen C
@Stephen C: Whether "maybe" is useful depends on what the OP wants to do with the answer. If he is going to light up a nuclear weapon based on the answer, then "maybe" isn't good enough. If he can do a mechanical compile/runtime heuristic optimization as a result, then "maybe" will possibly give him better code. If he is willing to investigate the code in question manually then "maybe" is a fine answer. Static analyzers sometimes give exact results, but in the face of most code they pretty much *only* give you maybe, and they're clearly useful.
Ira Baxter
A: 

Not sure of a reliable way to do this statically.

You could use:

  1. AspectJ or a similar AOP library, to instrument classes and increment a counter on object creation

  2. a custom classloader (or JVM agent, but a classloader is easier), used in a similar way

jayshao
A: 

Your question seems to be a simple "reaching" analysis ("does a new value reach a return statement?") if you are interested in any invocation and only in values created by a method-local new. If you need to know whether any invocation can return a new value from any subcomputation, you need to compute the possible call graph and determine whether any called function can return a new value, or pass a new value from a called function to its parent.
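As a rough illustration of the method-local case, here is a sketch over a made-up three-address IR. Stmt and MayReturnNew are hypothetical and are not the API of Soot, DMS or any other framework. It is flow-insensitive (it ignores statement order and branches) and does not model fields or calls, so it can only answer "may be returned", conservatively, for each allocation site.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy statement forms: x = new T() [site], x = y, return x.
final class Stmt {
    enum Kind { NEW, COPY, RETURN }

    final Kind kind;
    final String target; // variable assigned (NEW, COPY) or returned (RETURN)
    final String source; // allocation-site label (NEW) or copied variable (COPY)

    private Stmt(Kind kind, String target, String source) {
        this.kind = kind;
        this.target = target;
        this.source = source;
    }

    static Stmt alloc(String var, String site) { return new Stmt(Kind.NEW, var, site); }
    static Stmt copy(String to, String from) { return new Stmt(Kind.COPY, to, from); }
    static Stmt ret(String var) { return new Stmt(Kind.RETURN, var, null); }
}

final class MayReturnNew {
    // Returns the allocation sites whose objects may reach a return statement.
    static Set<String> returnedSites(List<Stmt> body) {
        // For each variable, the allocation sites it may refer to.
        Map<String, Set<String>> sitesOf = new HashMap<>();
        boolean changed = true;
        while (changed) { // iterate until no set grows any further
            changed = false;
            for (Stmt s : body) {
                if (s.kind == Stmt.Kind.NEW) {
                    changed |= sitesOf.computeIfAbsent(s.target, k -> new TreeSet<>())
                                      .add(s.source);
                } else if (s.kind == Stmt.Kind.COPY) {
                    Set<String> from =
                        sitesOf.getOrDefault(s.source, Collections.<String>emptySet());
                    changed |= sitesOf.computeIfAbsent(s.target, k -> new TreeSet<>())
                                      .addAll(from);
                }
            }
        }
        Set<String> result = new TreeSet<>();
        for (Stmt s : body) {
            if (s.kind == Stmt.Kind.RETURN) {
                result.addAll(sitesOf.getOrDefault(s.target, Collections.<String>emptySet()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // a = new T() [site1]; b = new T() [site2]; c = a; return c;
        List<Stmt> body = Arrays.asList(
            Stmt.alloc("a", "site1"),
            Stmt.alloc("b", "site2"),
            Stmt.copy("c", "a"),
            Stmt.ret("c"));
        System.out.println(returnedSites(body)); // [site1]
    }
}
```

Handling delegation would mean running something like this per method and propagating the returned sites of each callee to its call sites along the call graph.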

There are a number of Java static analysis frameworks.

SOOT is a byte-code based analysis framework. You could probably implement your static query using this.

The DMS Software Reengineering Toolkit is a generic engine for building custom analyzers and transformation tools. It has a full Java front end, and computes various useful base analyses (def/use chains, call graph) on source code. It can process class files but presently only to get type information.

If you wanted a dynamic analysis, either by itself or as a way to tighten up the static analysis, DMS can be used to instrument the source code in arbitrary ways by inserting code to track allocations.

Ira Baxter
You're spot on. I am in fact interested in any invocation. It seems like I'm asking after escape analysis. I'll see what Soot has to offer.